https://redmine.ruby-lang.org/
2015-08-02T19:27:37Z
Ruby Issue Tracking System
Ruby master - Bug #11410: Win32 Registry enumeration performs unnecessary string re-encoding which cause UndefinedConversionError exceptions
https://redmine.ruby-lang.org/issues/11410?journal_id=53639
2015-08-02T19:27:37Z
Iristyle (Ethan Brown)
ethan_j_brown@hotmail.com
<ul></ul><p>I realized that I should have included some sample code demonstrating the problem:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="nb">require</span> <span class="s1">'win32/registry'</span>
<span class="no">ENDASH_UTF_16</span> <span class="o">=</span> <span class="p">[</span><span class="mh">0x2013</span><span class="p">]</span>
<span class="no">TM_UTF_16</span> <span class="o">=</span> <span class="p">[</span><span class="mh">0x2122</span><span class="p">]</span>
<span class="n">endash_utf_16_str</span> <span class="o">=</span> <span class="no">ENDASH_UTF_16</span><span class="p">.</span><span class="nf">pack</span><span class="p">(</span><span class="s1">'s*'</span><span class="p">).</span><span class="nf">force_encoding</span><span class="p">(</span><span class="no">Encoding</span><span class="o">::</span><span class="no">UTF_16LE</span><span class="p">)</span>
<span class="n">tm_utf_16_str</span> <span class="o">=</span> <span class="no">TM_UTF_16</span><span class="p">.</span><span class="nf">pack</span><span class="p">(</span><span class="s1">'s*'</span><span class="p">).</span><span class="nf">force_encoding</span><span class="p">(</span><span class="no">Encoding</span><span class="o">::</span><span class="no">UTF_16LE</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">test_with_encoding</span><span class="p">(</span><span class="n">root</span><span class="p">,</span> <span class="n">key_name</span><span class="p">,</span> <span class="n">encoding</span><span class="p">)</span>
<span class="no">Encoding</span><span class="o">::</span><span class="n">default_internal</span> <span class="o">=</span> <span class="n">encoding</span>
<span class="nb">puts</span> <span class="s2">"</span><span class="se">\n\n</span><span class="s2">Testing with </span><span class="si">#{</span><span class="n">encoding</span><span class="p">.</span><span class="nf">to_s</span><span class="si">}</span><span class="s2">"</span>
<span class="nb">puts</span> <span class="s2">"- Reading value </span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">parent</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="se">\\</span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="se">\\</span><span class="si">#{</span><span class="n">key_name</span><span class="si">}</span><span class="s2">"</span>
<span class="k">begin</span>
<span class="n">value</span> <span class="o">=</span> <span class="n">root</span><span class="p">[</span><span class="n">key_name</span><span class="p">]</span>
<span class="nb">puts</span> <span class="s2">" - read value </span><span class="si">#{</span><span class="n">key_name</span><span class="si">}</span><span class="s2"> as </span><span class="si">#{</span><span class="n">value</span><span class="si">}</span><span class="s2">"</span>
<span class="k">rescue</span> <span class="no">Exception</span> <span class="o">=></span> <span class="n">e</span>
<span class="nb">puts</span> <span class="s2">" x failed to read from </span><span class="si">#{</span><span class="n">key_name</span><span class="si">}</span><span class="se">\n\t\t</span><span class="si">#{</span><span class="n">e</span><span class="si">}</span><span class="se">\n</span><span class="s2">"</span>
<span class="k">end</span>
<span class="nb">puts</span> <span class="s2">" - Reading value </span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">parent</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="se">\\</span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="se">\\</span><span class="si">#{</span><span class="n">key_name</span><span class="si">}</span><span class="s2">"</span>
<span class="k">begin</span>
<span class="n">type</span><span class="p">,</span> <span class="n">value</span> <span class="o">=</span> <span class="n">root</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="n">key_name</span><span class="p">)</span>
<span class="nb">puts</span> <span class="s2">" - read value </span><span class="si">#{</span><span class="n">key_name</span><span class="si">}</span><span class="s2"> as type: </span><span class="si">#{</span><span class="n">type</span><span class="si">}</span><span class="s2">, value: </span><span class="si">#{</span><span class="n">value</span><span class="si">}</span><span class="s2">"</span>
<span class="k">rescue</span> <span class="no">Exception</span> <span class="o">=></span> <span class="n">e</span>
<span class="nb">puts</span> <span class="s2">" x failed to read from </span><span class="si">#{</span><span class="n">key_name</span><span class="si">}</span><span class="se">\n\t\t</span><span class="si">#{</span><span class="n">e</span><span class="si">}</span><span class="se">\n</span><span class="s2">"</span>
<span class="k">end</span>
<span class="nb">puts</span> <span class="s2">" - Enumerating Keys for </span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">parent</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="se">\\</span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="s2">"</span>
<span class="k">begin</span>
<span class="n">root</span><span class="p">.</span><span class="nf">each_key</span> <span class="k">do</span> <span class="o">|</span><span class="n">key</span><span class="p">,</span> <span class="n">wtime</span><span class="o">|</span>
<span class="nb">puts</span> <span class="s2">" - read each_key </span><span class="si">#{</span><span class="n">key</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">rescue</span> <span class="no">Exception</span> <span class="o">=></span> <span class="n">e</span>
<span class="nb">puts</span> <span class="s2">" x failed to each_key from </span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">parent</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="se">\\</span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="se">\n\t\t</span><span class="si">#{</span><span class="n">e</span><span class="si">}</span><span class="se">\n</span><span class="s2">"</span>
<span class="k">end</span>
<span class="nb">puts</span> <span class="s2">" - Enumerating Values for </span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">parent</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="se">\\</span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="s2">"</span>
<span class="k">begin</span>
<span class="n">root</span><span class="p">.</span><span class="nf">each_value</span> <span class="k">do</span> <span class="o">|</span><span class="nb">name</span><span class="p">,</span> <span class="n">type</span><span class="p">,</span> <span class="n">value</span><span class="o">|</span>
<span class="nb">puts</span> <span class="s2">" - read each_value </span><span class="si">#{</span><span class="nb">name</span><span class="si">}</span><span class="s2"> as type: </span><span class="si">#{</span><span class="n">type</span><span class="si">}</span><span class="s2">, value: </span><span class="si">#{</span><span class="n">value</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">rescue</span> <span class="no">Exception</span> <span class="o">=></span> <span class="n">e</span>
<span class="nb">puts</span> <span class="s2">" x failed to each_value from </span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">parent</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="se">\\</span><span class="si">#{</span><span class="n">root</span><span class="p">.</span><span class="nf">keyname</span><span class="si">}</span><span class="se">\n\t\t</span><span class="si">#{</span><span class="n">e</span><span class="si">}</span><span class="se">\n</span><span class="s2">"</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">root</span> <span class="o">=</span> <span class="no">Win32</span><span class="o">::</span><span class="no">Registry</span><span class="o">::</span><span class="no">HKEY_CURRENT_USER</span>
<span class="n">root</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span><span class="s1">'SOFTWARE\rubyfail'</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">reg</span><span class="o">|</span>
<span class="c1"># create subkey with trademark symbol</span>
<span class="n">reg</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span><span class="n">tm_utf_16_str</span><span class="p">)</span>
<span class="c1"># create endash value named foo</span>
<span class="n">reg</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s1">'foo'</span><span class="p">,</span> <span class="no">Win32</span><span class="o">::</span><span class="no">Registry</span><span class="o">::</span><span class="no">REG_SZ</span><span class="p">,</span> <span class="n">endash_utf_16_str</span><span class="p">)</span>
<span class="n">test_with_encoding</span><span class="p">(</span><span class="n">reg</span><span class="p">,</span> <span class="s1">'foo'</span><span class="p">,</span> <span class="no">Encoding</span><span class="o">::</span><span class="no">WINDOWS_1252</span><span class="p">)</span>
<span class="c1"># failures with both enumeration of keys and values</span>
<span class="n">test_with_encoding</span><span class="p">(</span><span class="n">reg</span><span class="p">,</span> <span class="s1">'foo'</span><span class="p">,</span> <span class="no">Encoding</span><span class="o">::</span><span class="no">IBM437</span><span class="p">)</span>
<span class="k">end</span>
</code></pre>
<p>The important part is that you will see failures when calling <code>each_key</code> and <code>each_value</code> whenever the keys or values contain characters that cannot be converted to the current code page.</p>
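<p>The underlying conversion failure can be reproduced without touching the registry. The following is a minimal sketch, assuming only Ruby's built-in transcoders; <code>try_encode</code> is a hypothetical helper introduced for illustration, and <code>'v*'</code> is used so the packed bytes are explicitly little-endian. It encodes the same two characters from UTF-16LE into each of the code pages used above and records whether the conversion is defined.</p>

```ruby
# U+2122 (trademark sign) and U+2013 (en dash) as UTF-16LE strings,
# the same characters used in the registry example above.
tm_wide     = [0x2122].pack('v*').force_encoding(Encoding::UTF_16LE)
endash_wide = [0x2013].pack('v*').force_encoding(Encoding::UTF_16LE)

# Hypothetical helper: reports whether str can be transcoded to encoding.
def try_encode(str, encoding)
  str.encode(encoding)
  :ok
rescue Encoding::UndefinedConversionError
  :undefined_conversion
end

results = {
  tm_1252:     try_encode(tm_wide, Encoding::WINDOWS_1252),
  endash_1252: try_encode(endash_wide, Encoding::WINDOWS_1252),
  tm_437:      try_encode(tm_wide, Encoding::IBM437),
  endash_437:  try_encode(endash_wide, Encoding::IBM437)
}

results.each { |name, outcome| puts "#{name}: #{outcome}" }
```

<p>Windows-1252 has code points for both characters, while IBM437 has neither, which matches the failures seen only in the second <code>test_with_encoding</code> call.</p>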
https://redmine.ruby-lang.org/issues/11410?journal_id=53649
2015-08-03T01:49:50Z
nobu (Nobuyoshi Nakada)
nobu@ruby-lang.org
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Feedback</i></li></ul><p>I agree that unnecessary conversions should be removed, but your code won't work yet, since the results will be expected in the locale encoding.</p>
<p>What do you want?</p>
<ol>
<li>it's OK</li>
<li>return everything in UTF-8</li>
<li>add optional parameter to specify the result encoding</li>
<li>or others</li>
</ol>
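<p>Options 2 and 3 can be illustrated together with a small sketch. This is not the code in <code>win32/registry</code> or in any proposed patch; <code>export_wide_string</code> is a hypothetical helper that stands in for wherever the library converts the UTF-16LE data returned by the wide-character APIs.</p>

```ruby
# Hypothetical helper, not part of win32/registry: converts a UTF-16LE
# name or value to a caller-chosen encoding, defaulting to UTF-8.
def export_wide_string(wide_str, encoding = Encoding::UTF_8)
  wide_str.encode(encoding)
end

# U+2122 (trademark sign) as the registry would hand it back.
tm_wide = [0x2122].pack('v*').force_encoding(Encoding::UTF_16LE)

# Option 2: everything comes back as UTF-8.
utf8_name = export_wide_string(tm_wide)

# Option 3: the caller names the result encoding explicitly.
cp1252_name = export_wide_string(tm_wide, Encoding::WINDOWS_1252)

puts utf8_name.encoding    # UTF-8
puts cp1252_name.encoding  # Windows-1252
```

<p>With a UTF-8 default, the conversion cannot raise for well-formed UTF-16LE input; an exception is only possible when the caller asks for a narrower encoding.</p>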
https://redmine.ruby-lang.org/issues/11410?journal_id=53651
2015-08-03T02:08:19Z
nobu (Nobuyoshi Nakada)
nobu@ruby-lang.org
<ul></ul><p><a href="https://github.com/nobu/ruby/tree/bug/11410-win32-registry-encoding" class="external">https://github.com/nobu/ruby/tree/bug/11410-win32-registry-encoding</a></p>
https://redmine.ruby-lang.org/issues/11410?journal_id=53771
2015-08-13T06:57:28Z
Iristyle (Ethan Brown)
ethan_j_brown@hotmail.com
<ul></ul><p>I think the best solution here is to use UTF-8 strings wherever possible. If a program needs to use locale, then let the program decide to do that. I don't think Ruby should be making encoding decisions for a user like this, given Ruby is using wide character APIs and UTF-16LE strings.</p>
<p>While your proposed solution should work, I don't think the burden should be put on the calling code to set an encoding value everywhere <code>#each_value</code> or <code>#each_key</code> is used, just to prevent an exception. Keep in mind that <code>#keys</code> and <code>#values</code> call <code>#each_key</code> and <code>#each_value</code>, and with your solution there is no way to override the encoding when using those methods.</p>
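<p>To illustrate what "let the program decide" could look like, here is a minimal sketch under the assumption that enumeration returns UTF-8 strings. The name <code>utf8_name</code> is a stand-in for such a result and <code>to_code_page</code> is a hypothetical caller-side helper; neither comes from <code>win32/registry</code>.</p>

```ruby
# Stand-in for a key name that enumeration would return as UTF-8.
utf8_name = "Acme\u2122"

# Hypothetical caller-side helper: opt-in conversion to a legacy code page.
# Characters with no mapping are replaced with '?' instead of raising
# Encoding::UndefinedConversionError.
def to_code_page(str, encoding)
  str.encode(encoding, invalid: :replace, undef: :replace, replace: '?')
end

ibm437_name = to_code_page(utf8_name, Encoding::IBM437)

puts ibm437_name.encoding
puts ibm437_name
```

<p>A program that needs the locale encoding pays the conversion cost, and chooses its own policy for unmappable characters, only at the point where it needs that encoding.</p>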