Ruby Issue Tracking System: Issues
https://redmine.ruby-lang.org/
2024-01-08T13:58:54Z
Ruby Issue Tracking System
Redmine
Ruby master - Feature #20160 (Rejected): rescue keyword for case expressions
https://redmine.ruby-lang.org/issues/20160
2024-01-08T13:58:54Z
lloeki (Loic Nageleisen)
<p>Code like this hypothetical Ruby snippet is common:</p>
<pre><code> case (parsed = parse(input))
when Integer then handle_int(parsed)
when Float then handle_float(parsed)
end
</code></pre>
<p>What if we need to handle <code>parse</code> raising a hypothetical <code>ParseError</code>? Currently this can be done in two ways.</p>
<p>Either option A, wrapping <code>case .. end</code>:</p>
<pre><code>begin
  case (parsed = parse(input))
  when Integer then handle_int(parsed)
  when Float then handle_float(parsed)
  # ...
  end
rescue ParseError
  # ...
end
</code></pre>
<p>Or option B, guarding before <code>case</code>:</p>
<pre><code>begin
  parsed = parse(input)
rescue ParseError
  # ...
end
case parsed
when Integer then handle_int(parsed)
when Float then handle_float(parsed)
# ...
end
</code></pre>
<p>The difference between option A and option B is that:</p>
<ul>
<li>option A <code>rescue</code> is not localised to parsing and also covers code following <code>when</code> (including calling <code>===</code>), <code>then</code>, and <code>else</code>, which may or may not be what one wants.</li>
<li>option B <code>rescue</code> is localised to parsing but moves the definition of the variable (<code>parsed</code>) and the call to what is actually done (<code>parse(input)</code>) far away from <code>case</code>.</li>
</ul>
<p>With option B, in some cases the variable needs to be introduced even though it might not be needed in the <code>then</code> parts (e.g. if the call in <code>case</code> has side effects, or if its value is only used for branching).</p>
<p>The difference becomes important when rescued exceptions are more general (e.g. <code>Errno</code> errors, <code>ArgumentError</code>, etc.), as well as when we consider <code>ensure</code> and <code>else</code>. I feel like option B is the most sensible one in general, but it adds a lot of noise and splits the logic in two parts.</p>
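<p>To make the difference concrete, here is a minimal sketch in current Ruby. The <code>parse</code> and <code>handle_int</code> methods are hypothetical stand-ins, and <code>ArgumentError</code> is used as the "more general" exception: with option A an error raised by the handler is misreported as a parse failure, while with option B it propagates.</p>

```ruby
# Hypothetical stand-ins for the parse/handle_int calls used above.
def parse(input)
  Integer(input) # raises ArgumentError on malformed input
end

def handle_int(value)
  raise ArgumentError, "negative values unsupported" if value.negative?
  value * 2
end

# Option A: rescue wraps the whole case, so it also catches handler errors.
def option_a(input)
  begin
    case (parsed = parse(input))
    when Integer then handle_int(parsed)
    end
  rescue ArgumentError
    :parse_failed
  end
end

# Option B: rescue wraps only the parse call; handler errors propagate.
def option_b(input)
  begin
    parsed = parse(input)
  rescue ArgumentError
    return :parse_failed
  end
  case parsed
  when Integer then handle_int(parsed)
  end
end

p option_a("abc") # => :parse_failed
p option_a("-1")  # => :parse_failed (the handler's error is swallowed)
p option_b("abc") # => :parse_failed
begin
  option_b("-1")
rescue ArgumentError => e
  puts "handler error propagated: #{e.message}"
end
```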
<p>I would like to suggest a new syntax:</p>
<pre><code>case (parsed = parse(input))
when Integer then handle_int(parsed)
when Float then handle_float(parsed)
rescue ParseError
  # ...
rescue ArgumentError
  # ...
else
  # ... fallthrough for all rescue and when cases
ensure
  # ... called always
end
</code></pre>
<p>If more readability is needed as to what these <code>rescue</code> clauses are meant to handle (making it explicit that this is option B), one could optionally write it like this:</p>
<pre><code>case (parsed = parse(input))
rescue ParseError
  # ...
rescue ArgumentError
  # ...
when Integer then handle_int(parsed)
when Float then handle_float(parsed)
...
else
  # ...
ensure
  # ...
end
</code></pre>
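<p>For comparison, one way to approximate this in current Ruby is to use <code>begin</code>/<code>rescue</code> as an expression for the <code>case</code> subject, so the exception object itself gets matched by a <code>when</code> branch. This is only a sketch of an alternative, not part of the proposal; <code>ParseError</code>, <code>parse</code> and <code>dispatch</code> are hypothetical names.</p>

```ruby
class ParseError < StandardError; end

# Hypothetical parser: returns an Integer or a Float, raises ParseError otherwise.
def parse(input)
  Integer(input, exception: false) ||
    Float(input, exception: false) ||
    raise(ParseError, "cannot parse #{input.inspect}")
end

def dispatch(input)
  case (parsed = begin
                   parse(input)
                 rescue ParseError => e
                   e # the exception object becomes the case subject
                 end)
  when Integer    then "int: #{parsed}"
  when Float      then "float: #{parsed}"
  when ParseError then "error: #{parsed.message}"
  end
end

puts dispatch("42")  # prints: int: 42
puts dispatch("1.5") # prints: float: 1.5
puts dispatch("abc") # prints: error: cannot parse "abc"
```

<p>The rescue stays localised to the call, but the trick conflates "raised an exception" with "returned an exception object", and it gives no equivalent of <code>ensure</code>.</p>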
<p>The <code>ensure</code> keyword could also be used without <code>rescue</code> in assignment contexts:</p>
<pre><code>foo = case bar.perform
when A then 1
when B then 2
ensure bar.done!
end
</code></pre>
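<p>In current Ruby, the closest equivalent wraps the <code>case</code> in <code>begin</code>/<code>ensure</code>: the assigned value is that of the <code>case</code>, and the return value of the <code>ensure</code> clause is discarded. A minimal sketch, where the <code>Job</code> class is a hypothetical stand-in for <code>bar</code> and symbols replace the constants <code>A</code> and <code>B</code>:</p>

```ruby
# Hypothetical object standing in for `bar` above.
class Job
  attr_reader :done

  def initialize(result)
    @result = result
    @done = false
  end

  def perform
    @result
  end

  def done!
    @done = true
  end
end

bar = Job.new(:b)

foo = begin
  case bar.perform
  when :a then 1
  when :b then 2
  end
ensure
  bar.done! # always runs; its return value does not affect foo
end

p foo      # => 2
p bar.done # => true
```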
<p>Examples:</p>
<ul>
<li>A made-up pubsub streaming parser with internal state, abstracting away reading from source:</li>
</ul>
<pre><code>parser = Parser.new(io)
loop do
  case parser.parse # blocks for reading io in chunks
  rescue StandardError => e
    if parser.can_recover?(e)
      # tolerate failure, ignore
      next
    else
      emit_fail(e)
      break
    end
  when :integer
    emit_integer(parser.last)
  when :float
    emit_float(parser.last)
  when :done
    # e.g. EOF reached, IO closed, YAML --- end of doc, XML top-level closed, whatever makes sense
    emit_done
    break
  else
    parser.rollback # e.g. rewinds io, we may not have enough data
  ensure
    parser.checkpoint # e.g. saves io position for rollback
  end
end
</code></pre>
<ul>
<li>Network handling, extrapolated from <a href="https://ruby-doc.org/stdlib-2.7.1/libdoc/net/http/rdoc/Net/HTTP.html#class-Net::HTTP-label-Following+Redirection" class="external">ruby docs</a>:</li>
</ul>
<pre><code>case (response = Net::HTTP.get_response(URI(uri_str)))
rescue URI::InvalidURIError
  # handle URI errors
rescue SocketError
  # handle socket errors
rescue
  # other general errors
when Net::HTTPSuccess
  response
when Net::HTTPRedirection then
  location = response['location']
  warn "redirected to #{location}"
  fetch(location, limit - 1)
else
  response.value
ensure
  @counter += 1
end
</code></pre>
<p>Credit: the idea initially came to me from <a href="https://inside.java/2023/12/15/switch-case-effect/" class="external">this article</a>, and from thinking about how it could apply to Ruby.</p>
Ruby master - Bug #13806 (Closed): StringIO encoding conversion
https://redmine.ruby-lang.org/issues/13806
2017-08-11T13:12:39Z
lloeki (Loic Nageleisen)
<p>StringIO's doc page says:</p>
<blockquote>
<p>Pseudo I/O on String object.</p>
<p>Commonly used to simulate <code>$stdio</code> or <code>$stderr</code></p>
</blockquote>
<p>As it turns out, this is precisely my use case, as I was writing some tests that boiled down to something like this (highly simplified):</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">s</span> <span class="o">=</span> <span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">)</span>
<span class="n">stuff</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">s</span><span class="p">).</span><span class="nf">do_something</span>
<span class="n">assert_equal</span> <span class="s2">"foo"</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="nf">tap</span><span class="p">(</span><span class="o">&</span><span class="ss">:rewind</span><span class="p">).</span><span class="nf">read</span>
</code></pre>
<p>The result of which was in my case:</p>
<pre><code class="diff syntaxhl" data-language="diff"><span class="gd">--- expected
</span><span class="gi">+++ actual
</span><span class="p">@@ -1,2 +1,2 @@</span>
<span class="gd">-"foo"
</span><span class="gi">+# encoding: ASCII-8BIT
+""
</span></code></pre>
<p>Indeed I had a bug so my test was supposed to fail ("foo" vs "") but what caught my eye was the encoding issue.</p>
<p>So I did some comparison tests, and behaviours differ significantly:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">f</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="no">File</span><span class="o">::</span><span class="no">CREAT</span> <span class="o">|</span> <span class="no">File</span><span class="o">::</span><span class="no">RDWR</span><span class="p">)</span>
<span class="n">f</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">)</span> <span class="c1"># => 3</span>
<span class="n">f</span><span class="p">.</span><span class="nf">rewind</span> <span class="c1"># => 0</span>
<span class="n">f</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">f</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => nil</span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:UTF-8></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:UTF-8></span>
<span class="n">s</span> <span class="o">=</span> <span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">)</span> <span class="c1"># => #<StringIO:0x007f879e9e54d0></span>
<span class="n">s</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">s</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-8></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:UTF-8></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:ASCII-8BIT></span>
</code></pre>
<p>There's that subtle little issue at EOF. So, what about "w+"?:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">f</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="s2">"w+"</span><span class="p">)</span> <span class="c1"># => #<File:foo></span>
<span class="n">f</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">)</span> <span class="c1"># => 3</span>
<span class="n">f</span><span class="p">.</span><span class="nf">rewind</span> <span class="c1"># => 0</span>
<span class="n">f</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">f</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => nil</span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:UTF-8></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:UTF-8></span>
<span class="n">s</span> <span class="o">=</span> <span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="s2">"w+"</span><span class="p">)</span> <span class="c1"># => #<StringIO:0x007f879e81f268></span>
<span class="n">s</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">s</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-8></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:ASCII-8BIT></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:ASCII-8BIT></span>
</code></pre>
<p>Somehow it makes StringIO always behave as binary on #read. Hmmm.</p>
<p>Let's try binary. IO's doc says:</p>
<blockquote>
<p>"b" Binary file mode<br>
Suppresses EOL <-> CRLF conversion on Windows. And<br>
sets external encoding to ASCII-8BIT unless explicitly<br>
specified.</p>
</blockquote>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">f</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="s2">"w+b"</span><span class="p">)</span> <span class="c1"># => #<File:foo></span>
<span class="n">f</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">)</span> <span class="c1"># => 3</span>
<span class="n">f</span><span class="p">.</span><span class="nf">rewind</span> <span class="c1"># => 0</span>
<span class="n">f</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">f</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:ASCII-8BIT></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:ASCII-8BIT></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:ASCII-8BIT></span>
<span class="n">s</span> <span class="o">=</span> <span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="s2">"w+b"</span><span class="p">)</span> <span class="c1"># => #<StringIO:0x007f879f0bd460></span>
<span class="n">s</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">s</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-8></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:ASCII-8BIT></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:ASCII-8BIT></span>
</code></pre>
<p>Close, but no cigar: external_encoding is still incorrect, and #read couldn't care less. Let's try making things explicit:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">f</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="s2">"w+b:ASCII-8BIT:ASCII-8BIT"</span><span class="p">)</span>
<span class="n">f</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">)</span> <span class="c1"># => 3</span>
<span class="n">f</span><span class="p">.</span><span class="nf">rewind</span> <span class="c1"># => 0</span>
<span class="n">f</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">f</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-8></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:UTF-8></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:UTF-8></span>
<span class="n">s</span> <span class="o">=</span> <span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">""</span><span class="p">,</span> <span class="s2">"w+b:ASCII-8BIT:ASCII-8BIT"</span><span class="p">)</span>
<span class="n">s</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">s</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-8></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:ASCII-8BIT></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:ASCII-8BIT></span>
</code></pre>
<p>Nope, external_encoding still wrong. Anyway, in my case I was looking for UTF-8, so what about that?</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">f</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="s2">"w+b:UTF-8:UTF-8"</span><span class="p">)</span> <span class="c1"># => #<File:foo></span>
<span class="n">f</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">)</span> <span class="c1"># => 3</span>
<span class="n">f</span><span class="p">.</span><span class="nf">rewind</span> <span class="c1"># => 0</span>
<span class="n">f</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">f</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-8></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:UTF-8></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:UTF-8></span>
<span class="n">s</span> <span class="o">=</span> <span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">""</span><span class="p">,</span> <span class="s2">"w+b:UTF-8:UTF-8"</span><span class="p">)</span> <span class="c1"># => #<StringIO:0x007fd531cb9248></span>
<span class="n">s</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">s</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-8></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:ASCII-8BIT></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:ASCII-8BIT></span>
</code></pre>
<p>StringIO keeps insisting on its binary output irrespective of the mode argument as described in the doc. Last resort, forcing text mode:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">f</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="s2">"w+t:UTF-8:UTF-8"</span><span class="p">)</span> <span class="c1"># => #<File:foo></span>
<span class="n">f</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">)</span> <span class="c1"># => 3</span>
<span class="n">f</span><span class="p">.</span><span class="nf">rewind</span> <span class="c1"># => 0</span>
<span class="n">f</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">f</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-8></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:UTF-8></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:UTF-8></span>
<span class="n">s</span> <span class="o">=</span> <span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">""</span><span class="p">,</span> <span class="s2">"w+t:UTF-8:UTF-8"</span><span class="p">)</span> <span class="c1"># => #<StringIO:0x007f879f04fc08></span>
<span class="n">s</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">s</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-8></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:ASCII-8BIT></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:ASCII-8BIT></span>
</code></pre>
<p>Same. Anyway, one last time, let's go nuts:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">f</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="s2">"w+:UTF-16:UTF-32"</span><span class="p">)</span> <span class="c1"># => #<File:foo></span>
<span class="n">f</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">)</span> <span class="c1"># => 3</span>
<span class="n">f</span><span class="p">.</span><span class="nf">rewind</span> <span class="c1"># => 0</span>
<span class="n">f</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => #<Encoding:UTF-32 (dummy)></span>
<span class="n">f</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-16 (dummy)></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:UTF-32 (dummy)></span>
<span class="n">f</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:UTF-32 (dummy)></span>
<span class="n">s</span> <span class="o">=</span> <span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">""</span><span class="p">,</span> <span class="s2">"w+:UTF-16:UTF-32"</span><span class="p">)</span> <span class="c1"># => #<StringIO:0x007f879f04fc08></span>
<span class="n">s</span><span class="p">.</span><span class="nf">internal_encoding</span> <span class="c1"># => nil</span>
<span class="n">s</span><span class="p">.</span><span class="nf">external_encoding</span> <span class="c1"># => #<Encoding:UTF-8></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "foo" # => #<Encoding:ASCII-8BIT></span>
<span class="n">s</span><span class="p">.</span><span class="nf">read</span><span class="p">.</span><span class="nf">encoding</span> <span class="c1"># reads "" at EOF # => #<Encoding:ASCII-8BIT></span>
</code></pre>
<p>I think the result speaks for itself.</p>
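<p>The comparisons above can be re-run on any Ruby with a small script along these lines. It is a sketch: the <code>probe</code> helper and the subset of modes are my own choices for illustration, and since this issue is marked Closed, the values printed by a recent Ruby may differ from the ones quoted above.</p>

```ruby
require "stringio"
require "tempfile"

# A subset of the open modes tried above (chosen for illustration).
MODES = ["w+", "w+b", "w+:UTF-8"].freeze

# Write "foo", rewind, then record the encoding of the data and of the
# empty string returned by a second read at EOF.
def probe(io)
  io.write("foo")
  io.rewind
  {
    external: io.external_encoding,
    data: io.read.encoding,
    eof: io.read.encoding
  }
end

results = MODES.map do |mode|
  file_report = Tempfile.create("probe") do |tmp|
    File.open(tmp.path, mode) { |f| probe(f) }
  end
  string_report = probe(StringIO.new(+"", mode))
  [mode, file_report, string_report]
end

results.each do |mode, file_report, string_report|
  puts "mode #{mode.inspect}"
  puts "  File:     #{file_report.inspect}"
  puts "  StringIO: #{string_report.inspect}"
end
```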
<p>In my specific case I quickly found workarounds, but this makes for brittle code and tests. Sometimes this involves faking StringIO with an actual temp file, which is, let's say, subpar.</p>
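<p>As an illustration, one possible workaround (an assumption on my part for this sketch, not necessarily the one used in the original code) is to relabel whatever <code>#read</code> returns. The <code>read_as</code> helper is hypothetical:</p>

```ruby
require "stringio"

# Hypothetical helper: read the rest of the stream and tag the result with the
# wanted encoding. force_encoding only relabels the bytes, it does not
# transcode, so this is only valid if the bytes already are in that encoding.
def read_as(io, encoding = Encoding::UTF_8)
  io.read.force_encoding(encoding)
end

s = StringIO.new(+"")
s.write("foo")
s.rewind

data = read_as(s)   # content
at_eof = read_as(s) # "" at EOF

p data.encoding   # => #<Encoding:UTF-8>
p at_eof.encoding # => #<Encoding:UTF-8>
```

<p>This keeps tests on StringIO, at the cost of every reader having to know about the helper.</p>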
<p>Tangentially related: StringIO is missing quite a few methods compared to IO. This forces code to be aware of StringIO (which is IMHO not good, e.g. it breaks code coverage in tests), to monkeypatch StringIO, or to make creative (ahem) use of temp files and thus hit the filesystem.</p>
<p>This seems tied to an old-ish issue: <a href="https://bugs.ruby-lang.org/issues/7964" class="external">https://bugs.ruby-lang.org/issues/7964</a></p>