Ruby Issue Tracking System: Issues
https://redmine.ruby-lang.org/
2022-06-16T04:01:34Z
Ruby master - Bug #18833 (Rejected): Documentation for IO#gets is inaccurate (bytes versus charac...
https://redmine.ruby-lang.org/issues/18833
2022-06-16T04:01:34Z, adh1003 (Andrew Hodgkinson), ahodgkin@rowing.org.uk
<p>Please see <a href="https://ruby-doc.org/core-3.1.2/IO.html#method-i-gets" class="external">https://ruby-doc.org/core-3.1.2/IO.html#method-i-gets</a>:</p>
<blockquote>
<p>With integer argument <code>limit</code> given, returns up to <code>limit+1</code> bytes:</p>
</blockquote>
<p>In relation to <a href="https://github.com/janko/down/pull/74" class="external">https://github.com/janko/down/pull/74</a>, I discovered a difference between <code>IO#read</code> and <code>IO#gets</code>. When asked to read a specific number of bytes, <code>IO#read</code> ignores the stream's specified encoding and does exactly that: it reads the requested number of 8-bit bytes. <code>IO#gets</code>, in contrast, respects the encoding if given a <code>limit</code>, and the <strong>number provided is characters, not bytes</strong>. This means that more bytes than requested might be read from the file (advancing its file pointer accordingly), both because of things like a BOM and because of multi-byte encodings. Moreover, the number of bytes in the returned data can exceed the number passed to the method (because it's a number of characters, contrary to the documentation), and the returned data won't necessarily include some bytes from the very start of the file (a UTF-8 BOM is stripped, for example). <code>IO#gets</code> <em>does</em> correctly handle a multibyte character that would be split at the requested limit if that limit were taken as bytes: it continues reading more bytes until it has read the requested number of complete characters.</p>
<p>(It is in fact clearly unavoidable that it works in an encoding-aware fashion, else it would be unable to accurately interpret the <code>sep</code> parameter. Coercing everything down to a pure 8-bit byte stream and trying to dumb-match the stream that way would risk mismatching a separator byte stream within the wider file byte stream at a non-character boundary).</p>
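<p>A minimal sketch of the difference described above, assuming a UTF-8 file containing three 3-byte characters. The helper <code>with_utf8_file</code> and the sample text are hypothetical and exist only for this illustration.</p>

```ruby
require "tempfile"

# Hypothetical helper: writes `text` to a temporary file, reopens it as
# UTF-8 text and yields the open File to the block.
def with_utf8_file(text)
  Tempfile.create("gets-limit-demo") do |tmp|
    tmp.binmode
    tmp.write(text)
    tmp.flush
    File.open(tmp.path, "r:UTF-8") { |file| yield file }
  end
end

# Three characters that each take 3 bytes in UTF-8, then a newline.
sample = "\u3042\u3044\u3046\n"

# IO#read with a length returns exactly that many bytes as binary data,
# even if that cuts a character in half.
read_result = with_utf8_file(sample) { |f| f.read(1) }

# IO#gets with a limit does not return a partial character, so the result
# can hold more bytes than the number passed in.
gets_result = with_utf8_file(sample) { |f| f.gets(1) }

puts "read(1): #{read_result.bytesize} byte(s), encoding #{read_result.encoding}"
puts "gets(1): #{gets_result.bytesize} byte(s), #{gets_result.length} character(s)"
```

<p>In this sketch, <code>read(1)</code> yields a single binary byte, while <code>gets(1)</code> yields one complete character made of three bytes.</p>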
<p>This is causing confusion for people implementing IO subclasses or IO-like classes and I'm sure you recognise that it is of critical importance that the distinction between bytes and characters is made accurately, especially in such a crucial low-level piece of documentation as IO.</p>
<p>If you wish, I can have a go at figuring out a PR for it (not really done that outside of GitHub before, so something of a learning curve!).</p>
Ruby master - Feature #16276 (Open): For consideration: "private do...end" / "protected do...end"
https://redmine.ruby-lang.org/issues/16276
2019-10-23T19:49:23Z, adh1003 (Andrew Hodgkinson), ahodgkin@rowing.org.uk
<p>Private or protected declarations in Ruby classes are problematic. A single, standalone <code>public</code>, <code>private</code> or <code>protected</code> statement causes all following methods - <em>except</em> "private" class methods, notably - to have that protection level. It is not idiomatic in Ruby to indent method definitions after such declarations, so it becomes very hard to see at a glance what a method's protection level is when just diving into a piece of source code. One must carefully scroll <em>up</em> the code searching for a relevant declaration (easily missed, when everything's at the same indentation level) or have an IDE sufficiently advanced to give you that information automatically (and none of the lightweight editors I personally prefer supports this yet). Forcibly indenting code after declarations helps, but most Ruby developers find this unfamiliar and most auto-formatters/linters will reset it or, at best, complain. Further, the difficulty in defining private <em>class</em> methods or constants tells us that perhaps there's more we should do here - but of course, we want to maintain backwards compatibility.</p>
<p>On the face of it, I can't see much standing in the way of allowing the <code>public</code>, <code>private</code> or <code>protected</code> declarations to - <em>optionally</em> - support a block-like syntax.</p>
<pre><code>class Foo

  # ...there may be prior old-school public/private/protected declarations...

  def method_at_whatever_traditional_ruby_protection_level_applies
    puts "I'm traditional"
  end

  private do
    def some_private_instance_method
      puts "I'm private"
    end

    def self.some_private_class_method
      puts "I'm also private - principle of least surprise"
    end

    NO_NEED_FOR_PRIVATE_CONSTANT_DECLARATIONS_EITHER = "private"
  end

  def another_method_at_whatever_traditional_ruby_protection_level_applies
    puts "I'm also traditional"
  end
end
</code></pre>
<p>My suggestion restricts all <code>public do...end</code>, <code>protected do...end</code> or <code>private do...end</code> protections strictly to the block itself. Outside the block - both before and after - traditional Ruby protection semantics apply, so one can add new block-based, protection-enclosed method declarations to any existing code base without fear of accidentally changing the protection level of any methods defined below the new block. As noted in the pseudocode above, we can clean up some of the issues around the special syntax needed for "private constants", too.</p>
<p>I see a lot of wins in here but I'm aware I may be naïve - for example, unanswered questions that arise include:</p>
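<p>For comparison, the block-scoped behaviour can be roughly approximated in current Ruby with a small helper. This is only a sketch under stated assumptions: <code>BlockVisibility</code> and <code>private_block</code> are hypothetical names, not part of Ruby. The helper covers newly defined instance methods only; it does not handle <code>def self.…</code> class methods, constants, or redefinitions of methods that already exist.</p>

```ruby
# Hypothetical helper, not part of Ruby: approximates a block-scoped
# `private` for instance methods that are newly defined inside the block.
module BlockVisibility
  def private_block(&block)
    known = instance_methods(false) + private_instance_methods(false)
    class_eval(&block)
    added = (instance_methods(false) + private_instance_methods(false)) - known
    # Calling `private` with explicit names changes only those methods, so
    # the visibility of methods defined after the block is left alone.
    private(*added) unless added.empty?
  end
end

class Foo
  extend BlockVisibility

  def visible_before
    "I'm traditional"
  end

  private_block do
    def some_private_instance_method
      "I'm private"
    end
  end

  # Still public: the block above did not change the default visibility.
  def visible_after
    some_private_instance_method
  end
end

puts Foo.new.visible_after
```

<p>Because the helper passes explicit method names to <code>private</code>, methods defined before and after the block keep whatever protection level traditionally applies, which is the property the proposal asks for.</p>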
<ul>
<li>Is the use of a block-like syntax making unwarranted assumptions about what the Ruby compiler can do during its various parsing phases?</li>
<li>Does the use of a block-like syntax imply we should support things like Procs too? (I <em>think</em> probably not - I see this as just syntax sugar to provide a new feature reusing a familiar idiom but without diving down any other rabbit holes, at least not in the first implementation)</li>
</ul>
<p>I've no idea how one would go about implementing this inside Ruby Core, as I've never tackled that before. If someone is keen to pick up the feature, great! Alternatively, if a rough idea of how it <em>might</em> be implemented could be sketched out, then I might be able to have a go at implementation myself and submit a PR - assuming anyone is keen on the idea in the first place <code>:-)</code></p>
Ruby master - Bug #16165 (Closed): Endless ranges have inconsistency between #cover? and #include?
https://redmine.ruby-lang.org/issues/16165
2019-09-11T21:59:54Z, adh1003 (Andrew Hodgkinson), ahodgkin@rowing.org.uk
<p>In an endless Range, I'd expect to be able to use <code>#include?</code> just as I do with a Range that has an end value. It would amount to just a check on whether the argument was greater than or equal to the start value of that Range (and likewise, if "startless" ranges are eventually supported, it'd just be a check on the Range's end value, or if it were possible to have both a startless <em>and</em> endless Range, it would always return <code>true</code>).</p>
<p>In Ruby 2.6.4, behaviour is unexpected. I "need to know" if a Range is endless and implement the check myself, because using <code>#include?</code> results in <code>RangeError (cannot get the last element of endless range)</code>. The reason this feels like a bug, apart from being merely unhelpful, is that <code>#cover?</code> also accepts a single value rather than only a Range - and <em>this</em> behaves exactly as we might expect. I can ask <code>(1..).cover?(0)</code> and get <code>false</code>, or <code>(1..).cover?(2000)</code> and get <code>true</code>, just like I'd expect.</p>
<p>It's weird that <code>#cover?</code> works, but <code>#include?</code> throws an unusual exception. In my particular use case I have an array of Ranges for various periods of time that typically terminates in an endless Range for "everything afterwards". Being able to write clear, simple code that treats each of these the same to see if a given discrete time falls into a particular time "bucket" is obviously valuable. It'd be ugly to have to check each time to see if a given Range was endless and treat it specially, just because <code>#include?</code> would otherwise raise an exception. In this case, at least, <code>#cover?</code> comes to the rescue :-)</p>
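<p>A minimal sketch of the "bucket" use case using <code>#cover?</code>, assuming Ruby 2.6 or later for the endless Range syntax. The bucket boundaries and the <code>bucket_for</code> helper are hypothetical, and integers stand in for the times mentioned above.</p>

```ruby
# Hypothetical buckets; the final endless Range catches "everything afterwards".
BUCKETS = [(0...10), (10...100), (100..)].freeze

# Returns the first bucket whose bounds contain `value`, or nil if none does.
# #cover? only compares against the bounds, so an endless Range needs no
# special treatment.
def bucket_for(value)
  BUCKETS.find { |range| range.cover?(value) }
end

p bucket_for(5)
p bucket_for(2000)
p bucket_for(-1)
```

<p>Every Range in the array is treated the same way, with no check for whether a given Range is endless.</p>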
<p>If there's a general agreement that this should be a thing, I could look into producing a patch to have <code>#include?</code> be a bit better behaved when dealing with endless ranges.</p>
Ruby master - Bug #11657 (Closed): Abort Trap 6 when running a test suite
https://redmine.ruby-lang.org/issues/11657
2015-11-05T03:02:49Z, adh1003 (Andrew Hodgkinson), ahodgkin@rowing.org.uk
<p>An internal Ruby gem I develop for my company has a test suite that works fine on Ruby 2.1.x but crashes on <strong>2.2.3 and 2.3.0-dev</strong> with:</p>
<pre><code>[BUG] Stack consistency error (sp: 273, bp: 271)
</code></pre>
<p>I've tried this on both OS X (10.11.1) and a Debian build in a VirtualBox VM to try to eliminate OS X as the problem, with the same results (as in, an abort and a 'stack consistency error' in the logs). I have attached the backtrace log data from both the OS X and Debian builds, from Ruby 2.2.3p173 (though as I say, I did try 2.3.0-dev too and the same stack error arose).</p>
<p>At present, the component in question is closed source. We are actually planning to open source it, but it'll be a while. I'm unable to replicate this as an isolated test case at present, I'm afraid - it seems quite a lot of "stuff" needs to happen before it dies.</p>