Feature #19315

Updated by Eregon (Benoit Daloze) over 1 year ago

CRuby should implement lazy substrings, i.e., "abcdef"[1..3] must not copy bytes. 

 Currently CRuby only reuses the char* if the substring extends to the end of the buffer. 
 But it should also work wherever the substring starts and ends. 
 Yes, this means RSTRING_PTR() might need to allocate in order to \0-terminate; so be it, it's worth it. 

 There is already code for this (`SHARABLE_MIDDLE_SUBSTRING`), but it's disabled by default and `RSTRING_PTR()` needs to be changed to deal with this. 
 It seems a good idea to introduce a variant of `RSTRING_PTR` which doesn't guarantee \0-termination, so that callers which don't need termination can always use the existing bytes without a copy. 

 There are countless workarounds for this missing optimization, none of them worth it and all of them less readable: 
 * https://bugs.ruby-lang.org/issues/19314 
 * https://bugs.ruby-lang.org/issues/18598#note-3 
 * https://github.com/ruby/net-protocol/pull/14 
 * Manual lazy substrings which track string + index + length 
 * There are more, but I don't remember them all right now; feel free to comment or link more URLs/tickets.
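
 For illustration, the "manual lazy substring" workaround from the list above tends to look roughly like the sketch below. `LazySubstring` and its methods are hypothetical names made up for this example, not an existing Ruby or CRuby API, and the sketch assumes the source string is not mutated while views exist.

```ruby
# Hypothetical helper, not part of Ruby: a view that tracks
# string + byte index + byte length instead of copying bytes.
class LazySubstring
  attr_reader :offset, :length

  def initialize(source, offset, length)
    if offset < 0 || length < 0 || offset + length > source.bytesize
      raise ArgumentError, "view out of bounds"
    end
    @source = source
    @offset = offset
    @length = length
  end

  # Narrow the view; only the index and length change, no bytes are copied.
  def slice(start, len)
    if start < 0 || len < 0 || start + len > @length
      raise ArgumentError, "slice out of bounds"
    end
    LazySubstring.new(@source, @offset + start, len)
  end

  # Read a single byte relative to the start of the view.
  def getbyte(index)
    return nil if index < 0 || index >= @length
    @source.getbyte(@offset + index)
  end

  # Materialize a real String only when one is actually needed.
  def to_s
    @source.byteslice(@offset, @length)
  end
end

view  = LazySubstring.new("abcdef", 1, 3)
inner = view.slice(1, 2)
puts view.to_s   # prints "bcd"
puts inner.to_s  # prints "cd"
```

 Every caller has to pass around and understand the wrapper instead of a plain String, which is the readability cost that a lazy `"abcdef"[1..3]` in CRuby itself would avoid.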
