Feature #20576
Add MatchData#bytebegin and MatchData#byteend (closed)
Description
I'd like to propose MatchData#bytebegin and MatchData#byteend.
These methods are similar to MatchData#begin and MatchData#end, but return offsets in bytes instead of codepoints.
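To make the difference concrete, here is a minimal sketch comparing the two kinds of offsets on a multibyte string. The sample text and variable names are assumptions for illustration only; on interpreters without the proposed methods, the sketch falls back to MatchData#byteoffset.

```ruby
# Illustrative sketch; the sample string and variable names are assumptions.
text = "あいう abc"            # each kana occupies 3 bytes in UTF-8
m = text.match(/abc/)

# Offsets counted in codepoints.
char_range = [m.begin(0), m.end(0)]

# Offsets counted in bytes. Prefer the proposed methods and fall back to
# byteoffset on interpreters that do not provide them.
byte_range =
  if m.respond_to?(:bytebegin)
    [m.bytebegin(0), m.byteend(0)]
  else
    m.byteoffset(0)
  end

p char_range  # [4, 7]
p byte_range  # [10, 13]

# Byte offsets can be fed directly to byte-oriented String methods.
p text.byteslice(byte_range[0], byte_range[1] - byte_range[0])  # "abc"
```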
Pull request: https://github.com/ruby/ruby/pull/10973
One of the use cases is scanning strings: https://github.com/ruby/net-imap/pull/286/files
MatchData#byteend is faster than MatchData#byteoffset because there is no need to allocate an Array.
Here's a benchmark result:
voyager:ruby$ cat b.rb
require "benchmark"
require "strscan"

text = "あ" * 100000

Benchmark.bmbm do |b|
  b.report("byteoffset(0)[1]") do
    pos = 0
    while text.byteindex(/\G./, pos)
      pos = $~.byteoffset(0)[1]
    end
  end

  b.report("byteend(0)") do
    pos = 0
    while text.byteindex(/\G./, pos)
      pos = $~.byteend(0)
    end
  end
end
voyager:ruby$ ./tool/runruby.rb b.rb
Rehearsal ----------------------------------------------------
byteoffset(0)[1]   0.020558   0.000393   0.020951 (  0.020963)
byteend(0)         0.018149   0.000000   0.018149 (  0.018151)
------------------------------------------- total: 0.039100sec

                       user     system      total        real
byteoffset(0)[1]   0.020821   0.000000   0.020821 (  0.020822)
byteend(0)         0.017455   0.000000   0.017455 (  0.017455)
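
The Array allocation mentioned above can also be observed directly, independent of timing. The following is a rough sketch, assuming CRuby's GC.stat(:total_allocated_objects) counter; the allocated_objects helper is hypothetical and not part of the proposal, and exact counts may vary between Ruby versions.

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
# Relies on CRuby's cumulative GC.stat(:total_allocated_objects) counter.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

m = "あいう abc".match(/abc/)
n = 10_000

# byteoffset returns a fresh two-element Array on every call.
via_byteoffset = allocated_objects { n.times { m.byteoffset(0)[1] } }
puts "byteoffset(0)[1]: #{via_byteoffset} objects"

# byteend is only measured when the interpreter provides it.
if m.respond_to?(:byteend)
  via_byteend = allocated_objects { n.times { m.byteend(0) } }
  puts "byteend(0):       #{via_byteend} objects"
end
```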