Feature #13890
Open
Allow a regexp as an argument to 'count', to count more interesting things than single characters
Description
Currently, String#count only accepts strings, and counts the characters in the receiver that belong to the given set of characters.
However, I have repeatedly met the situation where I wanted to count more interesting things in strings.
These 'interesting things' can easily be expressed with regular expressions.
Here is a quick-and-dirty Ruby-level implementation:
class String
  alias old_count count
  def count(what)
    case what
    when String
      old_count what
    when Regexp
      pos = -1
      count = 0
      count += 1 while pos = index(what, pos+1)
      count
    end
  end
end
Please note that the implementation counts overlapping occurrences; maybe there is room for an option like overlap: :no.
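To make the difference concrete, here is a small sketch contrasting today's String#count with the overlapping loop from the description. The helper name count_overlapping is hypothetical; it exists only so the example does not have to reopen String.

```ruby
# Hypothetical helper mirroring the index-based loop from the description,
# written as a plain method instead of a String patch.
def count_overlapping(str, regexp)
  pos = -1
  count = 0
  # String#index returns nil once no match starts at or after pos + 1.
  count += 1 while (pos = str.index(regexp, pos + 1))
  count
end

# String#count treats its argument as a set of characters, not a substring:
# this counts every 'a' and every 'n'.
puts "banana".count("an")                # => 5

# The regexp loop counts match start positions, so overlaps are included
# ("ana" starts at index 1 and at index 3).
puts count_overlapping("banana", /ana/)  # => 2
```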
Updated by Eregon (Benoit Daloze) over 7 years ago
Should it behave the same as str.scan(regexp).size?
I think the default should be no overlap, and increment the position by the length of the match.
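A minimal sketch of that non-overlapping behaviour, assuming the position is advanced to the end of each match (and by one extra character after an empty match, so the loop always terminates). The name count_non_overlapping is hypothetical and not part of any proposal in this ticket.

```ruby
# Hypothetical helper: count matches without overlap by restarting the
# search at the end of the previous match.
def count_non_overlapping(str, regexp)
  count = 0
  pos = 0
  while pos <= str.length && (m = regexp.match(str, pos))
    count += 1
    # Continue after the match; step one character further if it was empty.
    pos = m.end(0) > m.begin(0) ? m.end(0) : m.end(0) + 1
  end
  count
end

puts count_non_overlapping("aaaa", /aa/)     # => 2
puts count_non_overlapping("banana", /ana/)  # => 1
```

For these inputs the result agrees with str.scan(regexp).size; the sketch is not claimed to match scan for every possible pattern.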
Updated by duerst (Martin Dürst) over 7 years ago
Eregon (Benoit Daloze) wrote:
I think the default should be no overlap, and increment the position by the length of the match.
That would be fine by me, too.
Updated by duerst (Martin Dürst) about 7 years ago
Python allows counting substrings, as follows:
str.count(sub[, start[, end]])
Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
Updated by duerst (Martin Dürst) about 6 years ago
- Related to Feature #12698: Method to delete a substring by regex match added
Updated by shan (Shannon Skipper) about 2 years ago
I'd love to have this feature. A str.count(regexp) is something I see folk trying fairly often. A str.count(regexp) also avoids the intermediary Array of str.scan(regexp).size or the back bending with str.enum_for(:scan, regexp).count.
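For comparison, the existing spellings can be put side by side. This is only an illustration with a made-up sample string; the block form of scan is included as a third way to count without collecting the matches.

```ruby
str = "the cat sat on the mat"
regexp = /at/

# Builds an Array of all matches, then asks for its size.
via_array = str.scan(regexp).size

# Wraps scan in an Enumerator and counts the yielded matches.
via_enum = str.enum_for(:scan, regexp).count

# scan with a block yields each match instead of collecting them.
via_block = 0
str.scan(regexp) { via_block += 1 }

puts [via_array, via_enum, via_block].inspect  # => [3, 3, 3]
```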
Updated by matz (Yukihiro Matsumoto) about 2 years ago
If str.count(re) works as str.scan(re).size (besides efficiency), it's acceptable. But if someone needs overlapping, they need to explain their use-case.
Matz.
Updated by sawa (Tsuyoshi Sawada) about 2 years ago
Overlapping can be realized by putting the original regexp within a look-ahead.
s = "abcdefghij"
re = /.{3}/
Non-overlapping count:
s.scan(re).count # => 3
s.count(re) # => Expect 3
Overlapping count:
s.scan(/(?=#{re})/).count # => 8
s.count(/(?=#{re})/) # => Expect 8
So I do not think there is any need to particularly implement overlapping as a feature of this method.
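The look-ahead trick can be checked with scan today. The sketch below wraps it in a small helper, named overlapping here purely for illustration, and uses a pattern whose matches really do overlap.

```ruby
# Hypothetical helper: wrap a regexp in a zero-width look-ahead so that
# every start position where it matches is found.
def overlapping(regexp)
  /(?=#{regexp})/
end

s = "banana"
re = /ana/

puts s.scan(re).size               # => 1  (the second "ana" overlaps the first)
puts s.scan(overlapping(re)).size  # => 2  ("ana" starts at index 1 and 3)
```

Because the look-ahead consumes no characters, this counts start positions, so at most one match is counted per position.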