Feature #6802
open
String#scan should have equivalent yielding MatchData
Added by prijutme4ty (Ilya Vorontsov) over 12 years ago.
Updated almost 7 years ago.
Description
Ruby should have a method that returns not an array of arrays but MatchData objects. This would help when working with named groups:
pattern = /x: (?<x>\d+) y:(?<y>\d+)/
polygon = []
text.scan_for_pattern(pattern){|m| polygon << Point.new(m[:x], m[:y]) }
To avoid breaking existing code, we need a unique name. Ideas? Maybe #each_match.
Simple implementation:
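class String
  # Yields a MatchData for each successive match of pattern in self.
  def each_match(pattern)
    return enum_for(:each_match, pattern) unless block_given?
    pos = 0
    while pos <= length && (m = match(pattern, pos))
      yield m
      # Step past a zero-length match so the loop always advances.
      pos = m.end(0) > m.begin(0) ? m.end(0) : m.end(0) + 1
    end
  end
end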
You can use String#scan with the block form and $~ (as well as other Regexp-related globals) for this:
> text="x:1 y:12 ; x:33 y:2"
> text.scan(/x:(?<x>\d+) y:(?<y>\d+)/) { p [$~[:x],$~[:y]] }
["1", "12"]
["33", "2"]
Please check your Regexp and give an example of text next time.
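For reference, here is a minimal sketch of how the polygon example from the description could be written with the block form of String#scan and $~. Point is not defined anywhere in this thread, so the Struct below is only an assumed stand-in, and the sample text is the one from the comment above.

```ruby
# Assumed stand-in for the Point class used in the description.
Point = Struct.new(:x, :y)

text = "x:1 y:12 ; x:33 y:2"
pattern = /x:(?<x>\d+) y:(?<y>\d+)/

polygon = []
# Inside the block, $~ holds the MatchData for the current match,
# so named groups can be read by symbol.
text.scan(pattern) { polygon << Point.new($~[:x].to_i, $~[:y].to_i) }
```

After this runs, polygon holds one Point per match, in the order the matches appear in text.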
Thank you for the solution! I always forget about the regexp global vars. Though I suggest that using a special method here is clearer. So what'd you say about String#each_match and Regexp#each_match?
Yes, implementation is as simple as
class String
  def each_match(pat)
    scan(pat){ yield $~ }
  end
end
and similar for Regexp.
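The Regexp counterpart is not spelled out in this thread, so the following is only a sketch of what it might look like. Regexp#each_match is a hypothetical method (it is not part of core Ruby), and returning an Enumerator when no block is given is an assumption borrowed from the description's implementation.

```ruby
class Regexp
  # Hypothetical method, not part of core Ruby: yields a MatchData
  # for each match of this pattern in str.
  def each_match(str)
    return enum_for(:each_match, str) unless block_given?
    # String#scan sets $~ before each block call; pass it on to the caller.
    str.scan(self) { yield $~ }
    self
  end
end

pattern = /x:(?<x>\d+) y:(?<y>\d+)/
pairs = pattern.each_match("x:1 y:12 ; x:33 y:2").map { |m| [m[:x], m[:y]] }
```

Without a block the call returns an Enumerator, so it chains with map and the other Enumerable methods, as in the last line.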
Eregon (Benoit Daloze) wrote:
You can use String#scan with the block form and $~ (as well as other Regexp-related globals) for this:
> text="x:1 y:12 ; x:33 y:2"
> text.scan(/x:(?<x>\d+) y:(?<y>\d+)/) { p [$~[:x],$~[:y]] }
["1", "12"]
["33", "2"]
Please check your Regexp and give an example of text next time.
+1 I have definitely used this before (as Facets' #mscan).
prijutme4ty (Ilya Vorontsov) wrote:
Though I suggest that using a special method here is clearer.
So what'd you say about String#each_match and Regexp#each_match?
I did indeed somewhat expect String#scan to yield a MatchData object, instead of $~.captures.
I'm in favor of String#each_match: it might be a nice addition and the name is clear. However, the naming differs from the usual regexp methods on String, and it might not be worth adding a method (I agree $~ is not the prettiest thing around).
I think Regexp#each_match does not convey well what it does though.
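To illustrate the difference discussed here, this small sketch records both what String#scan passes to its block for a pattern with groups (the captures) and what $~ holds at that moment (the MatchData). The variable names are only illustrative.

```ruby
text = "x:1 y:12 ; x:33 y:2"
pattern = /x:(?<x>\d+) y:(?<y>\d+)/

yielded = []  # what scan passes to the block
matches = []  # what $~ holds inside the block
text.scan(pattern) do |captures|
  yielded << captures
  matches << $~
end

# yielded contains plain arrays of strings; the group names are only
# available through the MatchData objects collected in matches.
xs = matches.map { |m| m[:x] }
```

The arrays in yielded are the same as calling captures on the corresponding MatchData, which is why named access requires $~.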
+1 to have a method to return MatchData.
This is related to (or duplicate of) #5749 and #5606.
Even with such a simple implementation, I think we should establish a standard name and specification.
- Status changed from Open to Assigned
- Assignee set to matz (Yukihiro Matsumoto)
- Target version set to 2.6
- Target version deleted (2.6)
- Related to Feature #12745: String#(g)sub(!) should pass a MatchData to the block, not a String added