Bug #14500

closed

Missing Regexp documentation and clarification on behavior of \K for edge case

Added by Sundeep (Sundeep Agarwal) about 6 years ago. Updated over 2 years ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:85707]

Description

Capturing section (https://ruby-doc.org/core-2.5.0/Regexp.html#class-Regexp-label-Capturing)

  • suggestion to add that numbered capturing groups are limited to 9

Anchors section (https://ruby-doc.org/core-2.5.0/Regexp.html#class-Regexp-label-Anchors)

  • suggestion to add documentation on \K
  • need clarification on whether the behavior shown below is expected; if so, mention it when adding the documentation
$ echo 'aaa' | ruby -pe 'gsub(/a\K/, ":")'
a:aa:

$ # what I expected
$ echo 'aaa' | ruby -pe 'gsub(/(a)/, "\\1:")'
a:a:a:

Updated by znz (Kazuhiro NISHIYAMA) about 6 years ago

Sundeep (Sundeep Agarwal) wrote:

  • suggestion to add that numbered capturing groups are limited to 9

I don't think so.
Why did you think so?

irb -r irb/completion --simple-prompt
>> /()()()()()()()()()(a)/ =~ "a"
=> 0
>> $10
=> "a"

Updated by Sundeep (Sundeep Agarwal) about 6 years ago

Oh, I didn't check with $10. I had only tried with a backreference in the replacement string. Any idea how to use \10?

$ echo 'abcdefghij' | ruby -pe 'sub(/(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)/, ":\\9:")'
:i:
$ echo 'abcdefghij' | ruby -pe 'sub(/(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)/, ":\\10:")'
:a0:
$ echo 'abcdefghij' | ruby -pe 'sub(/(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)/){":#{$10}:"}'
:j:
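
The three results above can be reproduced without the shell. This is a minimal sketch rather than an official recommendation, and the method names are hypothetical, chosen only for illustration. It shows the replacement string "\\10" being read as \1 followed by a literal 0, and two ways to reach the tenth group anyway: the block form, and a named group referenced with \k<name>.

```ruby
# Hypothetical helper names; only String#sub and Regexp.last_match are real API.
TEN_GROUPS = /(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)/

# In a replacement string, "\\10" is parsed as \1 followed by the character "0".
def tenth_via_string(str)
  str.sub(TEN_GROUPS, ":\\10:")
end

# Block form: the MatchData is available, so group 10 can be read directly.
def tenth_via_block(str)
  str.sub(TEN_GROUPS) { ":#{Regexp.last_match(10)}:" }
end

# Named group: \k<name> is accepted in replacement strings.
def tenth_via_name(str)
  str.sub(/.{9}(?<tenth>.)/, ':\k<tenth>:')
end

puts tenth_via_string("abcdefghij")  # :a0:
puts tenth_via_block("abcdefghij")   # :j:
puts tenth_via_name("abcdefghij")    # :j:
```

The named-group variant drops the nine numbered groups on purpose, since Ruby does not allow numbered and named captures to be mixed in one pattern.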

Updated by hsbt (Hiroshi SHIBATA) over 2 years ago

  • Tracker changed from Misc to Bug
  • Backport set to 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

Sundeep (Sundeep Agarwal) wrote:

Anchors section (https://ruby-doc.org/core-2.5.0/Regexp.html#class-Regexp-label-Anchors)

  • suggestion to add documentation on \K
  • need clarification on whether the behavior shown below is expected; if so, mention it when adding the documentation
$ echo 'aaa' | ruby -pe 'gsub(/a\K/, ":")'
a:aa:

$ # what I expected
$ echo 'aaa' | ruby -pe 'gsub(/(a)/, "\\1:")'
a:a:a:

The \K behavior at the end of a regexp is a bug that has already been filed upstream in Onigmo: https://github.com/k-takata/Onigmo/issues/152

I agree about the other documentation issues, and I'll push a commit shortly to address them.
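
Until the upstream issue is resolved, a lookbehind may be a workable substitute for a trailing \K. The sketch below assumes the goal is simply "insert a colon after every a"; the method names are hypothetical. It also shows \K in the middle of a pattern, where it keeps the text before it out of the reported match.

```ruby
# Hypothetical helper names, for illustration only.

# Zero-width lookbehind: matches the empty string after each "a".
def colon_after_each_a(str)
  str.gsub(/(?<=a)/, ":")
end

# Capture-group form from the original report.
def colon_after_each_a_capture(str)
  str.gsub(/(a)/, "\\1:")
end

# \K mid-pattern: "foo" must match, but only "bar" is replaced.
def replace_bar_after_foo(str)
  str.sub(/foo\Kbar/, "X")
end

puts colon_after_each_a("aaa")            # a:a:a:
puts colon_after_each_a_capture("aaa")    # a:a:a:
puts replace_bar_after_foo("foobar")      # fooX
```

The sketch deliberately does not call gsub(/a\K/, ":") itself, because its output depends on whether the Ruby in use includes a fix for the Onigmo issue.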

Updated by jeremyevans (Jeremy Evans) over 2 years ago

  • Status changed from Open to Closed

Applied in changeset git|4fc9ddd7b6af54abf88d702c2e11e97ca7750ce3.


Update Capturing and Anchors sections of regexp documentation

Document that only first 9 numbered capture groups can use the \n
backreference syntax. Document \0 backreference. Document \K anchor.

Fixes [Bug #14500]
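
As a small illustration of the \0 backreference mentioned in the commit message: in a replacement string, \0 stands for the entire match. The method name below is hypothetical and the sketch is not taken from the committed documentation.

```ruby
# Hypothetical helper name, for illustration only.
# '\0' in a replacement string refers to the whole matched text.
def bracket_digits(str)
  str.gsub(/\d+/, '[\0]')
end

puts bracket_digits("a12b345")  # a[12]b[345]
```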
