Bug #13352

closed

RegExp infinite looping

Added by jumnichi_kose (小瀬 淳一) about 7 years ago. Updated about 7 years ago.

Status:
Rejected
Assignee:
-
Target version:
-
ruby -v:
2.3.3p222 (2016-11-21 revision 56859) [x64-mingw32], ruby 2.2.4p230 (2015-12-16 revision 53155) [i386-mingw32]
[ruby-dev:50029]

Description

Ruby enters an infinite loop when the following three lines are run.

r = /(?=\W)\W?(?:(?:(?:\/\*<)?[\'\"]?([\w_\$]+)[\'\"]?(?:>\*\/\s+this)?|(this))\.(\$scope|json)|([\'\"](?:\$scope|json)[\'\"]))((\.)(inputObjects)|(\.)(spreadSheets)(?:\.[\'\"]?([^\.\'\"\|\s,]+)[\'\"]?|\[[\'\"]?((?:[^\'\"\]\[]*|\[[^\]]*\])*)[\'\"]?\])(\.)(model))\s*(?:\.[\'\"]?([^\.\'\"\|\s,]+)[\'\"]?|\[[\'\"]?((?:[^\'\"\]\[]*|\[[^\]]*\])*)[\'\"]?\])/
txt = ".model |,| 'anController'.$scope.inputObjects['fieldName .../* !! comment ...*/"
txt.scan r

Node.js also falls into an infinite loop.

var r = /(?=\W)\W?(?:(?:(?:\/\*<)?[\'\"]?([\w_\$]+)[\'\"]?(?:>\*\/\s+this)?|(this))\.(\$scope|json)|([\'\"](?:\$scope|json)[\'\"]))((\.)(inputObjects)|(\.)(spreadSheets)(?:\.[\'\"]?([^\.\'\"\|\s,]+)[\'\"]?|\[[\'\"]?((?:[^\'\"\]\[]*|\[[^\]]*\])*)[\'\"]?\])(\.)(model))\s*(?:\.[\'\"]?([^\.\'\"\|\s,]+)[\'\"]?|\[[\'\"]?((?:[^\'\"\]\[]*|\[[^\]]*\])*)[\'\"]?\])/
var txt = ".model |,| 'anController'.$scope.inputObjects['fieldName .../* !! comment ...*/"

var s = r.exec(txt)

console.log(s)


Updated by jumnichi_kose (小瀬 淳一) about 7 years ago

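This may be a problem common to regular expression libraries, but I hope you can use your influence to get it improved.
At the very least, I would like rexEx.compile to detect that the regular expression it is given is invalid.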
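
As a rough illustration of what such a check might look like, here is a minimal sketch in Ruby. The helper `suspicious_nested_quantifier?` and the constant `NESTED_QUANTIFIER` are hypothetical names, not part of Ruby or any library. It only does a crude textual scan of `Regexp#source` for a starred or plussed alternative inside a group that is itself starred or plussed, so it will miss many risky patterns and may flag harmless ones.

```ruby
# Hypothetical heuristic, not a Ruby API: looks for "x*|...)*" style nesting,
# i.e. a quantified alternative inside a group that is quantified again.
NESTED_QUANTIFIER = /[*+]\|(?:[^()\\]|\\.)*\)[*+]/

def suspicious_nested_quantifier?(regexp)
  # Regexp#source returns the pattern text without the surrounding slashes.
  !!(regexp.source =~ NESTED_QUANTIFIER)
end

# Reduced fragments of the bracketed part of the reported pattern
# (quote characters omitted for brevity; this reduction is an assumption).
risky = /\[((?:[^\]\[]*|\[[^\]]*\])*)\]/
safer = /\[((?:[^\]\[]|\[[^\]]*\])*)\]/

puts suspicious_nested_quantifier?(risky)
puts suspicious_nested_quantifier?(safer)
```

A textual scan like this cannot understand character classes that contain unescaped parentheses, lookarounds, or backreferences, which is part of why a reliable built-in check is hard.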

Updated by duerst (Martin Dürst) about 7 years ago

I haven't yet had time to analyze the regular expression in detail. Are you sure this is really an infinite loop, and not just a case of exponentially slow processing (e.g. similar to stackoverflow.com/questions/16580665/node-js-regex-engine-fails-on-large-input)?
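
One way to tell the two cases apart is to time a failing match on inputs of growing length. The sketch below is an assumption-laden illustration: `FRAGMENT` is a reduced piece of the reported pattern rather than the full regex, and `time_failed_match` is a hypothetical helper. If the time roughly doubles with each added character, that points to exponential backtracking rather than a true infinite loop. Some newer Ruby releases include backtracking optimizations, so the timings may stay flat there.

```ruby
# Reduced fragment of the reported pattern (an assumption, not the full regex).
FRAGMENT = /\[(?:[^\]\[]*|\[[^\]]*\])*\]/

# Times a match attempt that must fail because the text has no closing "]".
def time_failed_match(n)
  text = "[" + "a" * n
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = FRAGMENT.match(text)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [result, elapsed]
end

# Small sizes only, so the script finishes quickly even without optimizations.
[4, 8, 12, 16].each do |n|
  result, seconds = time_failed_match(n)
  puts format("n=%2d matched=%s seconds=%.6f", n, !result.nil?, seconds)
end
```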

Updated by naruse (Yui NARUSE) about 7 years ago

  • Status changed from Open to Rejected

Please remove the * at character 323.

(?=\W)\W?(?:(?:(?:\/\*<)?[\'\"]?([\w_\$]+)[\'\"]?(?:>\*\/\s+this)?|(this))\.(\$scope|json)|([\'\"](?:\$scope|json)[\'\"]))((\.)(inputObjects)|(\.)(spreadSheets)(?:\.[\'\"]?([^\.\'\"\|\s,]+)[\'\"]?|\[[\'\"]?((?:[^\'\"\]\[]*|\[[^\]]*\])*)[\'\"]?\])(\.)(model))\s*(?:\.[\'\"]?([^\.\'\"\|\s,]+)[\'\"]?|\[[\'\"]?((?:[^\'\"\]\[]|\[[^\]]*\])*)[\'\"]?\])

To analyze a regular expression you can use, for example, the following site:
https://regex101.com/

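Detecting this kind of regular expression involves trade-offs, so I think it is difficult for either Ruby or node to do by default,
but with node you can avoid this problem by using, for example, https://github.com/uhop/node-re2.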
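
The effect of dropping the inner `*` can be checked on a reduced fragment. The sketch below assumes that the bracketed part of the pattern, with the quote characters omitted, is representative; `ORIGINAL_FRAGMENT`, `FIXED_FRAGMENT`, and `bracket_body` are hypothetical names. On input that does match, both variants should capture the same text, which suggests the fix changes how much backtracking is possible rather than what the pattern accepts.

```ruby
# Reduced fragments (an assumption): with and without the inner "*".
ORIGINAL_FRAGMENT = /\[((?:[^\]\[]*|\[[^\]]*\])*)\]/
FIXED_FRAGMENT    = /\[((?:[^\]\[]|\[[^\]]*\])*)\]/

# Returns the text captured between the outer brackets, or nil if no match.
def bracket_body(pattern, text)
  match = pattern.match(text)
  match && match[1]
end

sample = "[a[b]c]"
puts bracket_body(ORIGINAL_FRAGMENT, sample).inspect
puts bracket_body(FIXED_FRAGMENT, sample).inspect

# Without a closing "]" the fixed fragment fails without nested repetition.
puts bracket_body(FIXED_FRAGMENT, "[abc").inspect
```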

Updated by jumnichi_kose (小瀬 淳一) about 7 years ago

Thanks to @naruse (Yui NARUSE).

https://regex101.com/

Your workaround is fine.

I accept that something in my regular expression was bad.
The RegExp contains "catastrophic backtracking".

A similar regular expression does not fall into an infinite loop.

r = /\[((?:[^none of\]]*|other)*)\]/
a="[data]"
a[0...(-2)].scan r # => []
a.scan r # =>  [["data"]]

Another workaround was to avoid "none of" (negated character classes, [^...]).

r = /(?=\W)\W?(?:(?:(?:\/\*<)?[\'\"]?([\w_\$]+)[\'\"]?(?:>\*\/\s+this)?|(this))\.(\$scope|json)|([\'\"](?:\$scope|json)[\'\"]))((\.)(inputObjects)|(\.)(spreadSheets)(?:\.[\'\"]?([\w_\$]+)[\'\"]?|\[[\'\"]?((?:[\w\s(\.)+\-\/_|!]*|\[[\w\s(\.)+\-\/_|!]\])*)[\'\"]?\])(\.)(model))\s*(?:\.[\'\"]?([\w_\$]+)[\'\"]?|\[[\'\"]?((?:[\w\s(\.)+\-\/_|!]+|\[[\w\s(\.)+\-\/_|!]*\]|\/\*[\w\s(\.)+\-\/_|!]+\*\/)+)[\'\"]?\])/

txt = ".model |,| 'anController'.$scope.inputObjects['fieldName .../* !! comment ...*/]"

txt.scan r # =>  [["anController", nil, "$scope", nil, ".inputObjects", ".", "inputObjects", nil, nil, nil, nil, nil, nil, nil, "fieldName .../* !! comment ...*/"]]

txt[0..(-2)].scan r # => []

I was puzzled by the strange behavior, but thanks to the "Catastrophic backtracking" warning shown by the site you pointed me to, I now also have a way to investigate what went wrong.
Thank you very much.

Updated by nobu (Nobuyoshi Nakada) about 7 years ago

  • Description updated (diff)