Backport #2644

closed

memory over-allocation with regexp

Added by ghazel (Greg Hazel) about 14 years ago. Updated almost 5 years ago.

Status: Closed
Assignee: -
[ruby-core:27791]

Description

=begin
With a simple regular expression, Ruby allocates far too much memory and can overflow the stack.

Code:
p 1
s = "2" + (" " * 84149170)
p 2
s.match(/(\d) (.*)/)
p 3

Output:
1
2
hmm.rb:4:in `match': Stack overflow in regexp matcher: /(\d) (.*)/ (RegexpError)
from hmm.rb:4

The stack overflow is not the worst of it. Ruby is actually trying to allocate very large amounts of memory. Here is the output under REE, which prints a message whenever malloc requests a large block:

1
2
tcmalloc: large alloc 1090519040 bytes == 0x49867000 @
tcmalloc: large alloc 2181038080 bytes == 0x8aa67000 @
tcmalloc: large alloc 18446744072140881920 bytes == (nil) @
tcmalloc: large alloc 4362076160 bytes == (nil) @
hmm.rb:4:in `match': Stack overflow in regexp matcher: /(\d) (.*)/ (RegexpError)
from hmm.rb:4

External observation of the processes shows that this memory over-allocation also occurs on normal builds of 1.8.6, 1.8.7 and even 1.9.1.

(Before you say "this is just a problem with regexp in general!": I tested the same thing in Python and Perl. Both work satisfactorily with even larger strings.)
=end
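For reference, the request sizes in the tcmalloc log in the description can be checked with a little arithmetic. The sketch below only uses the four numbers copied from that log; the variable names are made up for illustration. The pattern it shows (requests of 1x, 2x and 4x a 1040 MiB base, plus one value whose low 32 bits equal 2.5x the base) looks like a growing buffer whose size wrapped around a signed 32-bit integer at one point, but that reading is an inference from the numbers, not something confirmed from the regexp matcher source.

```ruby
# Sizes copied verbatim from the tcmalloc log, in the order they were printed.
sizes = [1090519040, 2181038080, 18446744072140881920, 4362076160]

base = sizes[0]
mib  = 1024 * 1024

# The first request is a whole number of MiB.
base_mib = base / mib

# The second and fourth requests are exact multiples of the first.
second_ratio = sizes[1] / base
fourth_ratio = sizes[3] / base

# The third request sits just below 2**64. Read as a signed 64-bit value
# it is negative, and its low 32 bits are exactly 2.5 times the base.
as_signed64 = sizes[2] - 2**64
low32       = sizes[2] & 0xFFFFFFFF
as_signed32 = low32 - 2**32

puts "base request:          #{base_mib} MiB"
puts "second / base:         #{second_ratio}"
puts "fourth / base:         #{fourth_ratio}"
puts "third as signed 64:    #{as_signed64}"
puts "third, low 32 bits:    #{low32}"
puts "low 32 bits as signed: #{as_signed32}"
```

If that reading is right, the huge third value is a negative 32-bit size that was sign-extended before being handed to malloc, which would explain why that allocation returns nil.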
