Backport #2644

Updated by jeremyevans0 (Jeremy Evans) almost 5 years ago

=begin 
  
  With a simple regular expression, Ruby allocates far too much memory and can fail with a stack overflow error. 
 
  Code: 
  p 1 
  s = "2" + (" " * 84149170) 
  p 2 
  s.match(/(\d) (.*)/) 
  p 3 
 
  Output: 
  1 
  2 
  hmm.rb:4:in `match': Stack overflow in regexp matcher: /(\d) (.*)/ (RegexpError) 
          from hmm.rb:4 
 
  The stack overflow is not the worst of it: the matcher is actually trying to allocate very large amounts of memory. Here is the output under REE (Ruby Enterprise Edition), whose tcmalloc prints a message whenever a large allocation is requested: 
 
  1 
  2 
  tcmalloc: large alloc 1090519040 bytes == 0x49867000 @ 
  tcmalloc: large alloc 2181038080 bytes == 0x8aa67000 @ 
  tcmalloc: large alloc 18446744072140881920 bytes == (nil) @ 
  tcmalloc: large alloc 4362076160 bytes == (nil) @ 
  hmm.rb:4:in `match': Stack overflow in regexp matcher: /(\d) (.*)/ (RegexpError) 
          from hmm.rb:4 
 
  External observation of the processes shows that this memory over-allocation also occurs on normal builds of 1.8.6, 1.8.7 and even 1.9.1. 
 
  (Before you say "this is just a problem with regexp in general!", I tested the same thing on python and perl. Both work satisfactorily with even larger strings.) 
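
  To narrow down the input size at which the failure starts, the reproduction above could be wrapped in a small probe. This is only a sketch: `probe_match` is a hypothetical helper name, not part of the original report, and it assumes the failure surfaces as a `RegexpError`, as in the output shown above. The sizes in the loop are arbitrary small values chosen for illustration.

```ruby
# Hypothetical helper (not from the original report): runs the reported
# pattern against "2" followed by `padding` spaces and reports the outcome.
def probe_match(padding)
  s = "2" + (" " * padding)
  m = s.match(/(\d) (.*)/)
  # Group 2 captures everything after the first space.
  m ? [:ok, m[2].length] : [:no_match, nil]
rescue RegexpError => e
  # The report shows the failure surfacing as RegexpError.
  [:regexp_error, e.message]
end

# Try a couple of small sizes; raise toward the reported 84149170 to
# look for the point where the result changes.
[1_000, 1_000_000].each do |n|
  p [n, probe_match(n)]
end
```

  Watching the process's memory externally while the sizes grow is still needed to see the over-allocation itself; the probe only reports whether the match succeeded or raised.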
 
 =end 
 
