Bug #15916

Updated by mltsy (Joe Marty) over 5 years ago

When interpolating a string inside a Regexp literal, if the string contains a multibyte character loaded from a file (not sure if this covers all the cases, but this is what triggers it for me), Ruby leaks memory. 

 The code below reproduces the problem, while outputting the process memory usage as it rises (get_process_mem gem is required). 

 Ways to avoid the memory leak (although I don't know why) include: 
 1. Using the string literal to define `PATTERN` directly (Not loading it from a file) 
 2. Using `Regexp.new` instead of a literal interpolation (`/#{...}/`) 
 3. Shortening the string to just a few characters (maybe small enough to fit inside a single RVALUE?) 

 ``` ruby 
 require 'get_process_mem' 

 str = "String that doesn't fit into a single RVALUE, with a multibyte char:" + 160.chr(Encoding::UTF_8) 
 File.write('weirdstring.txt', str) 

 class Leak 
   PATTERN = File.read("weirdstring.txt").freeze 

   def test 
     100_000.times { /#{PATTERN}/i } 
   end 
 end

 t = Leak.new

 loop do
   print "Running... "

   t.test

   puts " process mem: #{GetProcessMem.new.mb.to_i}MB" 
 end 

 ``` 

 Expected Result: 
 Constant memory usage (avoiding the leak produces constant memory usage between 10 and 20MB)

 Actual Result: 
 Continual memory growth (it only takes 60 seconds or so to consume 500MB)
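
 For comparison, here is a minimal sketch of workaround 2 from the list above, assuming the string is still loaded from a file. The `NoLeak` class, its `build` method, and the temp file are hypothetical names used only for illustration. The sketch does not measure memory itself; that `Regexp.new` avoids the growth is the observation reported above, not something this snippet proves.

```ruby
require 'tempfile'

# Same string as in the reproduction: too long for a single RVALUE,
# ending in a multibyte character (U+00A0).
str = "String that doesn't fit into a single RVALUE, with a multibyte char:" + 160.chr(Encoding::UTF_8)

# Hypothetical temp file standing in for weirdstring.txt.
file = Tempfile.new('weirdstring')
file.write(str)
file.close

class NoLeak
  def initialize(path)
    # Read explicitly as UTF-8 so the result does not depend on the locale.
    @pattern = File.read(path, encoding: 'UTF-8').freeze
  end

  # Workaround 2: build the regexp with Regexp.new instead of /#{...}/i.
  def build
    Regexp.new(@pattern, Regexp::IGNORECASE)
  end

  def test(iterations = 100_000)
    iterations.times { build }
  end
end

checker = NoLeak.new(file.path)
checker.test(1_000)
```

 To compare against the leaking version, the body of `NoLeak#test` can be dropped into the loop from the reproduction script in place of `t.test`, with the same `GetProcessMem` output.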
