Bug #20998
openrb_str_locktmp() changes flags of frozen strings and string literals
Description
# frozen_string_literal: true
# BOILERPLATE START
require 'tmpdir'
require 'rbconfig'
def inline_c_extension(c_code)
Dir.mktmpdir('inline_c_extension') do |dir|
File.write("#{dir}/cext.c", c_code)
File.write("#{dir}/extconf.rb", <<~RUBY)
require 'mkmf'
create_makefile('cext')
RUBY
out = IO.popen([RbConfig.ruby, 'extconf.rb'], chdir: dir, &:read)
raise "ruby extconf.rb failed: #{$?.inspect}\n#{out}" unless $?.success?
out = IO.popen(['make'], chdir: dir, &:read)
raise "make failed: #{$?.inspect}\n#{out}" unless $?.success?
require "#{dir}/cext.#{RbConfig::CONFIG['DLEXT']}"
end
end
inline_c_extension <<~C
#include "ruby.h"
static VALUE foo(VALUE self, VALUE str) {
rb_str_locktmp(str);
return str;
}
void Init_cext(void) {
VALUE c = rb_define_class("Foo", rb_cObject);
rb_define_singleton_method(c, "foo", foo, 1);
}
C
# BOILERPLATE END
a = "str"
Foo.foo(a)
# imagine a million lines of code in between
b = "str"
b << "."
# can't modify string; temporarily locked (RuntimeError)
# What? Who "locked" that immutable frozen string literal?
# It should be: can't modify frozen String: "str" (FrozenError)
Same problem with:
Foo.foo("abc")
# imagine a million lines of code in between
Foo.foo("abc") # temporal locking already locked string (RuntimeError)
Related: https://github.com/oracle/truffleruby/issues/3752
It seems a clear bug to mutate a frozen string (with visible side effects), even more so for shared frozen string literals.
I think rb_str_locktmp() should raise (a FrozenError, as it's effectively attempting to mutate it, same as calling rb_str_modify()
) if called on a frozen string, because it makes little sense, I think "locking a string" only makes sense for mutable strings.
The alternative would be to noop on frozen strings for rb_str_locktmp() and rb_str_unlocktmp(), I think that's susprising, and potentially hiding more bugs, e.g. if one by mistake tries to mutate the RSTRING_PTR() or so, and also rb_str_locktmp() wouldn't imply the flag is set after it returns.
Any attempt to mutate a frozen string should fail, so might as well fail early.
Updated by nobu (Nobuyoshi Nakada) 2 days ago
There isn’t a case that locking a string just to avoid being modified or moved, but not to mutate it?
Updated by Eregon (Benoit Daloze) 2 days ago
I'm not aware of any such case, and .freeze
seems much better for that use case.
Updated by byroot (Jean Boussier) 2 days ago
Actually yes, I think it's not an uncommon use case?
e.g. if you need to write a given string into a socket or something like that, you may want to lock it to ensure a concurrent thread won't modify it.
I think zlic.c
uses it this way?
Updated by Eregon (Benoit Daloze) 1 day ago · Edited
@byroot (Jean Boussier) but that's mutating the string, @nobu (Nobuyoshi Nakada) is asking for a case not mutating the string, which I think there is none.
EDIT: I read too fast, "write a given string into a socket" does not imply mutating the string.
Updated by Eregon (Benoit Daloze) 1 day ago
https://github.com/ruby/ruby/blob/975461bf885b8fb791cf048ad39c79924d28e4e9/ext/zlib/zlib.c#L1056 but I'm not quite sure how that String is used.
It seems clearly wrong that two threads using Zlib with a frozen String literal (which happens to have the same contents/characters for both threads) would fail with temporal locking already locked string (RuntimeError)
though.
So what should we do then when rb_str_locktmp() is called on a frozen string?
Updated by byroot (Jean Boussier) about 20 hours ago
It seems clearly wrong that two threads using Zlib with a frozen String literal (which happens to have the same contents/characters for both threads) would fail with temporal locking already locked string (RuntimeError) though.
That I totally agree.