Project

General

Profile

Actions

Bug #20998

closed

rb_str_locktmp() changes flags of frozen strings and string literals

Bug #20998: rb_str_locktmp() changes flags of frozen strings and string literals

Added by Eregon (Benoit Daloze) 10 months ago. Updated 4 months ago.

Status:
Closed
Target version:
ruby -v:
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-linux]
[ruby-core:120465]

Description

# frozen_string_literal: true
# BOILERPLATE START

require 'tmpdir'
require 'rbconfig'

def inline_c_extension(c_code)
  Dir.mktmpdir('inline_c_extension') do |dir|
    File.write("#{dir}/cext.c", c_code)
    File.write("#{dir}/extconf.rb", <<~RUBY)
    require 'mkmf'
    create_makefile('cext')
    RUBY

    out = IO.popen([RbConfig.ruby, 'extconf.rb'], chdir: dir, &:read)
    raise "ruby extconf.rb failed: #{$?.inspect}\n#{out}" unless $?.success?

    out = IO.popen(['make'], chdir: dir, &:read)
    raise "make failed: #{$?.inspect}\n#{out}" unless $?.success?

    require "#{dir}/cext.#{RbConfig::CONFIG['DLEXT']}"
  end
end

inline_c_extension <<~C
#include "ruby.h"

static VALUE foo(VALUE self, VALUE str) {
  rb_str_locktmp(str);
  return str;
}

void Init_cext(void) {
  VALUE c = rb_define_class("Foo", rb_cObject);
  rb_define_singleton_method(c, "foo", foo, 1);
}
C

# BOILERPLATE END

a = "str"
Foo.foo(a)

# imagine a million lines of code in between

b = "str"

b << "."
# can't modify string; temporarily locked (RuntimeError)
# What? Who "locked" that immutable frozen string literal?
# It should be: can't modify frozen String: "str" (FrozenError)

Same problem with:

Foo.foo("abc")

# imagine a million lines of code in between

Foo.foo("abc") # temporal locking already locked string (RuntimeError)

Related: https://github.com/oracle/truffleruby/issues/3752

It seems a clear bug to mutate a frozen string (with visible side effects), even more so for shared frozen string literals.

I think rb_str_locktmp() should raise (a FrozenError, as it's effectively attempting to mutate it, same as calling rb_str_modify()) if called on a frozen string, because it makes little sense, I think "locking a string" only makes sense for mutable strings.

The alternative would be to noop on frozen strings for rb_str_locktmp() and rb_str_unlocktmp(), I think that's susprising, and potentially hiding more bugs, e.g. if one by mistake tries to mutate the RSTRING_PTR() or so, and also rb_str_locktmp() wouldn't imply the flag is set after it returns.
Any attempt to mutate a frozen string should fail, so might as well fail early.

Updated by Eregon (Benoit Daloze) 10 months ago Actions #1

  • Description updated (diff)

Updated by Eregon (Benoit Daloze) 10 months ago Actions #2

  • Description updated (diff)

Updated by Eregon (Benoit Daloze) 10 months ago Actions #3

  • Description updated (diff)

Updated by Eregon (Benoit Daloze) 10 months ago Actions #4

  • Description updated (diff)

Updated by nobu (Nobuyoshi Nakada) 10 months ago Actions #5 [ruby-core:120493]

There isn’t a case that locking a string just to avoid being modified or moved, but not to mutate it?

Updated by Eregon (Benoit Daloze) 10 months ago Actions #6 [ruby-core:120495]

I'm not aware of any such case, and .freeze seems much better for that use case.

Updated by byroot (Jean Boussier) 10 months ago Actions #7 [ruby-core:120497]

Actually yes, I think it's not an uncommon use case?

e.g. if you need to write a given string into a socket or something like that, you may want to lock it to ensure a concurrent thread won't modify it.

I think zlic.c uses it this way?

Updated by Eregon (Benoit Daloze) 10 months ago · Edited Actions #8 [ruby-core:120499]

@byroot (Jean Boussier) but that's mutating the string, @nobu (Nobuyoshi Nakada) is asking for a case not mutating the string, which I think there is none.

EDIT: I read too fast, "write a given string into a socket" does not imply mutating the string.

Updated by Eregon (Benoit Daloze) 10 months ago Actions #9 [ruby-core:120500]

https://github.com/ruby/ruby/blob/975461bf885b8fb791cf048ad39c79924d28e4e9/ext/zlib/zlib.c#L1056 but I'm not quite sure how that String is used.
It seems clearly wrong that two threads using Zlib with a frozen String literal (which happens to have the same contents/characters for both threads) would fail with temporal locking already locked string (RuntimeError) though.
So what should we do then when rb_str_locktmp() is called on a frozen string?

Updated by byroot (Jean Boussier) 10 months ago Actions #10 [ruby-core:120520]

It seems clearly wrong that two threads using Zlib with a frozen String literal (which happens to have the same contents/characters for both threads) would fail with temporal locking already locked string (RuntimeError) though.

That I totally agree.

Updated by Eregon (Benoit Daloze) 10 months ago Actions #11 [ruby-core:120583]

@nobu (Nobuyoshi Nakada) @byroot (Jean Boussier) So do you think rb_str_locktmp() should noop on frozen strings (since they can't change anyway), or it should raise because the usage seems somewhat dubious?
I'm fine with either, I'd just like a decision so we can do the same in TruffleRuby.
(the current status is broken so that's worse than either fix)

Updated by byroot (Jean Boussier) 10 months ago Actions #12 [ruby-core:120585]

I prefer the noop solution, as it makes it a much easier to use API. But no strong opinion.

Updated by mame (Yusuke Endoh) 9 months ago Actions #13 [ruby-core:120981]

Discussed at the dev meeting. @akr (Akira Tanaka) preferred raising an Exception. The current purpose of rb_str_locktmp is to prevent the RSTRING_PTR from being moved (by realloc, compaction, etc.) when passing the pointer to a C function that blocks and writes to the buffer (typically, read(2) syscall). In other words, rb_str_locktmp should be performed on a String that is about to be written. We couldn't find a use case for rb_str_locktmp on a frozen String. Unless there is a valid use case for that, it would be good to raise an Exception, instead of no-op.

Updated by Eregon (Benoit Daloze) 5 months ago Actions #14 [ruby-core:122142]

mame (Yusuke Endoh) wrote in #note-13:

Unless there is a valid use case for that, it would be good to raise an Exception, instead of no-op.

FYI TruffleRuby implemented FrozenError when rb_str_locktmp() is used on a frozen String in https://github.com/oracle/truffleruby/pull/3867.

It would be great if CRuby can do the same for 3.5.

Updated by Eregon (Benoit Daloze) 4 months ago Actions #15 [ruby-core:122532]

  • Assignee set to Eregon (Benoit Daloze)

Updated by ioquatix (Samuel Williams) 4 months ago Actions #16 [ruby-core:122535]

My usage of rb_str_locktmp isn't about mutation, it's about making sure the pointer doesn't change. Could compacting GC cause embedded frozen strings to be moved?

Updated by Eregon (Benoit Daloze) 4 months ago Actions #17

  • Status changed from Open to Closed

Applied in changeset git|83fb07fb2c97b9922450979fa4a56f43324317a9.


[Bug #20998] Check if the string is frozen in rb_str_locktmp() & rb_str_unlocktmp()

Updated by Eregon (Benoit Daloze) 4 months ago Actions #18 [ruby-core:122541]

ioquatix (Samuel Williams) wrote in #note-16:

My usage of rb_str_locktmp isn't about mutation, it's about making sure the pointer doesn't change. Could compacting GC cause embedded frozen strings to be moved?

rb_str_locktmp() doesn't prevent GC compaction to move the String, more details about that on the PR.

Updated by Eregon (Benoit Daloze) 4 months ago Actions #19

  • Target version set to 3.5

Updated by hanazuki (Kasumi Hanazuki) 4 months ago Actions #20 [ruby-core:122542]

FYI: IO::Buffer is now broken (by a recent change) because the embedded Strings are not pinned, and I'm proposing a fix at #21210

Updated by Eregon (Benoit Daloze) 4 months ago · Edited Actions #21 [ruby-core:122544]

@hanazuki Thanks for the link, I wish I found that earlier. It confirms that it was already broken which is the same conclusion I came to.

hanazuki (Kasumi Hanazuki) wrote in #note-20:

FYI: IO::Buffer is now broken (by a recent change)

Which change do you mean? git|6012145299cfa4ab561360c78710c7f2941a7e9d ?

EDIT: reply from hanazuki somehow not showing here: Yes. the commit "Mark rb_io_buffer_type references declaratively"

Updated by Anonymous 4 months ago Actions #22


Actions

Also available in: PDF Atom