Feature #21039
openRactor.make_shareable breaks block semantics (seeing updated captured variables) of existing blocks
Description
def make_counter
count = 0
nil.instance_exec do
[-> { count }, -> { count += 1 }]
end
end
get, increment = make_counter
reader = Thread.new {
sleep 0.01
loop do
p get.call
sleep 0.1
end
}
writer = Thread.new {
loop do
increment.call
sleep 0.1
end
}
ractor_thread = Thread.new {
sleep 1
Ractor.make_shareable(get)
}
sleep 2
This prints:
1
2
3
4
5
6
7
8
9
10
10
10
10
10
10
10
10
10
10
10
But it should print 1..20, and indeed it does when commenting out the Ractor.make_shareable(get)
.
This shows a given block/Proc instance is concurrently broken by Ractor.make_shareable
, IOW Ractor is breaking fundamental Ruby semantics of blocks and their captured/outer variables or "environment".
It's expected that Ractor.make_shareable
can freeze
objects and that may cause some FrozenError, but here it's not a FrozenError, it's wrong/stale values being read.
I think what should happen instead is that Ractor.make_shareable
should create a new Proc and mutate that.
However, if the Proc is inside some other object and not just directly the argument, that wouldn't work (like Ractor.make_shareable([get])
).
So I think one fix would to be to only accept Procs for Ractor.make_shareable(obj, copy: true)
.
FWIW that currently doesn't allow Procs, it gives <internal:ractor>:828:in 'Ractor.make_shareable': allocator undefined for Proc (TypeError)
.
It makes sense to use copy
here since make_shareable
effectively takes a copy/snapshot of the Proc's environment.
I think the only other way, and I think it would be a far better way would be to not support making Procs shareable with Ractor.make_shareable
.
Instead it could be some new method like isolated { ... }
or Proc.isolated { ... }
or Proc.snapshot_outer_variables { ... }
or so, only accepting a literal block (to avoid mutating/breaking an existing block), and that would snapshot outer variables (or require no outer variables like Ractor.new's block, or maybe even do Ractor.make_shareable(copy: true)
on outer variables) and possibly also set self
since that's anyway needed.
That would make such blocks with different semantics explicit, which would fix the problem of breaking the intention of who wrote that block and whoever read that code, expecting normal Ruby block semantics, which includes seeing updated outer variables.
Related: #21033 https://bugs.ruby-lang.org/issues/18243#note-5
Extracted from https://bugs.ruby-lang.org/issues/21033#note-14
Updated by Eregon (Benoit Daloze) 7 months ago
- ruby -v set to ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-linux]
Updated by Eregon (Benoit Daloze) 7 months ago
- Related to Bug #18243: Ractor.make_shareable does not freeze the receiver of a Proc but allows accessing ivars of it added
Updated by Eregon (Benoit Daloze) 7 months ago
- Related to Feature #21033: Allow lambdas that don't access `self` to be Ractor shareable added
Updated by luke-gru (Luke Gruber) 7 months ago
As far as I know this is intentional behavior, so even though I agree it is confusing I think this is more accurately a feature request instead of a bug.
Updated by Eregon (Benoit Daloze) 7 months ago
I think it's a bug, because it breaks fundamental Ruby block semantics.
No Ractor method or functionality should be able to do that for an existing Proc, even more so when that Proc is called on the main Ractor.
Updated by luke-gru (Luke Gruber) 7 months ago
Okay fair enough, and it's not of much consequence either way whether it's a bug or a feature because your point still stands.
Updated by matz (Yukihiro Matsumoto) 7 months ago
- Tracker changed from Bug to Feature
- ruby -v deleted (
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-linux]) - Backport deleted (
3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN)
- It's intended behavior. We don't call it a bug. If we keep the non-shareable semantics, the basic principle of Ractors would fail (no shared state).
- I understand your feeling to the new behavior. The behavior can be negotiable, but it should follow Ractor principle.
Matz.
Updated by hsbt (Hiroshi SHIBATA) 5 months ago
- Status changed from Open to Assigned
Updated by Eregon (Benoit Daloze) about 1 month ago
I think a good solution here would be:
- raise on
Ractor.make_shareable(proc)
- raise on
Ractor.make_shareable(proc, copy: true)
- allow
Ractor.make_shareable { ... }
but only with a block literal. That way, the fact that block behaves differently by copying its environment is very clear. The method could check that the block doesn't do assignments in the environment as well to avoid surprises. It could also set theself
in the block tonil
, and/or take a keyword argument to set the receiver, or use the block's original self and error if it cannot be made shareable.
Updated by matz (Yukihiro Matsumoto) about 1 month ago
I'd like to have Ractor.shareble_proc
and Ractor.sharable_lambda
. See the agenda from 20250710 developer meeting.
Matz.
Updated by mame (Yusuke Endoh) about 1 month ago
-
Ractor.shareable_proc { }
returns a Proc that is shareable between ractors. In the proc,self
is nil. -
Ractor.shareable_proc(self: 42) { }
returns a Proc that is shareable between ractors. In the proc,self
is42
. -
Ractor.shareable_lambda
returns a lambda-version ofRactor.shareable_proc
.
Updated by mame (Yusuke Endoh) about 1 month ago
The followings are also approved; changing an existing proc object to shareable should be prohibited.
- raise on
Ractor.make_shareable(proc)
- raise on
Ractor.make_shareable(proc, copy: true)
Updated by Eregon (Benoit Daloze) 26 days ago
Great to hear, this makes a lot of sense and addresses the original semantics issue perfectly.
Updated by tenderlovemaking (Aaron Patterson) 19 days ago
I think these make sense, but I would also like to propose that Ractor.shareable_proc
take a block that isn't a literal and returns a new proc that is shareable.
For example:
not_shareable = ->{ ... }
shareable = Ractor.shareable_proc(¬_shareable)
I think forcing the proc to be a literal put a very big limitation on Ractor usefulness. A simple example is a Sinatra app:
# A simple Sinatra app
get "/" do
"Hello world"
end
Since the Sinatra API uses procs, we wouldn't be able to serve the Sinatra request from a Ractor. If we can pass a non-literal block to Ractor.shareable_proc
, I think that would solve the issue.
Updated by Eregon (Benoit Daloze) 19 days ago
· Edited
@tenderlovemaking (Aaron Patterson) The issue with that is it still breaks the block semantics as in the OP description, specifically reading of captured variables inside the block is snapshotted for Ractor-shareable-blocks:
$ ruby -e 'count = 0; b = nil.instance_exec { -> { count } }; p b.call; count += 1; p b.call'
0
1
$ ruby -e 'count = 0; b = nil.instance_exec { -> { count } }; b2 = Ractor.shareable_proc(&b); p b2.call; count += 1; p b2.call'
0
0
It's the general guarantee in Ruby that a given literal block always behave the same, e.g. it's either a proc or lambda, but not both (except when using send(condition ? :proc : :lambda) { ... }
but that's explicit then), and in this case it's either a block respecting updated captures variables or not.
So how to keep this important guarantee (i.e. the block author can rely on captured variables to behave as they always have been for Ruby blocks and respect reassignments) and allow the flexibility you want?
BTW Ractor.make_shareable
on a Proc which assigns captured variables is an error (good, better to fail early than silently ignore the write):
$ ruby -e 'count = 0; b = nil.instance_exec { -> { count += 1 } }; Ractor.make_shareable b'
-e:1:in 'Ractor.make_shareable': can not make a Proc shareable because it accesses outer variables (count). (ArgumentError)
from -e:1:in '<main>'
Maybe one way would be for Ractor.shareable_proc
to be an error if there is any code around that Proc assigning any captured variable?
It can't detect binding
and eval
though, so that's still not complete.
One way for that Sinatra case would be to write:
get "/", &Ractor.shareable_proc {
"Hello world"
}
but this only really works if there are no captured variables, or captured variables are not reassigned and the contents of captured variables is shareable, so probably in many realistic cases it doesn't work anyway.
Updated by tenderlovemaking (Aaron Patterson) 17 days ago
I understand your argument, but I don't agree this is an issue.
ruby -e 'count = 0; b = nil.instance_exec { -> { count } }; b2 = Ractor.shareable_proc(&b); p b2.call; count += 1; p b2.call'
In this code we've explicitly converted b
to a shareable proc, b2
. The value of count
is internally consistent from the standpoint of b2
since shareable_proc
explicitly copied / disconnected the captured environment. I understand the problem you're pointing out, but I'm not convinced it would be a big deal in practice.
get "/", &Ractor.shareable_proc {
"Hello world"
}
This can't be a serious suggestion? It's basically saying that no existing Sinatra code could run inside a Ractor based webserver. If we had new syntax for Ractor.shareable_proc
, I could see that being easier to swallow, but this doesn't seem acceptable (to me at least).
Updated by Eregon (Benoit Daloze) 17 days ago
tenderlovemaking (Aaron Patterson) wrote in #note-18:
This can't be a serious suggestion? It's basically saying that no existing Sinatra code could run inside a Ractor based webserver. If we had new syntax for
Ractor.shareable_proc
, I could see that being easier to swallow, but this doesn't seem acceptable (to me at least).
Yeah, I understand your concern, and I meant this mostly as a workaround, while finding what other parts of Ractor prevents using Ractor for realistic Ruby code.
OTOH I'm rather skeptical that even if Ractor.shareable_proc
would be allowed with a non-literal block that we'd be able to run Sinatra (or Rails, etc) apps on Ractors.
Not even MSpec runs on Ractor, and that's pretty simple logic, yet making it Ractor-compatible in a non-ugly-and-complicated way seems very hard.
I think it would be good to have a construct (be it syntax or a Kernel or Proc method), independent of Ractor, to create a Proc which snapshots its environment, and is not allowed to write to its environment.
That way, that Proc would have the same semantics whether Ractors are used or not.
That concept on its own is very useful for optimizations and JITs, in fact TruffleRuby already has this functionality internally.
One complication though is it's pretty expensive to do this, as on every call to that method it allocates a new Proc and potentially copies + change the bytecode to replace captured variable reads with their values (thought there might be other ways to do this).
Having it as syntax or an intrinisified method (which would mean cannot be redefined and must be detectable at parse time, no metaprogramming call to it) would help.
It would be useful for define_method
too, and would mean methods defined with define_method
could be as fast as def
when called.
To make it useful for Ractor we'd need that construct to also support setting the receiver, as in the Ractor.shareable_proc(self: 42) { }
case from above.
And it would also need to either check that captured variable values are shareable, or make them shareable. That part is a bit weird when not using Ractors though, especially making shareable.
Checking for shareable seems better anyway, because making them shareable would need to copy to be safe in general, but maybe the user want to do it inplace if they know that's safe, etc.
It could be something like:
captured = 7
p = Proc.isolated(self: 6) { self * captured }
captured = 10
p.call # => 42
Syntax seems better-defined for the semantics and probably would look cleaner, but I'm not sure what would be a good syntax, and then of course it can't work on older Ruby versions at all, even when not using Ractors.
Unless we use something cheeky like proc { |a,b; isolated| }
maybe, but that wouldn't allow setting the self (would have to be done with instance_exec
around it), and would have different semantics on different versions which is not great.
Updated by tenderlovemaking (Aaron Patterson) 17 days ago
Eregon (Benoit Daloze) wrote in #note-19:
tenderlovemaking (Aaron Patterson) wrote in #note-18:
This can't be a serious suggestion? It's basically saying that no existing Sinatra code could run inside a Ractor based webserver. If we had new syntax for
Ractor.shareable_proc
, I could see that being easier to swallow, but this doesn't seem acceptable (to me at least).Yeah, I understand your concern, and I meant this mostly as a workaround, while finding what other parts of Ractor prevents using Ractor for realistic Ruby code.
OTOH I'm rather skeptical that even ifRactor.shareable_proc
would be allowed with a non-literal block that we'd be able to run Sinatra (or Rails, etc) apps on Ractors.
Not even MSpec runs on Ractor, and that's pretty simple logic, yet making it Ractor-compatible in a non-ugly-and-complicated way seems very hard.
I don't have any numbers, but my intuition is that most non-literal, global procs (procs that are reachable via the application), including ones provided by the user, don't depend on environment mutations that occur outside of the block. Outside of iterators, depending on mutations to one's captured environment would be extremely confusing and hard to track behavior, so IME most people don't do it in practice.
But besides that, I'm just proposing that Ractor.shareable_proc
be allowed to take a non-literal block. This would allow frameworks to pick and choose which lambdas should be "safe" for a Ractor.
If someone had a Sinatra app that depended on env mutations like this:
counter = 0
get "/" do # assume the proc gets copied here so counter is 0
"Hello world #{counter}"
counter += 1
end
counter += 123
I think a user running a Ractor webserver would report an issue with the webserver since it would behave "as expected" on a non-Ractor webserver.
I think it would be good to have a construct (be it syntax or a Kernel or Proc method), independent of Ractor, to create a Proc which snapshots its environment, and is not allowed to write to its environment.
That way, that Proc would have the same semantics whether Ractors are used or not.
I'm not sure why it matters whether the proc can write to the captured env or not, since we can just copy the environment and attach it to the proc. To me this is similar to dup'ing an object. If I mutate one copy, I don't expect those mutations to be reflected in the other.
That concept on its own is very useful for optimizations and JITs, in fact TruffleRuby already has this functionality internally.
One complication though is it's pretty expensive to do this, as on every call to that method it allocates a new Proc and potentially copies + change the bytecode to replace captured variable reads with their values (thought there might be other ways to do this).
Having it as syntax or an intrinisified method (which would mean cannot be redefined and must be detectable at parse time, no metaprogramming call to it) would help.
It would be useful fordefine_method
too, and would mean methods defined withdefine_method
could be as fast asdef
when called.
I've thought of doing this by mprotecting the escaped env and only allowing reads 😅
To make it useful for Ractor we'd need that construct to also support setting the receiver, as in the
Ractor.shareable_proc(self: 42) { }
case from above.
And it would also need to either check that captured variable values are shareable, or make them shareable. That part is a bit weird when not using Ractors though, especially making shareable.
Checking for shareable seems better anyway, because making them shareable would need to copy to be safe in general, but maybe the user want to do it inplace if they know that's safe, etc.
It could be something like:captured = 7 p = Proc.isolated(self: 6) { self * captured } captured = 10 p.call # => 42
Syntax seems better-defined for the semantics and probably would look cleaner, but I'm not sure what would be a good syntax, and then of course it can't work on older Ruby versions at all, even when not using Ractors.
Unless we use something cheeky likeproc { |a,b; isolated| }
maybe, but that wouldn't allow setting the self (would have to be done withinstance_exec
around it), and would have different semantics on different versions which is not great.
Anyway, by not allowing non-literal blocks, we can't even abstract calls to shareable_proc
and I think that really hampers the usefulness of Ractors. I really think we should find a way to support non-literal blocks.
Updated by tenderlovemaking (Aaron Patterson) 5 days ago
After chatting a bit with @Eregon (Benoit Daloze), I'd like to make a proposal about Ractor shareable procs.
I think it's going to be very difficult to port existing code to use Ractors if we can't have some way of creating shareable procs from non-literal blocks. For example we wouldn't be able to use a Ractor based webserver with existing Sinatra applications.
With regard to local variables captured in environments, I think we should have the following rules:
- Allow as many local writes as the user wants before the lambda captures the environment
- Disallow any writes to the "now shared" environment after the lambda captures it
- Disallow writes to the shared environment from inside the lambda
These rules should only apply to local variables declared outside the block.
Here are some examples to demonstrate the rules:
## OK
foo = 123
Ractor.shareable_proc { foo }
p foo
# NG: Raise an exception when creating a shareable proc
# The reason is because we're setting a local after the proc
# is created. This can cause possible race conditions / crashes
foo = 123
Ractor.shareable_proc { foo }
foo = Object.new # reassignment isn't allowed
# NG: Raise an exception when creating a shareable proc
# The proc shouldn't be allowed to mutate a shared environment
foo = 123
Ractor.shareable_proc {
foo += 1 # Not allowed because other env can't see mutation
}
# Works, but value of `foo` may be unexpected.
# The second assignment should be ignored because the env is copied
foo = 123
Ractor.shareable_proc { foo }
eval("foo = Object.new")
# Works, but value of `foo` outside the proc may be unexpected
# Proc only mutates its copied env
foo = 123
Ractor.shareable_proc {
eval("foo += 1")
}
If we can enforce these rules, then I think it should be fine for Ractor.shareable_proc
to take a block that isn't a literal. I also think this would allow a much larger number of existing proc objects to work safely with Ractors.
Updated by Eregon (Benoit Daloze) 4 days ago
· Edited
I agree this should make Ractor.shareable_proc
safe enough with a non-literal block and address the semantics issue in the OP.
Cases with eval
seem not possible to know and it seems rare enough to be OK to behave with "snapshot/copy of environment" semantics in that case.
The rules are quite similar to the rules for the Ractor.new {}
block, which could use the same rules (Ractor.new {}
is currently stricter as it does not allow reading an environment variable, but it should/could for convenience & consistency).
Of course there should still be a check that environment variables the block captures are shareable, which Ractor.make_shareable(Proc)
already does:
nil.instance_exec { a = Object.new; Ractor.make_shareable(proc { a }) }
# can not make shareable Proc because it can refer unshareable object #<Object:0x00007ff7ef878360> from variable 'a' (Ractor::IsolationError)
And same for the self
around the block:
a = 1; Ractor.make_shareable(proc { a })
# Proc's self is not shareable: #<Proc:0x00007faf42016f00 -e:1> (Ractor::IsolationError)
Regarding the reason for the 2nd example in @tenderlovemaking (Aaron Patterson) 's comment, it wouldn't cause race conditions / crashes because shareable_proc
already makes a copy of the environment anyway (Aaron told me).
But it would cause the semantics issue in the OP, that an assignment to outer variable is not observed and breaks Ruby block semantics, hence that case must be an exception from Ractor.shareable_proc
, e.g. a Ractor::IsolationError
or ArgumentError
.
EDIT: Ractor.shareable_proc
should also make a copy of the given Proc, to ensure it never mutates the original Proc in any way.