Feature #18461
closedclosures are capturing unused variables
Description
def foo
a = 1
->{}
end
p foo.binding.local_variables # [:a]
Shouldn't a
be optimized away? Like v8 does (https://bugs.chromium.org/p/v8/issues/detail?id=3491)
Updated by jeremyevans0 (Jeremy Evans) about 3 years ago
This is Ruby, you can always modify the value of the local variable later, even if it isn't accessed inside the proc itself.
Such access can affect other scopes:
def foo
a = 1
[->{}, ->{}]
end
x, y = foo
x.binding.local_variable_get(:a) # => 1
y.binding.local_variable_get(:a) # => 1
x.binding.local_variable_set(:a, 2)
x.binding.local_variable_get(:a) # => 2
y.binding.local_variable_get(:a) # => 2
I would guess that changing this to optimize away the local variable in the proc's binding would break code.
Updated by bughit (bug hit) about 3 years ago
Interning string literals also breaks code, but it's worth moving in that direction, and here also.
Note the following quote from chromium/v8:
The only solution I could think of is that whenever devtools is on, we would deopt all code and recompile with forced context allocation. That would dramatically regress performance with devtools enabled though.
"forced context allocation" refers to capturing everything in scope as ruby currently does
so this optimization is considered "dramatically" valuable
Updated by Eregon (Benoit Daloze) about 3 years ago
I think Ruby JIT implementers are already aware of this, but these are the current semantics of Ruby.
I think JRuby does some optimization like this (only keeps the outside scope if there is binding/eval
inside the block, but JRuby does not account for aliases)
That optimization is however unsound and incompatible, although maybe in practice not that big a deal.
$ juby -ve 'module Kernel; alias b binding; end; def foo; a = 1; -> { b }; end; p foo.call.local_variable_get :a'
jruby 9.3.2.0 (2.6.8) 2021-12-01 0b8223f905 OpenJDK 64-Bit Server VM 11.0.10+9 on 11.0.10+9 +jit [linux-x86_64]
-e:1: warning: assigned but unused variable - a
-e:2: warning: Kernel#binding accesses caller method's state and should not be aliased
NameError: local variable `a' not defined for #<Binding:0x79df80a4>
As long as Proc#binding exists and it's possible to alias binding
, I think it's problematic to change semantics for compatibility and soundness.
Proc#binding's purpose AFAIK is to access variables in the outer scope (it cannot variables defined inside the block).
But, maybe we could deprecate Proc#binding and remove it, I think that would be a good start.
What do committers and people think about that?
Updated by bughit (bug hit) about 3 years ago
But, maybe we could deprecate Proc#binding and remove it, I think that would be a good start.
There's no such need. If the capture is optimized then Proc#binding will continue giving access to what has been captured, rather than everything that could be captured.
Updated by jeremyevans0 (Jeremy Evans) about 3 years ago
bughit (bug hit) wrote in #note-4:
But, maybe we could deprecate Proc#binding and remove it, I think that would be a good start.
There's no such need. If the capture is optimized then Proc#binding will continue giving access to what has been captured, rather than everything that could be captured.
This is not how it should be implemented, IMO. Optimizing such procs is fine, as long as such optimization is transparent. However, as soon as you call proc.binding.local_variables, you should see all local variables in scope (deoptimizing if necessary). Otherwise you break backwards compatibility.
Updated by bughit (bug hit) about 3 years ago
However, as soon as you call proc.binding.local_variables, you should see all local variables in scope (deoptimizing if necessary)
That's not possible. Captured variables are on the heap, uncaptured ones are on the stack and disappear with the stack frame. You can't get them back later.
Otherwise you break backwards compatibility.
Which needs to judged against the value of what's being gained. I suspect that dependence on unoptimized capture is not common, important, valuable.
Updated by bughit (bug hit) about 3 years ago
uncaptured ones are on the stack and disappear with the stack frame. You can't get them back later.
That's actually one of the major benefits of optimized capture, you're not wasting memory holding on to things you don't need. There could be huge object graphs pointlessly inadvertently kept alive by unoptimized capture.
Updated by Eregon (Benoit Daloze) about 3 years ago
bughit (bug hit) wrote in #note-4:
There's no such need. If the capture is optimized then Proc#binding will continue giving access to what has been captured, rather than everything that could be captured.
That's of no use then, that's just not the semantics of Proc#binding, which is to capture every variable at the place the block is defined.
The block could also use eval
, or an alias of eval
and then how could you know if the block would access an outer variable?
Hence Proc#binding would need to be removed before we can sensibly optimize any of that, and we'd need constraints on aliasing eval's and binding's probably.
Ruby does reflect updates in the original frame, so you'd still have an indirection for every such outer variable, both by the block but also in the original frame (at least from the point the block is defined).
There could be huge object graphs pointlessly inadvertently kept alive by unoptimized capture.
I doubt this is a big issue in practice, there seems to be few reports about it and most Proc instances are not kept alive long. Procs also can't be serialized.
It's an interesting and well-known optimization, but I think one that would need to significantly change Ruby semantics to be feasible without breaking many things.
I think it's valuable to figure out exactly what we'd need to change and if the Ruby team/matz would support that.
OTOH, I believe just breaking compatibility blindly to optimize will never be accepted.
Updated by bughit (bug hit) about 3 years ago
That's of no use then, that's just not the semantics of Proc#binding, which is to capture every variable at the place the block is defined.
If you are suggesting that everything is set in stone, that's obviously false. Changes can be justified when not catastrophically disruptive and when the benefits outweigh the disruption.
Hence Proc#binding would need to be removed
There no logic in this conclusion. Returning a binding that represents what the proc actually captures is not worse than having no binding.
The block could also use eval
V8 handles that, detecting eval and deopting, so ruby can too.
but I think one that would need to significantly change Ruby semantics to be feasible without breaking many things.
"Significantly" is an overstatement. What many things will it break? Taking a dependency on procs capturing variables they don't use, does not seem like a realistic and common use case.
Updated by Eregon (Benoit Daloze) about 3 years ago
bughit (bug hit) wrote in #note-9:
V8 handles that, detecting eval and deopting, so ruby can too.
So what does V8 do then if eval
is aliased e.g., to e
?
BTW, your replies feel rather aggressive to me.
You are speaking to CRuby core members and Ruby JIT implementers, so I'd think it'd be nice to consider their opinion and not just dismissing them (that's what I feel in several of your replies in this issue).
For example they might know Ruby semantics better.
Also I would personally appreciate to know a little bit more about you, you seem to know quite a bit V8. Did you work on V8 maybe?
Updated by matz (Yukihiro Matsumoto) over 2 years ago
- Status changed from Open to Rejected
Rejected. Ruby and JavaScript are different languages.
Upon existence of eval
, the impact of removing unused variables from closures are far bigger in Ruby. The optimization should not be mandatory.
If there's no side effect (e.g. eval
is never called with the closure), the unused variables might be removed from closures. But it seems nearly impossible to ensure no side effect.
Matz.