Bug #19990
closedCould we reconsider the second argument to Kernel#load?
Description
The documentation of Kernel#load
says:
If the optional wrap parameter is true, the loaded script will be executed under an anonymous module, protecting the calling program’s global namespace. If the optional wrap parameter is a module, the loaded script will be executed under the given module. In no circumstance will any local variables in the loaded file be propagated to the loading environment.
I'd like to ask you to reconsider this feature.
First of all, "protecting the calling program" is not really accomplished because the loaded file may still do this
class ::C # defines ::C regardless of the second argument
end
Another example, if the caller defines a module M
, then the loaded program can also define things in M
:
class M::C # defines ::M::C regardless of the second argument
end
It does not even need a leading ::
.
So, the "protection" is not really there.
In addition to that, this is not transparent for the code being loaded either. For example, let's take this program:
class A
end
module X
::A # could be needed if your own lookup had an A somewhere you want to skip
end
the Ruby programmer expects that to work. But with this feature, on paper, nobody knows if it wil work. How can you ship code confidently?
So, the documentation should say:
It kind of protects, but not really. Also, the loaded file may not work as expected, or may not even be loadable at all.
That hypothetical documentation suggests to me it would be worth revisiting this feature.
In Ruby, as it is today, things are global. The language does not have features to really isolate code as containers do, for example.
I believe the 2nd argument to Kernel#load
steers the API in a direction that is not consistent with the language, and provides a feature that is only partial and cannot satisfy what it promises.
I'd be in favor of deprecating and eventually removing this API.
Updated by rubyFeedback (robert heiler) about 1 year ago
Just two short things:
-
I believe it should be assessed whether this proposed change may
affect existing gems and ruby devs, e. g. how many make use of this
feature. Personally it would not affect me as I do not use the second
parameter to load() (and I use require most of the time, anyway), but
it may affect ruby devs, so that should be included in the suggestion
as a potential trade-off (e. g. folks having to modify their code). -
fxn wrote:
So, the documentation should say:
"It kind of protects, but not really"
I think this change would be problematic, because other ruby users
could wonder what this means, e. g. "it kind of protects, but not
really". This would be confusing to some people and I don't think this
is a good change to the documentation. It would kind of be similar to
this as an extreme example:
"Flowers are plants, but not really."
So the documentation should not be internally inconsistent here.
Updated by fxn (Xavier Noria) about 1 year ago
Sure, it is public API, so the possibility of people using it is a given. I don't think this is used widely, but anyway, that is why my suggestion includes a deprecation cycle.
I don't think this is a good change to the documentation
That is the point. That documentation was an hypothetical exercise only, not a proposal.
It is a reductio ad absurdum argument.
Documentation has to describe the feature. The current feature is not fully described today, it would need to be extended with those remarks. The remarks do not sound good at all. Therefore, the feature itself is dubious.
Updated by jeremyevans0 (Jeremy Evans) about 1 year ago
@fxn (Xavier Noria) is correct that module wrapping offers no protection. We should remove the "protecting the calling program’s global namespace" part. However, the other parts of the documentation are accurate. Just because you are wrapping code in a module doesn't protect the code:
module M
class ::C; end
class A::B; end
end
@fxn (Xavier Noria), assuming we remove the part related to protection, do you still believe we need to add other remarks? It seems to me we do not have any issues if we remove the part about protection, because the part about module wrapping seems easy to understand.
Updated by fxn (Xavier Noria) about 1 year ago
@jeremyevans0 (Jeremy Evans) We certainly could consider that.
Perhaps instead of "protect", the documentation could say what is exactly doing. Like, "pushes the module to the nesting, and changes where top-level methods are defined". This written in a more precise way, perhaps.
However, we still have the lack of transparency. Which is the nesting in your program? Well, it is out of your control! How can you program without knowing which is your nesting?
That is what the last example shows:
class A
end
module X
::A # could be needed if your own lookup had an A somewhere you want to skip
end
That program would run normally, but fail if loaded with load program, true
.
One way to smooth that out could be: "You cannot load arbitrary code and hope it works. Using this feature needs coordination with the loaded file" (off the top of my head too, wording to be polished).
I don't know if that is desirable as a feature, though. What do you think?
Updated by fxn (Xavier Noria) about 1 year ago
Let me add that example with ::A
is just one way to illustrate lack of transparency.
There are many others, for example:
class String # expected to reopen ::String
def foo
end
end
"".foo # expected to work
That program works normally, but raises NoMethodError
if loaded with load program, true
.
Bottom line is, the moment you unilaterally change the nesting to the loaded file, all kind of unexpected things may happen because the assumptions of the programmer of that file do not hold anymore (unless there is coordination).
This cannot be transparent. It is just a matter of looking for the consequences of losing your control over the nesting.
Updated by fxn (Xavier Noria) about 1 year ago
And to help people not familiar with nesting and the constant resolution algorithms explore the ramifications, another example that looks even more innocent:
X = 1
class C
X = 2
end
class D < C
p X
end
That program prints 1 or 2, depending on the second argument of Kernel#load
. This is so because the lookup first checks the nesting, and you lost control of the nesting.
Lack of "protection" in the caller, and lack of transparency in the callee, is what makes me suggest to reconsider this API.
Updated by kjtsanaktsidis (KJ Tsanaktsidis) about 1 year ago
Because of the reasons you identify, Kernel#load
is absolutely not some kind of general module containerisation feature (of the kind perhaps envisaged by https://bugs.ruby-lang.org/issues/19744). But I think it does have a limited use-case for loading particular scripts which are known to pollute the global namespace by accident in an undesirable way.
Although as I write this, I realise Ruby already has this functionality:
irb(main):001> File.read('broken_script.rb')
=> "X = 8\n"
irb(main):002> broken_toplevel_binding = Module.new.tap { _1.class_eval File.read('broken_script.rb') }
=> #<Module:0x00000001048bdf30>
irb(main):003> X
(irb):3:in `<main>': uninitialized constant X (NameError)
from /Users/ktsanaktsidis/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/irb-1.8.3/exe/irb:9:in `<top (required)>'
from /Users/ktsanaktsidis/.rbenv/versions/3.2.2/bin/irb:25:in `load'
from /Users/ktsanaktsidis/.rbenv/versions/3.2.2/bin/irb:25:in `<main>'
irb(main):004> broken_toplevel_binding::X
=> 8
Since Kernel#load
can subtly do the wrong thing on some files that are loaded, and since it's possible to obtain that behaviour explicitly in the cases where it really is desirable, I think deprecating it makes sense.
Updated by fxn (Xavier Noria) about 1 year ago
Yes, besides the module
and class
keywords, the other thing that can change the nesting is the eval STRING
family. For example,
>> String.class_eval 'Module.nesting'
=> [String]
>> String.instance_eval 'Module.nesting'
=> [#<Class:String>]
>> ''.instance_eval 'Module.nesting'
=> [String]
>> Object.new.instance_eval 'Module.nesting'
=> [Object]
However, this API seems more edge and ad-hoc to me, I believe it is acceptable that it has custom rules (that should be documented). I believe these are fine.
Updated by Eregon (Benoit Daloze) about 1 year ago
- Related to Feature #19744: Namespace on read added
Updated by jeremyevans (Jeremy Evans) 11 months ago
- Status changed from Open to Closed
Applied in changeset git|eb8df2fa7aa7b008bd8dbce765694635a564e8f9.
Update Kernel#load documentation to remove phrase related to protection
Code loaded via Kernel#load can modify the global namespace even
if the wrap parameter is provided.
Fixes [Bug #19990]