Feature #21311
Namespace on read (revised)
Status: Open
Description
This replaces #19744
Concept
This proposes a new feature to define virtual top-level namespaces in Ruby. Those namespaces can require/load libraries (either .rb or native extension) separately from other namespaces. Dependencies of required/loaded libraries are also required/loaded in the namespace.
This feature will be disabled by default at first, and will be enabled by the environment variable RUBY_NAMESPACE=1 as an experimental feature. (It could possibly be enabled by default in the future.)
"on read" approach
The "on write" approach defines namespaces on the loaded side, that is, in the library files themselves. For example, Java packages are declared in the .java files, and that declaration is what separates namespaces from each other. It can be implemented very easily, but it requires all libraries to be updated with a package declaration. (In my opinion, that is almost impossible in the Ruby ecosystem.)
The "on read" approach is to create namespaces and then require/load applications and libraries in them. Programmers can control namespace separation at the "read" time. So, we can introduce the namespace separation incrementally.
Motivation
The "namespace on read" can solve the 2 problems below, and can make a path to solve another problem:
- Avoiding name conflicts between libraries
  - Applications can safely require two different libraries that use the same module name.
- Avoiding unexpected globally shared modules/objects
  - Applications can make an independent/unshared module instance.
- Multiple versions of gems can be required
  - Application developers will have fewer version conflicts between gem dependencies if RubyGems/Bundler supports the namespace on read. (Support from RubyGems/Bundler and/or other packaging systems will be needed.)
For the motivation details, see [Feature #19744].
How we can use Namespace
# app1.rb
PORT = 2048
class App
  def self.port = ::PORT
  def val = PORT.to_s
end
p App.port # 2048

# app2.rb
class Integer
  def double = self * 2
end
PORT = 2048.double
class App
  def self.port = ::PORT
  def val = PORT.double.to_s
end
p App.port # 4096

# main.rb - executed as `ruby main.rb`
ns1 = Namespace.new
ns1.require('./app1') # 2048
ns2 = Namespace.new
ns2.require('./app2') # 4096
PORT = 8080
class App
  def self.port = ::PORT
  def val = PORT.to_s
end
p App.port # 8080
p App.new.val # "8080"
p ns1::App.port # 2048
p ns1::App.new.val # "2048"
p ns2::App.port # 4096
p ns2::App.new.val # "8192"
1.double # NoMethodError
Namespace specification
Types of namespaces
There are two namespace types: "root" and "user". There is exactly one root namespace in a Ruby process, and Ruby programmers can create as many user namespaces as they want.
Root namespace
The root namespace is a unique namespace defined when a Ruby process starts. It contains only built-in classes/modules/constants, which are available without any require calls, including RubyGems itself (when --disable-gems is not specified).
Here, "built-in" classes/modules are the classes/modules accessible when evaluation of the user's script starts, without any require/load calls.
User namespace
A user namespace is a namespace that runs users' Ruby scripts. The "main" namespace is the namespace that runs the user's .rb script specified by the ruby command-line argument. Other user namespaces ("optional" namespaces) can be created by calling Namespace.new.
In user namespaces (both main and optional), built-in class/module definitions are copied from the root namespace, and other new classes/modules are defined in the namespace, separately from other (root/user) namespaces.
In the main namespace, newly defined classes/modules are top-level classes/modules, like App. In optional namespaces, classes/modules are defined under the namespace (subclass of Module), like ns::App.
Within that namespace ns, ns::App is accessible as App (or ::App). There is no way to access the main namespace's App from code in a different namespace ns.
Constants, class variables and global variables
Constants and class variables of built-in classes, as well as global variables, are also separated by namespace. Values assigned to class/global variables in one namespace are invisible in other namespaces.
Methods and procs
Methods defined in a namespace run in the namespace where they were defined, even when called from other namespaces.
Procs created in a namespace also run in the namespace where they were created.
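As an analogy only, current Ruby already resolves constants from the definition site of a method or proc rather than from the call site. The sketch below uses ordinary modules (the names Alpha, Beta, PORT_READER and call_from_here are made up for illustration); the proposal applies a similar "definition site wins" rule at the namespace level.

```ruby
module Alpha
  PORT = 2048
  # The lambda and the method both capture Alpha's lexical scope.
  PORT_READER = -> { PORT }
  def self.port = PORT
end

module Beta
  PORT = 4096
  # Calling from Beta does not make the callable see Beta::PORT.
  def self.call_from_here(callable) = callable.call
end

puts Beta.call_from_here(Alpha::PORT_READER)  # 2048
puts Beta.call_from_here(Alpha.method(:port)) # 2048
```

This does not use Namespace at all, so it runs on any Ruby 3.x; it only illustrates the lookup rule.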
Dynamic link libraries
Dynamic link libraries (typically .so files) are loaded in namespaces in the same way as .rb files.
Open class (Changes on built-in classes)
In user namespaces, built-in class definitions can be modified. But those operations are processed as copy-on-write of class definition from the root namespace, and the changed definitions are visible only in the (user) namespace.
Definitions in the root namespace are not modifiable from other namespaces. Methods defined in the root namespace run only with root-namespace definitions.
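A minimal sketch of how one might check this isolation, assuming the Namespace.new and ns.require API shown earlier in this description. The helper names and the temporary file name are hypothetical, and the check is skipped unless the process was started with RUBY_NAMESPACE=1 and the Namespace constant exists.

```ruby
require "tmpdir"

# Assumption: the feature is usable only when the process was started with
# RUBY_NAMESPACE=1 and the Namespace constant is defined.
def namespace_available?
  ENV["RUBY_NAMESPACE"] == "1" && Object.const_defined?(:Namespace)
end

# Loads a file that reopens Integer inside a fresh namespace, then reports
# whether the new method is visible from the calling namespace.
def double_leaked_from_namespace?
  Dir.mktmpdir do |dir|
    path = File.join(dir, "patch_integer.rb")
    File.write(path, "class Integer\n  def double = self * 2\nend\n")
    ns = Namespace.new
    ns.require(path)
    1.respond_to?(:double)
  end
end

if namespace_available?
  puts "Integer#double leaked: #{double_leaked_from_namespace?}"
else
  puts "Namespace is not enabled; start Ruby with RUBY_NAMESPACE=1 to try this."
end
```

If the copy-on-write behavior works as described, the method added inside the namespace should not be visible to the caller.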
Enabling Namespace
Specify the RUBY_NAMESPACE=1 environment variable when starting Ruby processes. 1 is the only valid value here.
The Namespace feature can be enabled only when a Ruby process starts. Setting RUBY_NAMESPACE=1 after the Ruby script has started has no effect.
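The "only 1 is valid" rule can be sketched as a small check. The helper namespace_requested? is hypothetical and not part of the proposal; it only inspects an environment hash and says nothing about whether the interpreter actually enabled the feature.

```ruby
# Hypothetical helper: reports whether the environment asks for the
# experimental feature. Only the exact string "1" counts.
def namespace_requested?(env = ENV)
  env["RUBY_NAMESPACE"] == "1"
end

puts namespace_requested?({ "RUBY_NAMESPACE" => "1" })    # true
puts namespace_requested?({ "RUBY_NAMESPACE" => "true" }) # false
puts namespace_requested?({})                             # false
```

Because the variable is read at process start, assigning ENV["RUBY_NAMESPACE"] inside a running script would change what this helper returns without enabling anything.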
Pull-request
Updated by baweaver (Brandon Weaver) about 5 hours ago
As a proof of concept this is a very valuable idea, and will give users a chance to experiment with it.
I wonder about the long-term ergonomics of this though, and whether it may make sense to introduce in Ruby 4 a new keyword, namespace, that is stronger than module for wrapping:
namespace NamespaceOne
  require "./app1"
end
namespace NamespaceTwo
  require "./app2"
end
p NamespaceOne::App.port # 2048
p NamespaceOne::App.new.val # "2048"
p NamespaceTwo::App.port # 4096
p NamespaceTwo::App.new.val # "8192"
A require that is run inside of a namespace could serve the same function mentioned above, but could additionally provide an isolated environment for defining other code:
namespace Payrolls
  class Calculator; end
  private class RunTaxes; end
end
Payrolls::Calculator # can access
Payrolls::RunTaxes # raises violation error
namespace Payments
  class RecordTransaction; end
end
For Ruby 3.x I would agree that the proposed syntax is good for experimentation, but would ask that we consider making this a top-level concept in Ruby 4.x with a namespace keyword to fully isolate wrapped state.
Updated by fxn (Xavier Noria) about 4 hours ago · Edited
A few quick questions:
Assuming a normal execution context, nesting at the top level of a file is empty. Would it be also empty if the file is loaded under a namespace?
The description mentions classes and modules, which is kind of intuitive. They are relevant because they are the containers of constants. But, as we know, constants can store anything besides class and module objects. In particular, constants from the root namespace, recursively, can store any kind of object that internally can refer to any other object. There is a graph of pointers.
So, when a namespace is created, do we have to think that the entire object tree is deep cloned? (Maybe with CoW, but conceptually?) For example, let's imagine C::X is a string in the root namespace, and we create ns. Would ns::C::X.clear clear the string in both namespaces?
Global variables stay global I guess?
Updated by tagomoris (Satoshi Tagomori) about 3 hours ago
@baweaver I don't have a strong opinion about adding a namespace keyword, but having a block parameter on Namespace.new could provide a similar UX without changing syntax.
NamespaceOne = Namespace.new do
  require "./app1"
end
p NamespaceOne::App.port #=> 2048
This looks a little less smart, but may not be the worst option. Having Kernel#namespace could be an alternative idea.
NamespaceOne = namespace do
  require "./app1"
end
Updated by tagomoris (Satoshi Tagomori) about 2 hours ago
fxn (Xavier Noria) wrote in #note-2:
A few quick questions:
Assuming a normal execution context, nesting at the top level of a file is empty. Would it be also empty if the file is loaded under a namespace?
Yes. At that time, self will be a cloned (different) object from main in optional namespaces.
So, when a namespace is created, do we have to think that the entire object tree is deep cloned? (Maybe with CoW, but conceptually?)
Conceptually, yes. Definitions are deeply cloned. But objects (stored on constants, etc) will not be cloned (See below).
For example, let's imagine C::X is a string in the root namespace, and we create ns. Would ns::C::X.clear clear the string in both namespaces?
Yes. (I hope built-in classes/modules don't have such mutable objects, but they probably do :-( )
Global variables stay global I guess?
Global variables are also separated by namespace. Imagine $LOAD_PATH and $LOADED_FEATURES, which hold different sets of load paths and actually loaded file paths, and which should differ between namespaces.
Protecting against unexpected changes to global variables by libraries or other apps is part of the namespace concept.
Updated by tagomoris (Satoshi Tagomori) about 2 hours ago
- Description updated (diff)
Updated by fxn (Xavier Noria) about 1 hour ago · Edited
Thanks @tagomoris.
Conceptually, yes. Definitions are deeply cloned. But objects (stored on constants, etc) will not be cloned (See below).
Let me understand this one better.
In Ruby, objects are stored in constants. Conceptually, a constant X storing a string object and a constant C storing a class object are not fundamentally different. Do you mean namespace creation traverses constant trees, clones only the values that are class and module objects, and keeps the rest of the object references, which become shared between namespaces?
Even in the case of classes and modules, what happens to the objects in their ivars?
I do not know about builtin, but in the case of user-defined classes/modules, I don't think we can assume they do not mutate their state. We could have 2500 of them in the root namespace when the namespace is created.
Updated by tagomoris (Satoshi Tagomori) 23 minutes ago
fxn (Xavier Noria) wrote in #note-6:
In Ruby, objects are stored in constants. Conceptually, a constant X storing a string object and a constant C storing a class object are not fundamentally different. Do you mean namespace creation traverses constant trees, clones only the values that are class and module objects, and keeps the rest of object references, which become shared between namespaces?
For example, String is a built-in class and a Class object value, stored as the ::String constant. In a namespace ns1, we can change the String definition (for example, by adding a constant String::X = "x").
But even in that case, the value of String is identical. ::String == ns1::String returns true.
That means the value (VALUE in the CRuby world) is identical and is not copied when namespaces are created, but the backing class definitions (struct rb_classext_t) are different, and those are the CoW targets.
Even in the case of classes and modules, what happens to the objects in their ivars?
Class ivars (instance variable tables of classes) are copied, but the ivar values are not copied. It's similar to constants (constant tables) of classes.
I do not know about builtin, but in the case of user-defined classes/modules, I don't think we can assume they do not mutate their state. We could have 2500 of them in the root namespace when the namespace is created.
In the namespace context, "builtin classes/modules" are classes and modules defined before any user-script evaluation. (I'll update the ticket description soon.)
The total counts are: 685 classes, 40 modules (and 51 internal iclasses). Any user-defined classes/modules are not defined in the root namespace.
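The "tables are copied, values are not" behavior described above can be modeled, as an analogy only, with a shallow copy of a Hash. This is not how CRuby implements rb_classext_t, and whether the copy is eager or copy-on-write is an implementation detail; the variable names are illustrative.

```ruby
# Model a constant table as a Hash from names to object references.
root_constants = { X: +"shared" }

# A shallow copy gives the namespace its own table with the same values.
ns_constants = root_constants.dup

# Adding an entry changes only the namespace's table.
ns_constants[:Y] = "only in ns"
puts root_constants.key?(:Y)                     # false

# Mutating a shared value is visible through both tables.
ns_constants[:X].clear
puts root_constants[:X].empty?                   # true
puts root_constants[:X].equal?(ns_constants[:X]) # true
```

This matches the answer given for ns::C::X.clear: the string object itself is shared, so clearing it affects both sides.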
Updated by tagomoris (Satoshi Tagomori) 22 minutes ago
- Description updated (diff)
Updated by fxn (Xavier Noria) 7 minutes ago
any user-defined classes/modules are not defined in the root namespace.
Ah, that is key.
So, what happens in this script?
# main.rb
App = Class.new
ns1 = Namespace.new
ns1.require("./app1") # defines/reopens App
Do App and ns1::App have the same object ID?
Or does the feature assume that, if you want to isolate things, that has to be the first thing you do, before creating any constant, global variable, etc.?
Updated by byroot (Jean Boussier) 6 minutes ago
having a block parameter on Namespace.new could provide similar UX without changing syntax.
That wouldn't handle constant definitions correctly though. Similar to how people get tricked by Struct.new do today.
Foo = Struct.new(:bar) do
  BAZ = 1 # This is Object::BAZ
end
That's why I filed [Feature #20993], it would allow you to do:
module MyNamespace = Namespace.new
  BAZ = 1 # This is MyNamespace::BAZ
end
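The Struct.new pitfall mentioned above can be checked in current Ruby. The sketch below also shows const_set as a workaround that exists today; the name Qux is made up for illustration, and none of this depends on Namespace.

```ruby
# Constant assignment inside a block uses the block's lexical scope,
# which is the top level here, so BAZ is defined on Object.
Foo = Struct.new(:bar) do
  BAZ = 1
end

puts Object.const_defined?(:BAZ)     # true
puts Foo.const_defined?(:BAZ, false) # false (not defined on Foo itself)

# const_set targets the receiver (the new struct class) explicitly.
Qux = Struct.new(:bar) do
  const_set(:BAZ, 2)
end

puts Qux::BAZ # 2
```

A block passed to Namespace.new would presumably hit the same lexical-scope rule, which is the concern raised here.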