Project

General

Profile

Actions

Feature #18035

open

Introduce general model/semantic for immutability.

Added by ioquatix (Samuel Williams) over 3 years ago. Updated 10 months ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:104560]

Description

It would be good to establish some rules around mutability, immutability, frozen, and deep frozen in Ruby.

I see time and time again, incorrect assumptions about how this works in production code. Constants that aren't really constant, people using #freeze incorrectly, etc.

I don't have any particular preference but:

  • We should establish consistent patterns where possible, e.g.
    • Objects created by new are mutable.
    • Objects created by literal are immutable.

We have problems with how freeze works on composite data types, e.g. Hash#freeze does not impact children keys/values, same for Array. Do we need to introduce freeze(true) or #deep_freeze or some other method?

Because of this, frozen does not necessarily correspond to immutable. This is an issue which causes real world problems.

I also propose to codify this where possible, in terms of "this class of object is immutable" should be enforced by the language/runtime, e.g.

module Immutable
  def new(...)
    super.freeze
  end
end

class MyImmutableObject
  extend Immutable

  def initialize(x)
    @x = x
  end
  
  def freeze
    return self if frozen?
    
    @x.freeze
    
    super
  end
end

o = MyImmutableObject.new([1, 2, 3])
puts o.frozen?

Finally, this area has an impact to thread and fiber safe programming, so it is becoming more relevant and I believe that the current approach which is rather adhoc is insufficient.

I know that it's non-trivial to retrofit existing code, but maybe it can be done via magic comment, etc, which we already did for frozen string literals.

Proposed PR: https://github.com/ruby/ruby/pull/4879

Updated by ioquatix (Samuel Williams) over 3 years ago

  • Subject changed from Introduce general module for immutable by default. to Introduce general model/semantic for immutable by default.

Fix title.

Updated by duerst (Martin Dürst) over 3 years ago

This is mostly just a generic comment that may not be very helpful, but I can only say that I fully agree. Even before talking about parallel stuff (thread/fiber), knowing some object is frozen can be of help when optimizing.

One thing that might be of interest is that in a method chain, a lot of the intermediate objects (e.g. arrays, hashes) may be taken to be immutable because they are just passed to the next method in the chain and never used otherwise (but in this case, it's just the top object that's immutable, not necessarily the components, which may be passed along the chain).

Updated by ioquatix (Samuel Williams) over 3 years ago

Regarding method chains, one thing that's always bothered me a bit is this:

def foo(*arguments)
	pp object_id: arguments.object_id, frozen: arguments.frozen?
end

arguments = [1, 2, 3].freeze
pp object_id: arguments.object_id, frozen: arguments.frozen?

foo(*arguments)

I know it's hard to implement this and also retain compatibility, but I feel like the vast majority of allocations done by this style of code are almost always unneeded. We either need some form of escape/mutation analysis, or copy-on-write for Array/Hash (maybe we have it already and I don't know).

Actions #4

Updated by jeremyevans0 (Jeremy Evans) over 3 years ago

  • Tracker changed from Bug to Feature
  • Backport deleted (2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN)
Actions #5

Updated by Eregon (Benoit Daloze) over 3 years ago

  • Description updated (diff)

Updated by Eregon (Benoit Daloze) over 3 years ago

Many things discussed in the description here.

I think it's important to differentiate shallow frozen (Kernel#frozen?) and deep frozen (= immutable), and not try to change their meaning.
So for example overriding freeze to deep freeze does not seem good.

There was a suggestion for deep_freeze in #17145, which IMHO would be a good addition.

Objects created by literal are immutable.

I don't agree, for instance [] and {} should not be frozen, that would just be counter-productive in many cases.

Maybe CONSTANT = value should .deep_freeze the value, this was discussed with Ractor.make_shareable but that was rejected (#17273).

There is also the question of how to mark a class as creating immutable objects.
And potentially still allow to subclass it, and what it should do with initialize_copy, allocate, etc.
That's illustrated with the Immutable above but otherwise not much discussed.
I think that's probably worth its own ticket, because it's a big enough subject of its own, I'll try to make one.

copy-on-write for Array

That's required for efficient Array#shift so you can assume it's there on all major Ruby implementations.

Updated by ioquatix (Samuel Williams) over 3 years ago

Here is a proposed PR which implements the very basics.

https://github.com/ruby/ruby/pull/4879

I'm sure we can expand it to include many of the discussed features (e.g. dup/clone).

Whether we call deep freeze is implementation specific - user can override def freeze to suit the needs of the object.

Updated by Eregon (Benoit Daloze) over 3 years ago

Immutable means deeply frozen to me, not just shallow frozen (which is just Kernel#frozen?).

Updated by Dan0042 (Daniel DeLorme) over 3 years ago

I don't care much for this 'immutable' stuff, but as long as no backward compatibility is introduced (like making literals immutable) then I don't mind.

If immutable means deep frozen, one concern I have is about side effects of immutability. If I do MyImmutableObject.new(array) I might not expect array to suddenly become frozen. Should the definition of immutability include making a deep copy of non-immutable objects?

Updated by maciej.mensfeld (Maciej Mensfeld) about 3 years ago

Should the definition of immutability include making a deep copy of non-immutable objects?

Deep copy or frozen args requirement in the first place.

I think this is a feature that for advanced users can bring a lot of benefits. Many things in dry ecosystem are forfrozen when built (contract classes for example). Could we extend it to include also class and module definitions, so once initialized, cannot be changed?

On one side it would break monkey-patching but at the same time would allow easier (more verbose?) expression of "this is not to be touched" when building libs.

Updated by jeremyevans0 (Jeremy Evans) about 3 years ago

@maciej.mensfeld (Maciej Mensfeld) alluded to this already, but one thing to consider is that no object in Ruby is truly immutable unless all entries in object.singleton_class.ancestors are also frozen/immutable. Additionally, depending on your definition of immutable, you may want all constants referenced by any method defined in any of object.singleton_class.ancestors to also be frozen/immutable.

Updated by tenderlovemaking (Aaron Patterson) about 3 years ago

jeremyevans0 (Jeremy Evans) wrote in #note-11:

@maciej.mensfeld (Maciej Mensfeld) alluded to this already, but one thing to consider is that no object in Ruby is truly immutable unless all entries in object.singleton_class.ancestors are also frozen/immutable.

Are they not? It seems like for Arrays they are (I haven't checked other types), so maybe there's some precedent:

x = [1, 2, 3].freeze

Mod = Module.new { def foo; end }

begin
  x.extend(Mod)
rescue FrozenError
  puts "can't extend"
end

begin
  def x.foo; end
rescue FrozenError
  puts "can't def"
end

begin
  y = x.singleton_class
  def y.foo; end
rescue FrozenError
  puts "can't def singleton"
end

Updated by jeremyevans0 (Jeremy Evans) about 3 years ago

tenderlovemaking (Aaron Patterson) wrote in #note-12:

jeremyevans0 (Jeremy Evans) wrote in #note-11:

@maciej.mensfeld (Maciej Mensfeld) alluded to this already, but one thing to consider is that no object in Ruby is truly immutable unless all entries in object.singleton_class.ancestors are also frozen/immutable.

Are they not? It seems like for Arrays they are (I haven't checked other types), so maybe there's some precedent:

x = [1, 2, 3].freeze

Mod = Module.new { def foo; end }

begin
  x.extend(Mod)
rescue FrozenError
  puts "can't extend"
end

begin
  def x.foo; end
rescue FrozenError
  puts "can't def"
end

begin
  y = x.singleton_class
  def y.foo; end
rescue FrozenError
  puts "can't def singleton"
end

Apologies for not being more clear. Freezing an object freezes the object's singleton class. However, it doesn't freeze the other ancestors in singleton_class.ancestors:

c = Class.new(Array)
a = c.new
a << 1
a.first # => 1
c.define_method(:first){0}
a.first # => 0

As this shows, an instance of a class is not immutable unless its class and all other ancestors of the singleton class are immutable.

Updated by tenderlovemaking (Aaron Patterson) about 3 years ago

jeremyevans0 (Jeremy Evans) wrote in #note-13:

tenderlovemaking (Aaron Patterson) wrote in #note-12:

jeremyevans0 (Jeremy Evans) wrote in #note-11:

@maciej.mensfeld (Maciej Mensfeld) alluded to this already, but one thing to consider is that no object in Ruby is truly immutable unless all entries in object.singleton_class.ancestors are also frozen/immutable.

Are they not? It seems like for Arrays they are (I haven't checked other types), so maybe there's some precedent:

x = [1, 2, 3].freeze

Mod = Module.new { def foo; end }

begin
  x.extend(Mod)
rescue FrozenError
  puts "can't extend"
end

begin
  def x.foo; end
rescue FrozenError
  puts "can't def"
end

begin
  y = x.singleton_class
  def y.foo; end
rescue FrozenError
  puts "can't def singleton"
end

Apologies for not being more clear. Freezing an object freezes the object's singleton class. However, it doesn't freeze the other ancestors in singleton_class.ancestors:

c = Class.new(Array)
a = c.new
a << 1
a.first # => 1
c.define_method(:first){0}
a.first # => 0

As this shows, an instance of a class is not immutable unless its class and all other ancestors of the singleton class are immutable.

Ah right. I think your example is missing a freeze, but I get it. If freezing an instance were to freeze all ancestors of the singleton, wouldn't that extend to Object / BasicObject? I feel like we'd have to stop freezing somewhere because it would be pretty surprising if you can't define a new class or something because someone did [].freeze. Simple statements like FOO = [1].freeze wouldn't work (as Object would get frozen before we could set the constant).

Maybe we could figure out a cheap way to copy things so that a mutation to the Class.new from your example wouldn't impact the instance a.

But regardless it seems like gradual introduction would be less surprising. IOW maybe the goal would be to make all references immutable, but that really isn't practical. Instead expand the frozen horizon as much as we can without breaking existing code?

Updated by ioquatix (Samuel Williams) about 3 years ago

I would like us to define a model for immutability that has real world use cases and applicability - i.e. useful to developers in actual situations rather than theoretically sound and impossible to implement or impossible to use. Not that I'm saying theoretically sound is not important, just that we have to, as was said above, stop somewhere. Since this proposal includes a new module, it's totally optional. But that module is semantically independent from how we actually implement some kind of deep_freeze. The point of the module is to enforce it in a visible way - as in, this class will always be frozen. Such a design can then be used by the interpreter for constant propagation, etc.

Updated by jeremyevans0 (Jeremy Evans) about 3 years ago

tenderlovemaking (Aaron Patterson) wrote in #note-14:

Ah right. I think your example is missing a freeze, but I get it. If freezing an instance were to freeze all ancestors of the singleton, wouldn't that extend to Object / BasicObject? I feel like we'd have to stop freezing somewhere because it would be pretty surprising if you can't define a new class or something because someone did [].freeze. Simple statements like FOO = [1].freeze wouldn't work (as Object would get frozen before we could set the constant).

Correct. Kernel#freeze behavior should not change anyway, it should continue to mean a shallow freeze. This does point out that a #deep_freeze method on an object doesn't result in true immutability. You would have to pick an arbitrary point in the class hierarchy unless you wanted it to freeze all classes. I don't like such an approach.

Maybe we could figure out a cheap way to copy things so that a mutation to the Class.new from your example wouldn't impact the instance a.

Copying method handles into a singleton is one simple idea, but I cannot see how that would work with super, and it would result in a significant performance decrease.

But regardless it seems like gradual introduction would be less surprising. IOW maybe the goal would be to make all references immutable, but that really isn't practical. Instead expand the frozen horizon as much as we can without breaking existing code?

I don't think we should change the semantics of Kernel#freeze. In regards to an Immutable module, I'm neither opposed to it nor in favor of it, but we should recorgnize that it would not be able to offer true immutability.

ioquatix (Samuel Williams) wrote in #note-15:

I would like us to define a model for immutability that has real world use cases and applicability - i.e. useful to developers in actual situations rather than theoretically sound and impossible to implement or impossible to use. Not that I'm saying theoretically sound is not important, just that we have to, as was said above, stop somewhere. Since this proposal includes a new module, it's totally optional. But that module is semantically independent from how we actually implement some kind of deep_freeze. The point of the module is to enforce it in a visible way - as in, this class will always be frozen. Such a design can then be used by the interpreter for constant propagation, etc.

I don't think Ruby necessarily has an immutability problem currently. You can override #freeze as needed in your classes to implement whatever frozen support you want, and freeze objects inside #initialize to have all instances be frozen (modulo directly calling allocate).

I have a lot of experience developing libraries that are designed to be frozen after application initialization. Both Sequel and Roda use this approach, either by default or as an option, and freezing results in significant performance improvements in both. I don't believe Ruby's current support for freezing objects is lacking, but I recognize that it could be made easier for users.

If you want to freeze the entire core class hierarchy, you can, and if you do it correctly, nothing breaks. I know this from experience as I run my production web applications with this approach (using https://github.com/jeremyevans/ruby-refrigerator). However, with this approach, the class/module freezing is not implicit due to instance freezing, it's explicit after an application is fully initialized, before accepting requests. The reason to do this is to ensure that nothing in your application is modifying the core classes at runtime.

Updated by ioquatix (Samuel Williams) about 3 years ago

Maybe we can collect use cases where such an approach makes sense.

@ko1 (Koichi Sasada) changed Process::Status to be frozen by default. What is the logic? What is the problem we are trying to solve by doing this? Is it to make things sharable by Ractor?

@Eregon (Benoit Daloze) asserted that we should make as many of the core classes frozen by default. What's the advantage of this?

@jeremyevans0 (Jeremy Evans) your general model makes sense to me and I admire your approach to freezing the runtime. Can you explain where the performance advantages come from? Also:

and freeze objects inside #initialize to have all instances be frozen

Doesn't this break sub-classes that perform mutable initialisation?

Updated by jeremyevans0 (Jeremy Evans) about 3 years ago

ioquatix (Samuel Williams) wrote in #note-17:

@jeremyevans0 (Jeremy Evans) your general model makes sense to me and I admire your approach to freezing the runtime. Can you explain where the performance advantages come from? Also:

Performance advantages come from two basic ideas:

  1. If objects are frozen, it opens up additional opportunities for caching them. With mutable objects, caching is very tricky. You can obviously clear caches when you detect mutation of the current object, but if you can mutate the objects held in the cache, then cache invalidation is very hard to get right. Purely immutable objects don't support internal caching, since an immutable cache is worthless, so this approach relies on let's say mostly immutable objects. Sequel datasets use this approach. They are always frozen, and take heavy advantage of caching to reduce the amount of work on repetitive calls. This is what allows you to have call chains like ModelClass.active.recent.first that do almost no allocation in subsequent calls in Sequel, as both the intermediate datasets and the SQL to use for the call are cached after the first call.

  2. When freezing a class, you can check if the default implementation of methods has not been modified by checking the owner of the method. If the default implementation of methods has not been modified, you can inline optimized versions for significantly improved performance. Roda uses this approach to improve routing and other aspects of its performance.

and freeze objects inside #initialize to have all instances be frozen

Doesn't this break sub-classes that perform mutable initialisation?

It doesn't break subclass initialization, as long as the subclass mutates the object in #initialize before calling super instead of after. Alternatively, you can have freeze in the subclass check for partial initialization, and finish initialization before calling super in that case.

Updated by ioquatix (Samuel Williams) about 3 years ago

I'm happy with the current PR which invokes #freeze after calling #new.

If we can't guarantee full immutability, is this sufficient? Independently we could look at providing #deep_freeze which I think classes would opt into as in def freeze; deep_freeze; end.

If we have a problem with the terminology, what about calling the module Frozen rather than Immutable? However, I think Immutable sends a clearer message about the intention.

Can we make this compatible with Ractor.make_shareable? I think that's a valid use case. As in, I think we should have an interface for immutability which does not depend on/is compatible with Ractor.

Updated by ioquatix (Samuel Williams) about 3 years ago

Looking at Ractor.make_shareable, wouldn't this implementation be a candidate for a potential #deep_freeze implementation? It seems really okay to me on the surface:

irb(main):005:0> Ractor.make_shareable(x)
=> [[1, 2, 3], [2, 3, 4]]
irb(main):006:0> x.frozen?
=> true
irb(main):007:0> x[0].frozen?
=> true

Updated by Eregon (Benoit Daloze) about 3 years ago

I think nobody expects #freeze or #deep_freeze to ever freeze (non-singleton) classes/modules, so IMHO these methods simply not attempt that (except SomeModule.freeze of course).
It's the difference between state (ivars, and values of these ivars) and behavior (methods of a class).

ioquatix (Samuel Williams) wrote in #note-17:

@Eregon (Benoit Daloze) asserted that we should make as many of the core classes frozen by default. What's the advantage of this?

I made an extensive list of immutable classes in core here, as well as the many advantages:
https://gist.github.com/eregon/bce555fa9e9133ed27fbfc1deb181573

I'll copy the advantages here:
Advantages:

  • No need to worry about .allocate-d but not #initialize-d objects => not need to check in every method if the object is #initialize-d
  • internal state/fields can be truly final/const.
  • simpler and faster 1-step allocation since there is no dynamic call to #initialize (instead of .new calls alloc_func and #initialize)
  • Known immutable by construction, no need for extra checks, no need to iterate instance variables since no instance variables
  • Potentially lower footprint due to no instance variables
  • Can be shared between Ractors freely and with no cost
  • Can be shared between different Ruby execution contexts in the same process and even in persisted JIT'd code
  • Easier to reason about both for implementers and users since there is no state
  • Can be freely cached as it will never change

There is a sub-category of classes with .allocate undefined or allocator undefined, and noop initialize, those only have the first 3 advantages, but still better than nothing.

The first advantage is I think quite important as it avoids needing to care about initialized checks for things like klass.allocate.some_method.

IMHO the most valuable advantages of immutable classes are that they are easier to reason about, but also they can be shared between Ractors, execution contexts in the same process (like V8 isolated contexts, I think JRuby also has those, it improves footprint and can improve warmup by JIT'ing once per process and not per context), and also in persisted JIT'd code.
Persisted JIT'd code is a feature being developed in TruffleRuby and it enables to save the JIT'ed code of a process and reuse it for the next processes.
For classes which have a literal notation, it's quite important they are immutable, otherwise one would need to reallocate one instance per execution context which feels clearly inefficient.

Given the many advantages, I think we should make more core classes immutable or classes with .allocate undefined or allocator undefined, and noop initialize, as much as possible.

To be shareable between execution contexts and persisted JIT'd code they need to have a well known class.
Subclassing is therefore not possible since Ruby classes are stateful.
Anyway it is highly discouraged to subclass core classes so I think that is not much of an issue.


For Process::Status, it's already in the immutable core classes, let's keep it that way.
I don't think making it subclassable is useful.
The way to create an instance for Ruby could be Process::Status.new(*args) and we override that .new to already freeze, or something like Process::Status(*args) or Process.status(*args).


Regarding making user classes immutable, I think one missing piece is this hardcoded list of immutable classes in Kernel#dup and Kernel#clone.
Overriding #dup and #clone in the user class works around it, but then it doesn't work for Kernel.instance_method(:clone).bind_call(obj) as that will actually return a mutable copy!
It's then possible to e.g. call initialize on that mutable copy to mutate it, which breaks the assumption of the author of the class.

So I think we need a way for a user class to define itself as immutable (extend Immutable is one way, could also be by defining MyClass.immutable?), and for Kernel#dup and Kernel#clone to then use that to just return self.
If a class is marked as immutable it should be guaranteed to be deeply frozen (otherwise it's incorrect to return self for dup/clone), so we should actually deep-freeze after the custom #freeze is called from .new:

def ImmutableClass.new(*args)
  obj = super(*args)
  obj.freeze
  Primitive.deep_freeze(obj) # not a call, some known function of the VM
end

That way we can know this object is truly immutable from the runtime point of view as well and e.g., can be passed to another Ractor.
Primitive.deep_freeze(obj) would set a flag on the object so it's fast to check if the object is immutable later on.

Updated by Eregon (Benoit Daloze) about 3 years ago

I forgot to mention, it's also much better if all instances of a class (and potential subclasses) are immutable, if only part of the instances it's quickly confusing and most of the advantages disappear as the class is no longer truly immutable.
This is currently the case for Range and Regexp, I think and we should solve that by making all Range&Regexp instances frozen not just literals.
For String we probably need to keep it supporting both mutable and immutable for compatibility.

Updated by ko1 (Koichi Sasada) about 3 years ago

ioquatix (Samuel Williams) wrote in #note-17:

@ko1 (Koichi Sasada) changed Process::Status to be frozen by default. What is the logic? What is the problem we are trying to solve by doing this? Is it to make things sharable by Ractor?

Yes.

Updated by Eregon (Benoit Daloze) about 3 years ago

Some notes from the discussion on Slack:

In general shareable (from Ractor.make_shareable) != immutable, notably for special shareable objects like Ractor, and potentially Ractor::Array (which would share any element) or so. So the deep_frozen/immutable flag should be something different.
For the special case of Module/Class I believe nobody wants them frozen for obj.deep_freeze or so, so that's a case of shareable and "deep freezing/immutable" agree, but not for other special-shareable objects.
Immutable does imply shareable, but not the other way around.

Re half-frozen if some override of freeze raises an exception, that's fine, the bug should be fixed in the bad freeze, it was already discussed extensively when adding Ractor.make_shareable.
The exception will prevent freeze/deep_freeze/Immutable#new to return which is good enough to indicate the problem.

Implementation-wise, I guess we can reuse quite a bit from Ractor.make_shareable, but it needs to be a different flag, and it needs to freeze more than Ractor.make_shareable (absolutely everything except Module/Class).

Concrete example of shareable but not immutable:

$ ruby -e 'module A; end; module B; C=A; Ractor.make_shareable(self); freeze; end; p Ractor.shareable?(B); p B.frozen?; p A.frozen?; p B.immutable?'
true
true
false
false

Updated by ioquatix (Samuel Williams) about 3 years ago

After discussing it, we found some good ideas regarding immutability in Ruby.

Firstly, we need some definition for immutability. There appear to be two options.

(1) We can assume that RB_FL_FREEZE | RB_FL_SHAREABLE implies immutable objects. RB_FL_FREEZE implies the current object is frozen and RB_FL_SHAREABLE implies it was applied recursively (since only a completely frozen object-graph is considered shareable). The nice thing about this definition is that it works with existing objects that are frozen and shareable, including any sub-object within that.

(2) We can introduce a new flag, e.g. RB_FL_IMMUTABLE which implies immutable objects. RB_FL_IMMUTABLE would need to be applied and cached, and even if we introduced such a flag, unless care was taken, it would not be compatible with the frozen/shareable objects implicitly.

The biggest issue with (1) is that rb_ractor_shareable_p can update the RB_FL_SHAREABLE flag even though it presents the interface of being a predicate check (https://bugs.ruby-lang.org/issues/18258). We need to consider this carefully.

Regarding the interface changes, I see two important ones:

(1) Introduce module Immutable which is used by class to create immutable instances. Implementation is given already.

(2) Introduce def Immutable(object) which returns an immutable object (as yet not discussed).

Let's aim to implement (1) as discussed in this proposal, and we can discuss (2) later.

Regarding (1), there are only a few issues which need to be addressed, notably how #dup and #clone behave. For any immutable object, we need to be concerned about whether #dup and #clone can return mutable versions or not. We can check the current behaviour by assuming Ractor.make_shareable(T_OBJECT) returns immutable objects.

(1) Implementation of #dup. Traditional implementation returns mutable copy for normal object, and self for "immutable" objects. Overall, there are three possible implementations:

# (1)
immutable.dup()               # returns frozen, but not sharable (current)

# (2)
immutable.dup()               # raises an exception

# (3)
immutable.dup()               # returns self

Right now, we have some precedent (implemented in object.c#special_object_p), e.g.

5.dup # still immutable, return self.
:footer.dup # still immutable, return self.

Therefore, for immutable objects (3) makes the most sense (also most efficient option).

(2) Implementation of #clone(freeze: true/false). It's similar to #dup but argument must be considered:

# (1)
immutable.clone()               # returns frozen, but not sharable (current)
immutable.clone(freeze: false) # returns unfrozen, not sharable (current)

# (2)
immutable.clone()               # returns frozen, and sharable
immutable.clone(freeze: false) # raises an exception

# (3)
immutable.clone()               # returns self
immutable.clone(freeze: false) # raises an exception

There is a precedent for this too:

irb(main):001:0> :foo.clone(freeze: true)
=> :foo
irb(main):002:0> :foo.clone(freeze: false)
<internal:kernel>:48:in `clone': can't unfreeze Symbol (ArgumentError)

I believe (3) is the right approach because once a user object is frozen, un-freezing the top level is unlikely to create a working object, since all the internal state would still be frozen.

Updated by Eregon (Benoit Daloze) about 3 years ago

The above sounds good.
Note that RB_FL_FREEZE | RB_FL_SHAREABLE is not enough, it should be a new flag, I detailed in https://bugs.ruby-lang.org/issues/18035#note-24.

Re (2) I think .deep_freeze would be a good option too. Immutable(obj) looks to me like it might create a wrapper or so.

Re clone/dup I agree, just same behavior as Symbol and Integer is good.
It's pointless to have a mutable copy for immutable objects, and immutable classes should have all instances immutable, always.

Updated by Eregon (Benoit Daloze) about 3 years ago

Use cases:

  • Actually have a way to express immutable/deeply frozen in Ruby. Right now there is no way, and freeze is only shallow.
  • Immutability has many benefits, easier to reason about, no data races possible, clarifies a lot the behavior of a class, possible to cache safely based on immutable keys, etc.
  • Right now immutability is only for core classes, but of course it is useful for user code too.

Many proposals for deep_freeze were already created, showing people want it:

Updated by Eregon (Benoit Daloze) about 3 years ago

Re naming, I discussed with @ioquatix (Samuel Williams) and I think this is consistent and makes sense:

class MyClass
  extend Immutable
end

MyClass.new.immutable? # true

MY_CONSTANT = Immutable [1, 2, [3]]

MY_CONSTANT.immutable? # true

And for the pragma:

# immutable_constants: true
or
# immutable_constant_value: true

(similar to # shareable_constant_value: but easier to type and to understand the semantics)

I like .deep_freeze but it would be less consistent and it could cause more issues if ice_nine is require-d.
obj.immutable doesn't seem clear, and it could be useful to make non-Kernel objects also immutable, hence Immutable(obj).

Updated by ioquatix (Samuel Williams) about 3 years ago

I tried a couple of times but I could not figure out how to allocate more flags within RBasic and enum in C99 has problems with more than 32-bit constants on some platforms.

So, I'd like to share some ideas.

Can we split flags into two parts:

struct RBasic {
  union {
    VALUE flags;
    struct {
      uint32_t object_flags; // well defined object flags for object state.
      uint32_t internal_flags; // internal flags for specific object implementation (currently "user" flags).
    };
  }
};

I don't know if this can work but maybe it's one idea to make the usage of flags more explicit. I suggest we move "user flags" to 2nd 32-bit "internal flags". We shouldn't bother exposing any constants for this, it's reserved depending on T_... type.

@nobu (Nobuyoshi Nakada) do you think this is something worth exploring? cc @shyouhei (Shyouhei Urabe) do yo have any ideas about what we can do to make this better?

Updated by nobu (Nobuyoshi Nakada) about 3 years ago

What to do on 32-bit platforms?
And you seem to assume little endian.

Updated by ioquatix (Samuel Williams) about 3 years ago

What to do on 32-bit platforms?

I wondered about this. If we don't have 64-bits in general, the only way to implement this is to either:

  1. Reuse existing flag if possible.
  2. Reduce number of user flags.
  3. Only allow some kind of objects to be immutable, e.g. T_OBJECT and find unused user flag.

And you seem to assume little endian.

It's just PoC not fully working code :)

Updated by Dan0042 (Daniel DeLorme) about 3 years ago

ioquatix (Samuel Williams) wrote in #note-29:

So, I'd like to share some ideas.

Can we split flags into two parts

I've also thought about this in the past so I'd like to share my own observation; it seems like RHash uses a lot of FL_USER* bits. Like 4 bits each for RHASH_AR_TABLE_SIZE_MASK and RHASH_AR_TABLE_BOUND_MASK. If these could be moved to a separate field in the RHash struct, that would leave a lot of freedom to add flags within 32 bits.

Updated by Dan0042 (Daniel DeLorme) about 3 years ago

Question for the immutability-minded folks here. What would you think of an Immutable refinement like:

module Immutable
  refine String do
    def +(other)
      super.freeze
    end
    def <<(other)
      raise ImmutableError
    end
  end
  refine Array.singleton_class do
    def new(...)
      super(...).freeze
    end
  end
end

using Immutable

"a" + "b"      #=> frozen string
"a".dup << "b" #=> ImmutableError; String#<< forbidden
Array.new      #=> frozen array
[1]            #=> frozen array? would require some work to make this possible

Updated by Eregon (Benoit Daloze) about 3 years ago

Re flags, can we just remove the highest FL_USER flags and use it to mark immutable instead?
Those FL_USER flags are actually not meant to be used by gems BTW (#18059).

Updated by Eregon (Benoit Daloze) about 3 years ago

@Dan0042 (Daniel DeLorme) I think you can experiment with that in a gem if you'd like.
I wouldn't call the module Immutable to avoid mixing the two concepts.

I think concretely it would be hard and not necessarily useful to use that refinement for existing code.
I certainly don't want all Ruby core objects to be always immutable, mutability can very useful and far more expressive and efficient than immutable in some cases (e.g., Array vs cons list).

Actions #36

Updated by ioquatix (Samuel Williams) 12 months ago

  • Subject changed from Introduce general model/semantic for immutable by default. to Introduce general model/semantic for immutability.
Actions #37

Updated by ioquatix (Samuel Williams) 12 months ago

  • Description updated (diff)

Updated by ioquatix (Samuel Williams) 12 months ago

There is a PR here that mostly just works: https://github.com/ruby/ruby/pull/4879

However, additional work would be required to:

  1. Tidy up the implementation and
  2. Mark some core classes as Immutable as outlined in https://gist.github.com/eregon/bce555fa9e9133ed27fbfc1deb181573

Updated by matheusrich (Matheus Richard) 12 months ago

I'm not sure this was already mentioned, but this feels related to 3.2's Data class.

If this is accepted, should Data classes all be immutable? To quote the docs:

Data provides no member writers, or enumerators: it is meant to be a storage for immutable atomic values. But note that if some of data members is of a mutable class, Data does no additional immutability enforcement:

Event = Data.define(:time, :weekdays)
event = Event.new('18:00', %w[Tue Wed Fri])
#=> #<data Event time="18:00", weekdays=["Tue", "Wed", "Fri"]>

# There is no #time= or #weekdays= accessors, but changes are
# still possible:
event.weekdays << 'Sat'
event
#=> #<data Event time="18:00", weekdays=["Tue", "Wed", "Fri", "Sat"]>

Updated by Dan0042 (Daniel DeLorme) 12 months ago

I understand the advantages of immutability, but I worry about the overall direction this is taking us. So I'll just quote something @mame (Yusuke Endoh) wrote previously, as it reflects my own opinion.

mame (Yusuke Endoh) wrote in #16153#note-6:

I'm against making Ruby immutable by default. One of the most important properties of Ruby is dynamics. Ruby has allowed users to change almost anything in run time: dynamically (re)defining classes and methods (including core builtin ones), manipulating instance variables and local variables (via Binding) through encapsulation, etc. These features are not recommended to abuse, but they actually bring big flexibility: monkey-patching, DRY, flexible operation, etc. However, blindly freezing objects may spoil this usefulness.

It is a bit arguable if this flexibility limits performance. Some people say that it is possible to make Ruby fast with the flexibility kept (TruffleRuby proves it). That being said, I admit that the flexibility actually limits performance in the current MRI, and that we have no development resource to improve MRI so much in near future. I think that my proposal #11934 was one possible way to balance the flexibility and performance. Anyway, we need to estimate how much Ruby can be faster if the flexibility is dropped. If it becomes 10 times faster, it is worth considering of course. If it becomes 10 percent faster, it does not pay (in my opinion).

Updated by ioquatix (Samuel Williams) 11 months ago

It's not just about performance, it's about providing strong interface guarantees and making sure users don't violate those guarantees accidentally. e.g.

THINGS = [1, 2, 3]

def things
  return THINGS
end

# user code:
my_things = things
my_things << 4 # whoops - modified constant.

We saw this exact type of bug In Rack middleware, e.g.

NOT_FOUND = [404, {}, ["Not Found"]]

# config.ru

use Rack::ContentLength # This will mutate the constant.

run do 
  return NOT_FOUND
end

We "fixed" this in the Rack 3 spec, by making it clear the response could be mutated. Having a model for immutability would better enforce some of these expectations (avoid bugs) and potentially allow us to do things more efficiently, e.g. detect immutable responses and dup if needed.

Updated by ioquatix (Samuel Williams) 11 months ago

If this is accepted, should Data classes all be immutable? To quote the docs:

By default, it does not look like Data is considered immutable as nested values can mutate.

However, this proposal would allow for immutable Data instances, e.g.

irb(main):001> Person = Data.define(:name, :age, :hobbies)
=> Person
irb(main):002> bob = Person.new("Bob", 20, ["Ruby"])
=> #<data Person name="Bob", age=20, hobbies=["Ruby"]>
irb(main):003> bob.hobbies << "Fishing"
=> ["Ruby", "Fishing"]
irb(main):004> bob
=> #<data Person name="Bob", age=20, hobbies=["Ruby", "Fishing"]>
irb(main):005> Immutable bob
=> #<data Person name="Bob", age=20, hobbies=["Ruby", "Fishing"]>
irb(main):006> bob.hobbies << "Biking"
(irb):6:in `<main>': can't modify frozen Array: ["Ruby", "Fishing"] (FrozenError)
	from <internal:kernel>:187:in `loop'
	from -e:1:in `<main>'

Updated by ioquatix (Samuel Williams) 10 months ago

I encountered this issue recently:

https://bugs.ruby-lang.org/issues/17159

I think immutability would cover these types of problems too. I think it would be worthwhile discussing the merits of shareability vs immutability as a baseline model that we expect people to build on.

Actions

Also available in: Atom PDF

Like1
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0