Feature #8909
closedExpand "f" frozen suffix to literal arrays and hashes
Added by headius (Charles Nutter) about 11 years ago. Updated almost 11 years ago.
Description
The "f" suffix to declare a frozen string was recently accepted into 2.1, and I think it's a spectacular addition. I would very much like to see it work for literal arrays and hashes too:
[1, 2, 3, 4, 5]f
{foo: 1, bar: 2, baz: 3}f
There are many, many cases where this could reduce allocation (frozen array with literal elements would only need to be allocated once) and improve thread-safety (explicitly create frozen arrays and hashes when creating structures that might be used across threads).
Is there any reason why we could not do this? I believe both of the above syntaxes would be invalid today, as was the case with the String "f" suffix, and hopefully that means the work to add this syntax would be similar.
Updated by headius (Charles Nutter) about 11 years ago
A couple questions:
- If you have literal arrays/strings/hashes within a frozen literal array/hash, should the "f" suffix do a "deep freeze" or do you have to specify each one?
in other words, if I want this whole thing to be frozen, do I need to do:
[[1, 2]f, {foo: 3}f, "blah"f]f
or
[[1, 2], {foo: 3}, "blah"]f
I can see pros and cons both ways.
-
Is there any problem with "f" suffixes always returning the same object when the contents are the same and also immutable? I have not looked at the implementation in MRI for "foo"f, but it should be totally ok for it to always return the same object. [:foo, 1, 2.0]f should be the same thing.
-
Special syntax for frozen empty array
A great number of wasted Ruby objects are caused by needing to simply pass or return an empty array. Because Arrays are mutable by default, [] always has to return a new object. The same goes for hashes.
For example, in method definitions:
def foo(opts = {}) # new hash created every time and probably doesn't need to be mutable
def bar(args = []) # new array created every time and probably doesn't need to be mutable
So with this proposal, we'd have []f and {}f, which aren't too bad...but I wanted to throw out the possibility of some other magic syntax. I have no strong preference or ideas.
Updated by Anonymous about 11 years ago
- Is there any problem with "f" suffixes always returning the same object when the contents are the same and also immutable?
CRuby already does this for f-suffixed strings. See r42843 and r42847.
Updated by headius (Charles Nutter) about 11 years ago
charliesome (Charlie Somerville) wrote:
CRuby already does this for f-suffixed strings. See r42843 and r42847.
Excellent...as I hoped.
Updated by matz (Yukihiro Matsumoto) about 11 years ago
"string" is a literal, but {a: 1} and [1, 2] are expressions.
Thus we have to define {}f and []f to be shallow freezing or deep freezing.
If it's deep freezing, freezing objects (not literals) could be troublesome.
We have to define a concrete behavior first to discuss.
Matz.
Updated by Hanmac (Hans Mackowiak) about 11 years ago
what about []f for shallow-freezing and []df for deep freezing? ;D
Updated by matz (Yukihiro Matsumoto) about 11 years ago
@Hanmac (Hans Mackowiak), Currently Ruby does not have object traversal API (except for marshalling and GC, both are not disclosed to Ruby level).
Deep freeze is far more difficult than you might expect.
Matz.
Updated by matz (Yukihiro Matsumoto) about 11 years ago
Options I can provide are:
- shallow freeze
- shallow freeze, and raises error if elements are not literals
- deep freeze, and raises error if elements are static freezable (i.e. numbers, strings, hashes and arrays)
Pick one, or propose alternative, if you want frozen suffix for hashes and arrays.
It's kind of trade off between complexity and usability.
Matz.
Updated by headius (Charles Nutter) about 11 years ago
@matz (Yukihiro Matsumoto) At this point I'm leaning toward just doing a simple shallow freeze. It would be up to the user to put things into the array that are themselves literals or frozen (or perhaps the user doesn't necessarily want those contents to be frozen). I'm seeing this as mostly a quick way to create a frozen array so a consumer (API, library, etc) won't be changing the array contents. It's your responsibility to make the array's contents' contents are also safe, if you want them to be.
I know folks may cry foul because they have always wanted deep freezing, but there is currently no deep freeze in Ruby. Shoehorning it into []f and {}f seems like it would be putting the cart before the horse.
Updated by phluid61 (Matthew Kerwin) about 11 years ago
+1 to shallow freeze
Updated by headius (Charles Nutter) about 11 years ago
FWIW, I would still optimize frozen-arrays-of-literals to always return the same object, and frozen empty arrays and hashes would be the same object everywhere, but the expression nature of the array elements means this would be an unspecified characteristic of the feature.
Updated by matz (Yukihiro Matsumoto) about 11 years ago
- Status changed from Open to Feedback
@hedius I don't think same object optimization coexists with mere shallow freezing.
If we restrict elements in static frozen containers to be static freezable, it would be possible.
But it makes implementation more complex. It's trade off.
Matz.
Updated by headius (Charles Nutter) about 11 years ago
@matz (Yukihiro Matsumoto) What I mean is that all of the following could safely be optimized to return the same object every time:
[]f
{}f
[:foo, 1, 1.5, true, nil]
I cannot decide whether restricting the elements to be literals and/or statically frozen is a good idea or not. For example, something like this seems like it should be acceptable:
{cache: ThreadSafe::Hash.new}
It wouldn't be able to optimize to be the same object every time, but it would still be a frozen Hash and nobody downstream could change the keys/values it stores.
So I guess the question is whether shallow freeze with potentially mutable elements is good or not.
Pros and cons:
- Shallow freeze, no restrictions
- simplest to implement
- most flexible
- no side effects
- mutable elements are still mutable
- user may expect elements to freeze too
- Shallow freeze, only frozen or literals as elements
- prevents unintentional exposure of mutable elements in frozen collection
- no side effects due to deep freezing
- less flexible
- user may want elements to be unfrozen
- more complicated to implement
- frozen elements may have unfrozen elements
- Deep freeze
- Guarantees everything inside frozen collection is deep-frozen
- May reflect frozen modifier more clearly
- Less typing to freeze all elements
- Most complicated and no existing functionality
- Array/Hash creation would deep freeze objects as side effect
- May have unintended consequences
I'm still leaning toward #1.
Updated by headius (Charles Nutter) about 11 years ago
We can probably agree that the deep freezing version is wrong, since it could have the side effect of freezing objects far away from the array/hash. There are no similar side effects for creating an array/hash today.
Updated by nobu (Nobuyoshi Nakada) about 11 years ago
(13/09/16 3:29), headius (Charles Nutter) wrote:
@matz (Yukihiro Matsumoto) What I mean is that all of the following could safely be optimized to return the same object every time:
[]f
{}f
[:foo, 1, 1.5, true, nil]
Note that we have to freeze true, false, and nil too, for the last
expression.
Updated by headius (Charles Nutter) about 11 years ago
On Mon, Sep 16, 2013 at 3:53 AM, Nobuyoshi Nakada nobu@ruby-lang.org wrote:
(13/09/16 3:29), headius (Charles Nutter) wrote:
@matz (Yukihiro Matsumoto) What I mean is that all of the following could safely be optimized to return the same object every time:
[]f
{}f
[:foo, 1, 1.5, true, nil]Note that we have to freeze true, false, and nil too, for the last
expression.
true, false, and nil exist as unique objects, so whether they're
frozen or not does not matter here. Two arrays that contain the same
objects and which cannot be modified are equivalent.
That said, I have and will continue to support all literal,
naturally-immutable values being frozen by default (Symbol, Fixnum,
Float, Bignum, true, false, nil) and values that should be immutable
being frozen as well (Time, Rational, Complex, BigDecimal).
- Charlie
Updated by headius (Charles Nutter) about 11 years ago
nobu (Nobuyoshi Nakada) wrote:
(13/09/16 3:29), headius (Charles Nutter) wrote:
@matz (Yukihiro Matsumoto) What I mean is that all of the following could safely be optimized to return the same object every time:
[]f
{}f
[:foo, 1, 1.5, true, nil]Note that we have to freeze true, false, and nil too, for the last
expression.
Updated by headius (Charles Nutter) about 11 years ago
Any other thoughts on this? Any other reasons why it shouldn't be done?
There are other advantages to having literal frozen array and hash:
- Alternative representation of small hashes knowing they won't need to change. Good for smaller footprint on hash args, for example.
- Small arrays could pack elements into the header (this may be happening already, but there would be less need to recover as it grows).
And the already-discussed advantages:
- []f and {}f would be the same object everywhere, making for cheap argument defaults.
- Assurance that your array or hash won't have different elements after passing to another piece of code.
- Returning the same object when all elements are literals
Updated by bascule (Tony Arcieri) about 11 years ago
I think there's a big opportunity here for immutable collection types... at least the sort that you know at the time you declared it is immutable.
I would hate to see a syntax like this that doesn't guarantee a deeply frozen data structure, since immutable data is quite difficult to use in Ruby at present, and really I've never found frozen data structures to be particularly useful since you can never be sure if they're immutable all the way down. Only then are you able to create immutable persistent data structures ala Ruby libraries like Hamster.
I think there's a way to ensure this is the case without requiring an object traversal system. Let's go back to what Charlie was saying:
if I want this whole thing to be frozen, do I need to do:
[[1, 2]f, {foo: 3}f, "blah"f]f
What if this were the only option, i.e. if we tried to do the other thing Charlie asked about:
[[1, 2], {foo: 3}, "blah"]f
...it raised an exception, because you asked it to make an immutable array that contained mutable contents?
This check could even be shallow and still be useful, IMO, provided we were vigilant about how things got frozen in the first place. However...
Is there any problem with "f" suffixes always returning the same object when the contents are the same and also immutable?
For this to work the VM would need to be in some sense aware that it's frozen objects all the way down. Isn't this state that the VM can track? What if you were simply "unallowed" to make deep frozen objects with references to other objects that aren't deep frozen?
Updated by headius (Charles Nutter) about 11 years ago
- Target version set to 2.1.0
Updated by naruse (Yui NARUSE) about 11 years ago
- Target version changed from 2.1.0 to 2.6
Updated by headius (Charles Nutter) about 11 years ago
Has this already been excluded from 2.1.0? May I ask why? We have not finished discussing it and most folks on this issue believe it would be a good feature to have.
After hearing Tony's case about the value of []f and {}f being more useful if they checked for frozen elements, I'm coming around to that idea. So in both cases, the guarantee would be that the object you get back (the Array or Hash) is frozen and the elements it contains are #frozen? (which may or may not say anything about the data they contain). If any elements are not frozen (or not literals, though I believe all literals will be frozen in 2.1), you'd get an error... "cannot create frozen {Array,Hash} with unfrozen elements".
I would like to understand the justification for moving this to "next minor" without more discussion.
Updated by headius (Charles Nutter) about 11 years ago
Also, FWIW, ko1 told me to mark the bugs I was interested in as 2.1, which is why I set this bug for 2.1.
Updated by naruse (Yui NARUSE) about 11 years ago
headius (Charles Nutter) wrote:
Also, FWIW, ko1 told me to mark the bugs I was interested in as 2.1, which is why I set this bug for 2.1.
Therefore we discussed about this on RubyDeveloperMeeting20131001.
Has this already been excluded from 2.1.0? May I ask why? We have not finished discussing it and most folks on this issue believe it would be a good feature to have.
Through the meeting, I thought it is hard to get consensus about this spec, for example shallow or deep,
literal item or variable or methods.
After hearing Tony's case about the value of []f and {}f being more useful if they checked for frozen elements, I'm coming around to that idea. So in both cases, the guarantee would be that the object you get back (the Array or Hash) is frozen and the elements it contains are #frozen? (which may or may not say anything about the data they contain). If any elements are not frozen (or not literals, though I believe all literals will be frozen in 2.1), you'd get an error... "cannot create frozen {Array,Hash} with unfrozen elements".
I would like to understand the justification for moving this to "next minor" without more discussion.
If a feature needs more discussion, it will drop from 2.1.0; this is the decision of I, release manager.
Anyway this feature doesn't affect ruby's stability at the implementation view.
Therefore it can be accepted to 2.1 feature if someone works hard to make the spec and get consensus.
The deadline is the end of October.
Updated by headius (Charles Nutter) about 11 years ago
naruse (Yui NARUSE) wrote:
headius (Charles Nutter) wrote:
Also, FWIW, ko1 told me to mark the bugs I was interested in as 2.1, which is why I set this bug for 2.1.
Therefore we discussed about this on RubyDeveloperMeeting20131001.
It is unfortunate these meetings cannot be attended by other implementers. Was this only for Japanese-speaking contributors?
If a feature needs more discussion, it will drop from 2.1.0; this is the decision of I, release manager.
...
Anyway this feature doesn't affect ruby's stability at the implementation view.
Therefore it can be accepted to 2.1 feature if someone works hard to make the spec and get consensus.
The deadline is the end of October.
Ok, I will try to get consensus on this feature by end of October.
Updated by headius (Charles Nutter) about 11 years ago
I have started a wiki page for the proposal: https://bugs.ruby-lang.org/projects/ruby-trunk/wiki/Frozen_Array_and_Hash_literals_proposal
Please comment here or make comments as edits there.
Updated by enebo (Thomas Enebo) about 11 years ago
My take on this proposal is that deep and shallow freezing is not as important as knowing that the 'f' (which I think should mean fixed and not frozen) is that once the array literal evaluates it will not be possible to change its size. What is inside can change but that is ok. At a grammar level we know we can alloc an array of n elements once and never need to worry about allocating it again.
On top of that if the literal only contains immediate values or frozen/fixed literals then we can do a bunch of other optimizations. Like never populate the array more than once. Or even constant propagate elements to where they are used if we know [] is not overriden. In my mind, the 'f' gives us a whole dimensions of things which we can do internally to improve performance.
For a Ruby programmer it means they can never change the size of the data structure (or in case of a hash never change the keys anymore -- values can change). I don't want to hijack an enhancement but let's do this :)
Updated by enebo (Thomas Enebo) about 11 years ago
Ok I have been talked out of the allowing changing contents of the literal. Without guaranteeing the values never change then it is very limited in when you can do most of the "good" optimizations. They can be done but generally only in the case where you are doing a pure-literal sort of code. Once you assign the literal to a variable all bets are off (unless you can constant propagate it). I rescind my idea :)
I need to think more about frozen literals and come back to this issue :(
Updated by headius (Charles Nutter) about 11 years ago
naruse: Within what group do I need to get consensus? All ruby-core committers?
Updated by headius (Charles Nutter) about 11 years ago
=begin
Note that if #8992 is accepted, the same optimization could apply to arrays and hashes. In other words:
[].freeze
{}.freeze
[:foo, 1, true].freeze
{:foo => 1}.freeze
could all be optimized to return the same object everywhere, every time. For arrays and hashes with non-literal elements, they would not enforce elements being frozen...but neither does .freeze.
Updated by headius (Charles Nutter) about 11 years ago
FWIW, I added #9043 that proposes an #f method added to String that would be a shortcut for #freeze. That might make this optimization harder, since we don't necessarily want to add #f at a global level.
Updated by headius (Charles Nutter) about 11 years ago
- Assignee set to matz (Yukihiro Matsumoto)
- Target version changed from 2.6 to 2.1.0
#8992 landed changes to optimize String#freeze, and #9043 deals with the possibility of adding a shortcut String#f method.
I believe this issue will now decide:
- Should the compiler optimize some form (or all forms) of literal Array and Hash immediately followed by #frozen?
- What forms should lead to optimization?
- Should #f method be added to Array and Hash as well?
I think we need matz to weigh in here. My opinion:
- Compilers should be allowed to optimize .freeze to return the same object when appropriate (i.e. when the elements contained are themselves idempotent). Other cases cannot be optimized.
- A shortcut #f method on Array and Hash would reduce the ugly factor when creating pre-frozen arrays and hashes, and might be nice to have for the same reasons as String#f.
Marking for 2.1 and assigning to matz to get a verdict.
Updated by Anonymous about 11 years ago
My personal opinion on extending optimized #freeze to other literals is that it should be up to each implementation to decide on.
Updated by normalperson (Eric Wong) about 11 years ago
"charliesome (Charlie Somerville)" charliesome@ruby-lang.org wrote:
My personal opinion on extending optimized #freeze to other literals
is that it should be up to each implementation to decide on.
I agree. If an optimization can be accomplished without breaking
compatibility of common/reasonable code, implementations should be free
to optimize as they wish.
Updated by nobu (Nobuyoshi Nakada) almost 11 years ago
- Status changed from Feedback to Rejected
"f" suffix has been removed from string literals.