Feature #20331
closedShould parser warn hash duplication and when clause?
Description
Background¶
Right now, parser warns duplicated hash keys (#1) and when clause (#2).
For example,
{1 => :a, 1 => :b}
# => warning: key 1 is duplicated and overwritten on line 1
case 2
when 1, 1
else
end
# => test.rb:2: warning: duplicated `when' clause with line 2 is ignored
The parser compares different cardinality numbers.
{
1 => :a,
0x1 => :b,
0b1 => :b,
0d1 => :b,
0o1 => :b,
}
# => test.rb:2: warning: key 1 is duplicated and overwritten on line 3
# => test.rb:3: warning: key 1 is duplicated and overwritten on line 4
# => test.rb:4: warning: key 1 is duplicated and overwritten on line 5
# => test.rb:5: warning: key 1 is duplicated and overwritten on line 6
Problem¶
Currently this is implemeted by converting string like "123"
to Ruby Object and compare them.
It's needed to remove Ruby Object from parse.y for Universal Parser.
I created PR https://github.com/ruby/ruby/pull/10079 which implements bignum for parse.y without dependency on Ruby Object, however nobu and mame express concern about the cost and benefit of implmenting bignum for parser.
I want to discuss which is the best approach for this problem.
By the way, it's needed to calculate irreducible fraction for Rational key if we will keep warning messages.
$ ruby -wc -e '{10.2r => :a, 10.2r => :b}'
-e:1: warning: key (51/5) is duplicated and overwritten on line 1
-e:1: warning: unused literal ignored
Syntax OK
Options¶
1. Warnings on parser¶
Pros:
- Users of Universal Parser don't need to implement warnings by themselves. I guess developers of other Ruby implementation may get benefit of reducing their effort.
- Warnings are shown by
ruby -wc
.
Cons:
- We need to maintain bignum implementation for parser.
There are two approaches for this option.
1-1. Implement bignum for parser¶
The PR is this approach, implementing sub set of Ruby bignum for parser.
1-2. Extract existing bignum implementation then use it¶
Make existing bignum implementation to be independent of Ruby Object and use it from both bignum.c and parse.y.
2. Moving warnings logic into compile phase¶
We can use Ruby Object in compile.c. Then moving the logic into compile.c solves this problem.
Pros:
- No need to implement bignum for parser.
Cons:
- Users of Universal Parser need to implement warnings by themselves.
- Warnings are not shown by
ruby -wc
.
Updated by byroot (Jean Boussier) 8 months ago
Warnings are not shown by ruby -wc.
Couldn't ruby -c
be made to compile as well? It's marginally slower but would make sense to me. That would remove the main con from solution 2.
Updated by kddnewton (Kevin Newton) 8 months ago
You don't need an entire bignum library, you only need multiplication and addition. Prism has all of these warnings implemented in https://github.com/ruby/ruby/blob/1e7ee871cbf10375ca149a32d71a29e5e60eed6c/prism/util/pm_integer.c. You're welcome to use that for universal parser if you'd like.
Updated by tenderlovemaking (Aaron Patterson) 8 months ago
I think it's important for the parser to expose these warnings. If someone uses the parser / AST for building a language server, then it's important to show warnings to the user without running the code.
In general, I think it's important to provide as many warnings and errors as possible statically so that language servers can show them to users.
Updated by zverok (Victor Shepelev) 8 months ago
As a language user, I would expect the parser to be able to warn about the keys that are literally same (i.e. same parser nodes), like {10.2r => :a, ... 10.2r => :b}
or (obviously more frequently) {a: 1, .... a: 2}
, but I wouldn't be mad/disappointed if:
- this warning would use, again, the literal value of the node ("10.2r" instead of "(51/5)")
- the warning at this stage would not catch keys that are same by value but not literally
{10.2r => :a, 51/5r => :b}
or{1 => :a, 0b1 => :b}
(given that this is still warned in some later stages)
So, maybe some naive comparing and reporting just by node literal content on the parsing stage would do?
After all, I believe that the most common mistake caught by the warning is writing exactly the same key in a literal hash, not subtleties like somebody regularly writes hashes consisting of 0x1
0b1
0d1` keys (this is legitimate, but seemingly much less frequent case).
Updated by byroot (Jean Boussier) 8 months ago
then it's important to show warnings to the user without running the code.
Indeed. But compiling is not running. That said, I see how this would force language servers to have Ruby around, and some may want to implement a Ruby language server in another language.
So, maybe some naive comparing and reporting just by node literal content on the parsing stage would do?
I think it would be a nice compromise, but you'd need a way for the compiler not to warn again on something the parser already warned about?
Updated by kddnewton (Kevin Newton) 8 months ago
I understand this ticket as "should the parser be responsible for parse integer values" (because if you have the integer values, these warnings are trivial to implement). I think the parser should absolutely be responsible for parsing integer values, for a couple of reasons:
- Users of the parser want to be able to drop their reference to the source code, which they can't do if the parser doesn't parse this. (We found this requirement while talking to Ruby implementers when designing Prism.) This impacts things like lazy method definitions, which JRuby implements.
- The integer values are already in the prism API. Because matz has said many times that Prism's AST is going to be the public Ruby AST, this means that in order to line these up parse.y will need to parse these values.
- We want to maximize the amount of feedback we can give users in language servers and linters.
To respond to some comments so far:
Indeed. But compiling is not running.
That's true, but as you note compiling is runtime/tooling-specific. It's not going to be portable.
So, maybe some naive comparing and reporting just by node literal content on the parsing stage would do? After all, I believe that the most common mistake caught by the warning is writing exactly the same key in a literal hash, not subtleties like somebody regularly writes hashes consisting of 0x1 0b1 0d1` keys (this is legitimate, but seemingly much less frequent case).
I think this is the opposite. I doubt many people are going to write { 10 => foo, 10 => bar }
but it would be much easier to make a mistake by writing { 10 => foo, 0xA => bar }
.
Updated by matz (Yukihiro Matsumoto) 8 months ago
- parser should warn only minimal and apparent cases, no base conversion, no bignum allocation. Handling
_
in the number literals is on the border. - I don't want to add complicated checks since prism may be used for small devices too (e.g., PicoRuby)
Matz.
Updated by kddnewton (Kevin Newton) 8 months ago
Thank you for the response Matz. I just want to clarify what you're asking for the implementation to be, based on what you're saying (let me know if I misunderstand).
parser should warn only minimal and apparent cases, no base conversion, no bignum allocation. Handling _ in the number literals is on the border.
I think you're saying that the parser should warn for { 1 => foo, 1 => bar }
but it should not warn for { 1 => foo, 0b1 => bar }
. But we don't want to remove the warnings, so the compiler will need to warn for those cases? I think in that case the compiler will need to build the hash, check for duplicates, and then only warn if they have different bases. Is that the desired behavior?
I don't want to add complicated checks since prism may be used for small devices too (e.g., PicoRuby)
To be clear, prism already runs on PicoRuby with this warning enabled, and it works just fine. Can you clarify what concerns you have here?
Updated by mame (Yusuke Endoh) 8 months ago
But we don't want to remove the warnings, so the compiler will need to warn for those cases?
At the dev meeting, Matz said that the compiler doesn't need to warn those cases. He said that just remove the warning except literally conflict cases.
To be clear, prism already runs on PicoRuby with this warning enabled, and it works just fine. Can you clarify what concerns you have here?
Matz said he wants to make the footprint as small as possible.
Subtle cases like { 1.0 => a, 1.00 => b }
and { "\x00" => a, "\0" => b }
were also discussed, and some committers wants to keep the warning for the cases, but Matz said no need to warn such cases.
In my personal opinion: the cases that I remember this warning useful are when I copy-pasted a key-value pair of Hash literal or when
clause of case
statement and forgot to edit them. I don't remember any real-world conflicts like when 10; when 0xA
. So I also think that warning a literally conflict makes sense. Still a little nervous about something like when 1.0; when 1.00
, though.
Updated by kddnewton (Kevin Newton) 8 months ago
Subtle cases like { 1.0 => a, 1.00 => b } and { "\x00" => a, "\0" => b } were also discussed, and some committers wants to keep the warning for the cases, but Matz said no need to warn such cases.
Oh I see, I didn't understand that the warnings were being removed entirely for universal parser. I agree that's quite a bit better than splitting them across the parser and compiler at least.
Updated by yui-knk (Kaneko Yuichiro) 7 months ago
- Status changed from Open to Closed
Applied in changeset git|799e854897856e431c03a5122252358e2c57aff2.
[Feature #20331] Simplify parser warnings for hash keys duplication and when clause duplication
This commit simplifies warnings for hash keys duplication and when clause duplication,
based on the discussion of https://bugs.ruby-lang.org/issues/20331.
Warnings are reported only when strings are same to ohters.