Project

General

Profile

Actions

Bug #1240

closed

parser bug in 1.8.7 and 1.9.1p0

Added by thomer (Thomer Gil) almost 16 years ago. Updated over 13 years ago.

Status:
Closed
Target version:
-
ruby -v:
ruby 1.9.1p0 (2009-01-30 revision 21907) [i686-linux]
[ruby-core:22637]

Description

=begin

ruby parser accepts first line, not the second.

x y { "#{}".z { } }
x y { "#{}".z do end }
=end

Actions #1

Updated by zenspider (Ryan Davis) almost 16 years ago

=begin

On Mar 2, 2009, at 22:30 , Thomer Gil wrote:

Bug #1240: parser bug in 1.8.7 and 1.9.1p0
http://redmine.ruby-lang.org/issues/show/1240

Author: Thomer Gil
Status: Open, Priority: Normal
Category: core
ruby -v: ruby 1.9.1p0 (2009-01-30 revision 21907) [i686-linux]

ruby parser accepts first line, not the second.

x y { "#{}".z { } }
x y { "#{}".z do end }

1.8.6 as well... this also appears to be the minimimal reproduction.
Remove anything and it parses. How confounding.

If/When this gets fixed, please respond with the patch or commit # so
I can track it. Thanks.

=end

Actions #2

Updated by shyouhei (Shyouhei Urabe) almost 16 years ago

  • Assignee set to matz (Yukihiro Matsumoto)

=begin

=end

Actions #3

Updated by daz (Dave B) almost 16 years ago

=begin
As there's no context for this "bug", I've tried to provide some.

The parser takes different routes depending on what has been seen,
so context is important.

def x(args=nil)
p [:x_args, args]
block_given? and (puts '--> x blk'; yield)
end

def y(args=nil)
p [:y_args, args]
block_given? and (puts '--> y blk'; yield)
end

class String
def z(args=nil)
p [:z_args, args]
block_given? and (puts '--> z blk'; yield)
end
end

x y { "#{}".z { "GIGO\n".display } }

x y { "#{}".z do "GIGO\n".display end } # parser fail

Contrasting with ...

p "#{}".z { "GIGO\n".display }
puts '---'
p "#{}".z do "GIGO\n".display end

... what was "do end" in the original report

expected to bind to?

daz

=end

Actions #4

Updated by zenspider (Ryan Davis) almost 16 years ago

=begin

On Mar 3, 2009, at 13:42 , Dave B wrote:

As there's no context for this "bug", I've tried to provide some.

This isn't a "bug", it is a bug, as I think I've shown below.

Contrasting with ...

p "#{}".z { "GIGO\n".display }
puts '---'
p "#{}".z do "GIGO\n".display end

... what was "do end" in the original report

expected to bind to?

I find this view more clear:

% echo 'p "#{}".z do "GIGO\n".display end' | parse_tree_show
s(:iter,
s(:call,
nil,
:p,
s(:arglist, s(:call, s(:dstr, "", s(:evstr)), :z, s(:arglist)))),
nil,
s(:call, s(:str, "GIGO\n"), :display, s(:arglist)))

so the do/end (aka iter) is bound to the call to #p.

From the original report, minus the empty interpolation:

% echo 'x y { "".z do end }' | parse_tree_show
s(:call,
nil,
:x,
s(:arglist,
s(:iter,
s(:call, nil, :y, s(:arglist)),
nil,
s(:iter, s(:call, s(:str, ""), :z, s(:arglist)), nil)))

the interpolation would replace the s(:str, "") with:

% echo '"#{}"' | parse_tree_show
s(:dstr, "", s(:evstr))

So really there isn't any reason why this shouldn't be parseable. If
this isn't solved by the weekend I'll take a whack at it with
ruby_parser and translate/backport from there. I hate these
codepaths tho and I haven't gotten around to rewriting the string
stack yet, so it isn't fun.

=end

Actions #5

Updated by mame (Yusuke Endoh) almost 15 years ago

  • Assignee changed from matz (Yukihiro Matsumoto) to mame (Yusuke Endoh)

=begin
Hi,

2009/3/3 Thomer Gil :

ruby parser accepts first line, not the second.

x y { "#{}".z { } }
x y { "#{}".z do end }

Interesting. I found similar bugs:

while ("#{}".foo { } ); end # ok
while ("#{}".foo do end); end # parse error

while ->{ 1.times { } }.call; done # ok
while ->{ 1.times do end }.call; done # parse error

The cause is mishandling of cond_stack and cmdarg_stack.
When "#{}" is parsed, the parser pushes a value into these stacks once
(at #{ (tSTRING_DBEG)), and pops twice (at '}' and action of the rule
`string_content'.)

In addition, the parser forgets to push at ->{ (tLAMBEG).

I think it is better for not parser but lexer to push into the stacks
at tSTRING_DBEG. See my patch.

BTW, I also noticed that block call with do' keyword does not work in until' condition:

until begin 1.times { } end do end # ok
until begin 1.times do end end do end # parse error
~~
until if true then 1.times { } end do end # ok
until if true then 1.times do end end do end # parse error
~~
until until true do 1.times { } end do end # ok
until until true do 1.times do end end do end # parse error
~~
until class Foo; 1.times { } ; end do end # ok
until class Foo; 1.times do end; end do end # parse error
~~
until case; when true; 1.times { } ; end do end # ok
until case; when true; 1.times do end; end do end # parse error
~~

This is because the underlined do's are not considered as block call but beginning of until' body.

Although this is confusing a little and can be actually fixed, the fix
needs many COND_PUSH(0)/COND_POP(), which may decrease performance and
code maintenability. In addition, writing such a long and complex
condition directly is absolutely bad (even insane) style.
So, we should accept the above behaviors as spec, I think.

Here is a patch to fix the issue Thomer reported.
I'll commit it unless there is objection.

diff --git a/parse.y b/parse.y
index 340a825..f234ae3 100644
--- a/parse.y
+++ b/parse.y
@@ -4033,14 +4033,10 @@ string_content : tSTRING_CONTENT
$$ = lex_strterm;
lex_strterm = 0;
lex_state = EXPR_BEG;

  •  	COND_PUSH(0);
    
  •  	CMDARG_PUSH(0);
         }
       compstmt '}'
         {
     	lex_strterm = $<node>2;
    
  •  	COND_LEXPOP();
    
  •  	CMDARG_LEXPOP();
         /*%%%*/
     	if ($3) $3->flags &= ~NODE_FL_NEWLINE;
     	$$ = new_evstr($3);
    

@@ -5873,6 +5869,8 @@ parser_parse_string(struct parser_params *parser, NODE *quote)
pushback(c);
return tSTRING_DVAR;
case '{':

  •  COND_PUSH(0);
    
  •  CMDARG_PUSH(0);
     return tSTRING_DBEG;
    
    }
    tokadd('#');
    @@ -6070,6 +6068,8 @@ parser_here_document(struct parser_params *parser, NODE *here)
    pushback(c);
    return tSTRING_DVAR;
    case '{':
  •  COND_PUSH(0);
    
  •  CMDARG_PUSH(0);
     return tSTRING_DBEG;
     }
     tokadd('#');
    

@@ -7314,6 +7314,8 @@ parser_yylex(struct parser_params *parser)
lex_state = EXPR_BEG;
lpar_beg = 0;
--paren_nest;

  •  COND_PUSH(0);
    
  •  CMDARG_PUSH(0);
     return tLAMBEG;
    
    }
    if (IS_ARG() || lex_state == EXPR_END)

--
Yusuke ENDOH
=end

Actions #6

Updated by mame (Yusuke Endoh) almost 15 years ago

=begin
Hi,

2010/4/8 Caleb Clausen :

Yusuke Endoh wrote:

BTW, I also noticed that block call with do' keyword does not work in until' condition:

?until begin 1.times { } ? ?end do end # ok
?until begin 1.times do end end do end # parse error
? ? ? ? ? ? ? ? ? ? ?~~
[snip... more like that]

This is because the underlined do's are not considered as block call but beginning of until' body.

Although this is confusing a little and can be actually fixed, the fix
needs many COND_PUSH(0)/COND_POP(), which may decrease performance and
code maintenability. ?In addition, writing such a long and complex
condition directly is absolutely bad (even insane) style.
So, we should accept the above behaviors as spec, I think.

Wait.... please don't harden any bugs into the specification of the
language. If you want to say that it's too obscure/too hard to fix,
that's fine. WONTFIX is ok, under the circumstances. If you want to
say it's implementation dependent behavior, that's at least
acceptable. But let's not pretend that what's clearly a bug is
intended behavior. (Best of all would be to fix the bug, but I'm not
really asking for that.)

You are right. I guess it be considered as WONTFIX.

My own lexer & parser were able to parse all the examples in this bug
report correctly and I did not need to change anything.

Great.

BTW, how about the following? :-)

ok

while (((((((((((((((((((((((((((((((false)))))))))))))))))))))))))))))))
do; end

parse error

while ((((((((((((((((((((((((((((((((false))))))))))))))))))))))))))))))))
do; end

If your CPU is 64 bit, please run after 32 '(' and ')' are added :-p

IMHO, do_cond (omittable `do' keyword in the part of while/until/for
constructs) brings confusion but little benefit to the Ruby syntax.
How about removal of do_cond in 2.0?

--
Yusuke ENDOH

=end

Actions #7

Updated by coatl (caleb clausen) almost 15 years ago

=begin
On 4/8/10, Yusuke ENDOH wrote:

2010/4/8 Caleb Clausen :

Wait.... please don't harden any bugs into the specification of the
language. If you want to say that it's too obscure/too hard to fix,
that's fine. WONTFIX is ok, under the circumstances. If you want to
say it's implementation dependent behavior, that's at least
acceptable. But let's not pretend that what's clearly a bug is
intended behavior. (Best of all would be to fix the bug, but I'm not
really asking for that.)

You are right. I guess it be considered as WONTFIX.

Thank you.

BTW, how about the following? :-)

ok

while (((((((((((((((((((((((((((((((false)))))))))))))))))))))))))))))))
do; end

parse error

while ((((((((((((((((((((((((((((((((false))))))))))))))))))))))))))))))))
do; end

Ooooo! Both of those don't parse right in redparse. But not for the reason you seem to expect; it's objecting to the do on its own line. That is to say, this also doesn't parse:

while false
do; end

Thanks for the bug. I'll fix it right away.

Taking out the newline before do makes them parse right again. (For MRI too, tho.)

Incidentally, all the ruby versions I have (1.8.6, 1.9.0, 1.9.1) also object to do after a newline. I guess this was added (relatively) recently?

If your CPU is 64 bit, please run after 32 '(' and ')' are added :-p

This was on a 32 bit cpu. I wouldn't anticipate any problems with any number of nested parentheses. (Until, that is, you run out of memory.)

IMHO, do_cond (omittable `do' keyword in the part of while/until/for
constructs) brings confusion but little benefit to the Ruby syntax.
How about removal of do_cond in 2.0?

Hear, hear! I agree wholeheartedly. (Unless there's someone out there who actually likes this? Speak now or forever hold your peace.)
=end

Actions #8

Updated by shyouhei (Shyouhei Urabe) over 14 years ago

  • Status changed from Open to Assigned

=begin

=end

Updated by mame (Yusuke Endoh) over 13 years ago

  • Status changed from Assigned to Closed

The original issue that OP reported was already fixed at r27359.
The commit caused a regression [ruby-core:29579], and nobu fixed
it at r27387.

So, closed. Thanks.

--
Yusuke Endoh

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0