Project

General

Profile

Bug #4544

Ripperで「:"a \n b \n c"」を正常にパースできない

Added by exkazuu (Kazunori SAKAMOTO) over 7 years ago. Updated about 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-dev:<unknown>]

Description

=begin
Ripper::SexpBuilder.new(%Q!:"a \n b \n c"!).parse にて、on_tstring_content メソッドが二回連続で呼び出されます。

検証済みバージョン

  • ruby 1.9.3dev (2011-03-31 trunk 31223) [i686-linux]
  • ruby 1.9.2p180 (2011-02-18 revision 30909) [i686-linux]
  • ruby 1.9.1p431 (2011-02-18 revision 30908) [i686-linux]

検証用スクリプト
require 'ripper'

class Parser < Ripper::SexpBuilder
ms = Ripper::SexpBuilder.new('').methods
defs = ms.map { |s| s.to_s }
.select { |s| s.start_with?('on_') }
.map { |s| %Q{
def #{s}(*args)
print '#{s}: '
p args
end
}}.join
eval(defs)
end

def parse(src)
puts "*" + src.inspect + "*"
Parser.new(src).parse
puts
end

parse(%Q!:"a \n b \n c"!)
parse(%Q!:"a \n b c"!)
parse(%Q!:"a b c"!)

実行結果
----":\"a \n b \n c\""----
on_symbeg: [":\""]
on_tstring_content: ["a \n b \n"]
on_tstring_content: [" c"]
on_xstring_new: []
on_xstring_add: [[], [" c"]]
on_tstring_end: ["\""]
on_dyna_symbol: [[[], [" c"]]]
on_stmts_new: []
on_stmts_add: [[], [[[], [" c"]]]]
on_program: [[[], [[[], [" c"]]]]]

----":\"a \n b c\""----
on_symbeg: [":\""]
on_tstring_content: ["a \n"]
on_tstring_content: [" b c"]
on_xstring_new: []
on_xstring_add: [[], [" b c"]]
on_tstring_end: ["\""]
on_dyna_symbol: [[[], [" b c"]]]
on_stmts_new: []
on_stmts_add: [[], [[[], [" b c"]]]]
on_program: [[[], [[[], [" b c"]]]]]

----":\"a b c\""----
on_symbeg: [":\""]
on_tstring_content: ["a b c"]
on_xstring_new: []
on_xstring_add: [[], ["a b c"]]
on_tstring_end: ["\""]
on_dyna_symbol: [[[], ["a b c"]]]
on_stmts_new: []
on_stmts_add: [[], [[[], ["a b c"]]]]
on_program: [[[], [[[], ["a b c"]]]]]
=end

Associated revisions

Revision ce90fcdb
Added by nobu (Nobuyoshi Nakada) over 7 years ago

  • parse.y (parser_parse_string): flush delayed token. based on a patch by Masaya Tarui in . Bug #4544
  • parse.y (yylex): revert r24557. delayed token at the end of string should be flushed already by the above change.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@32067 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 32067
Added by nobu (Nobuyoshi Nakada) over 7 years ago

  • parse.y (parser_parse_string): flush delayed token. based on a patch by Masaya Tarui in . Bug #4544
  • parse.y (yylex): revert r24557. delayed token at the end of string should be flushed already by the above change.

Revision 32067
Added by nobu (Nobuyoshi Nakada) over 7 years ago

  • parse.y (parser_parse_string): flush delayed token. based on a patch by Masaya Tarui in . Bug #4544
  • parse.y (yylex): revert r24557. delayed token at the end of string should be flushed already by the above change.

Revision 32067
Added by nobu (Nobuyoshi Nakada) over 7 years ago

  • parse.y (parser_parse_string): flush delayed token. based on a patch by Masaya Tarui in . Bug #4544
  • parse.y (yylex): revert r24557. delayed token at the end of string should be flushed already by the above change.

Revision 32067
Added by nobu (Nobuyoshi Nakada) over 7 years ago

  • parse.y (parser_parse_string): flush delayed token. based on a patch by Masaya Tarui in . Bug #4544
  • parse.y (yylex): revert r24557. delayed token at the end of string should be flushed already by the above change.

Revision 32067
Added by nobu (Nobuyoshi Nakada) over 7 years ago

  • parse.y (parser_parse_string): flush delayed token. based on a patch by Masaya Tarui in . Bug #4544
  • parse.y (yylex): revert r24557. delayed token at the end of string should be flushed already by the above change.

Revision 78f0aebe
Added by yugui (Yuki Sonoda) about 7 years ago

merges r32067 from trunk into ruby_1_9_2.

  • parse.y (parser_parse_string): flush delayed token. based on a patch by Masaya Tarui in . Bug #4544
  • parse.y (yylex): revert r24557. delayed token at the end of string should be flushed already by the above change.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_9_2@32384 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision c039cd8e
Added by yugui (Yuki Sonoda) about 7 years ago

merges r32067 from trunk into ruby_1_9_2.

  • parse.y (parser_parse_string): flush delayed token. based on a patch by Masaya Tarui in . Bug #4544
  • parse.y (yylex): revert r24557. delayed token at the end of string should be flushed already by the above change.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_9_2@32385 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

History

#1 [ruby-dev:43762] Updated by tarui (Masaya Tarui) over 7 years ago

ripperのコードを読んでみましたが通常のparse.cとの何が違うか理解するのが難しいですね。

このチケットの原因になってる 、yylex でdelayed_tokenをdispatchするコード(parse.y:7855)の直後のreturnを
r24557([Bug #1071])で削除していますが、
ripper_dispatch_delayed_token(parser, t);
ripper_dispatch_scan_event(parser, t);
と実行される可能性があるのは違和感を感じます。

そもそもこのif文に入るのはどういう場合を想定していたんでしょう?

patchをhere document部分のripperコードを参考に書いてみましたが、
想定がわからないのでyylexの部分は触ってません。
自信はないのでレビューしてもらえませんでしょうか?(中田さん?)

diff --git a/parse.y b/parse.y
index 06f96ce..31bdc6f 100644
--- a/parse.y
+++ b/parse.y
@@ -5984,6 +5984,21 @@ parser_parse_string(struct parser_params *parser, NODE *quote)

 tokfix();
 set_yylval_str(STR_NEW3(tok(), toklen(), enc, func));

+
+#ifdef RIPPER

  • if (!NIL_P(parser->delayed)){
  • if(lex_p - parser->tokp > 0 ){
  • rb_str_append(parser->delayed,
  • STR_NEW3(parser->tokp,
  • lex_p - parser->tokp,
  • enc,
  • func));
  • }
  • ripper_dispatch_delayed_token(parser, tSTRING_CONTENT);
  • parser->tokp = lex_p;
  • } +#endif + return tSTRING_CONTENT; }

diff --git a/test/ripper/test_scanner_events.rb b/test/ripper/test_scanner_events.rb
index 25e4b13..3c62f52 100644
--- a/test/ripper/test_scanner_events.rb
+++ b/test/ripper/test_scanner_events.rb
@@ -67,8 +67,7 @@ class TestRipper::ScannerEvents < Test::Unit::TestCase
[[3, 0], :on_heredoc_end, "EOS"]],
Ripper.lex("<<EOS\nheredoc\nEOS")
assert_equal [[[1, 0], :on_regexp_beg, "/"],

  • [[1, 1], :on_tstring_content, "foo\n"],
  • [[2, 0], :on_tstring_content, "bar"],
  • [[1, 1], :on_tstring_content, "foo\nbar"], [[2, 3], :on_regexp_end, "/"]], Ripper.lex("/foo\nbar/") end

#2 [ruby-dev:43771] Updated by nobu (Nobuyoshi Nakada) over 7 years ago

  • ruby -v changed from ruby 1.9.2p180 (2011-02-18 revision 30909) [i686-linux] to -

なかだです。

At Sun, 12 Jun 2011 09:18:46 +0900,
Masaya Tarui wrote in :

このチケットの原因になってる 、yylex でdelayed_tokenをdispatchするコード(parse.y:7855)の直後のreturnを
r24557([Bug #1071])で削除していますが、
ripper_dispatch_delayed_token(parser, t);
ripper_dispatch_scan_event(parser, t);
と実行される可能性があるのは違和感を感じます。

そもそもこのif文に入るのはどういう場合を想定していたんでしょう?

ちょっと思い出せませんが、樽家さんの修正で不要になるようです。

patchをhere document部分のripperコードを参考に書いてみましたが、
想定がわからないのでyylexの部分は触ってません。
自信はないのでレビューしてもらえませんでしょうか?(中田さん?)

  • rb_str_append(parser->delayed,
  • STR_NEW3(parser->tokp,
  • lex_p - parser->tokp,
  • enc,
  • func));

新しくStringを作ってappendしなくても、rb_enc_str_buf_cat()という
のがあります。

diff --git i/parse.y w/parse.y
index 06f96ce..60e5a3b 100644
--- i/parse.y
+++ w/parse.y
@@ -5984,6 +5984,18 @@ parser_parse_string(struct parser_params *parser, NODE *quote)

  tokfix();
  set_yylval_str(STR_NEW3(tok(), toklen(), enc, func));

+
+#ifdef RIPPER

  • if (!NIL_P(parser->delayed)){
  • ptrdiff_t len = lex_p - parser->tokp;
  • if (len > 0) {
  • rb_enc_str_buf_cat(parser->delayed, parser->tokp, len, enc);
  • }
  • ripper_dispatch_delayed_token(parser, tSTRING_CONTENT);
  • parser->tokp = lex_p;
  • } +#endif + return tSTRING_CONTENT; }

@@ -7853,6 +7865,7 @@ yylex(void *p)
#ifdef RIPPER
if (!NIL_P(parser->delayed)) {
ripper_dispatch_delayed_token(parser, t);

  • return t; } if (t != 0) ripper_dispatch_scan_event(parser, t); diff --git i/test/ripper/test_scanner_events.rb w/test/ripper/test_scanner_events.rb index 25e4b13..d89e50e 100644 --- i/test/ripper/test_scanner_events.rb +++ w/test/ripper/test_scanner_events.rb @@ -67,10 +67,17 @@ class TestRipper::ScannerEvents < Test::Unit::TestCase [[3, 0], :on_heredoc_end, "EOS"]], Ripper.lex("<<EOS\nheredoc\nEOS") assert_equal [[[1, 0], :on_regexp_beg, "/"],
  • [[1, 1], :on_tstring_content, "foo\n"],
  • [[2, 0], :on_tstring_content, "bar"],
  • [[1, 1], :on_tstring_content, "foo\nbar"], [[2, 3], :on_regexp_end, "/"]], Ripper.lex("/foo\nbar/")
  • assert_equal [[[1, 0], :on_regexp_beg, "/"],
  • [[1, 1], :on_tstring_content, "foo\n\u3020"],
  • [[2, 3], :on_regexp_end, "/"]],
  • Ripper.lex("/foo\n\u3020/")
  • assert_equal [[[1, 0], :on_tstring_beg, "'"],
  • [[1, 1], :on_tstring_content, "foo\n\xe3\x80\xa0"],
  • [[2, 3], :on_tstring_end, "'"]],
  •             Ripper.lex("'foo\n\xe3\x80\xa0'")
    

    end

    def test_location
    @@ -534,6 +541,10 @@ class TestRipper::ScannerEvents < Test::Unit::TestCase
    scan('tstring_content', '"abc#{1}def"')
    assert_equal ['sym'],
    scan('tstring_content', ':"sym"')

  • assert_equal ['a b c'],

  •             scan('tstring_content', ':"a b c"')
    
  • assert_equal ["a\nb\nc"],

  •             scan('tstring_content', ":'a\nb\nc'")
    

    end

    def test_tstring_end

--
--- 僕の前にBugはない。
--- 僕の後ろにBugはできる。
中田 伸悦

#3 [ruby-dev:43771] Updated by nobu (Nobuyoshi Nakada) over 7 years ago

なかだです。

At Sun, 12 Jun 2011 09:18:46 +0900,
Masaya Tarui wrote in :

このチケットの原因になってる 、yylex でdelayed_tokenをdispatchするコード(parse.y:7855)の直後のreturnを
r24557([Bug #1071])で削除していますが、
ripper_dispatch_delayed_token(parser, t);
ripper_dispatch_scan_event(parser, t);
と実行される可能性があるのは違和感を感じます。

そもそもこのif文に入るのはどういう場合を想定していたんでしょう?

ちょっと思い出せませんが、樽家さんの修正で不要になるようです。

patchをhere document部分のripperコードを参考に書いてみましたが、
想定がわからないのでyylexの部分は触ってません。
自信はないのでレビューしてもらえませんでしょうか?(中田さん?)

  • rb_str_append(parser->delayed,
  • STR_NEW3(parser->tokp,
  • lex_p - parser->tokp,
  • enc,
  • func));

新しくStringを作ってappendしなくても、rb_enc_str_buf_cat()という
のがあります。

diff --git i/parse.y w/parse.y
index 06f96ce..60e5a3b 100644
--- i/parse.y
+++ w/parse.y
@@ -5984,6 +5984,18 @@ parser_parse_string(struct parser_params *parser, NODE *quote)

  tokfix();
  set_yylval_str(STR_NEW3(tok(), toklen(), enc, func));

+
+#ifdef RIPPER

  • if (!NIL_P(parser->delayed)){
  • ptrdiff_t len = lex_p - parser->tokp;
  • if (len > 0) {
  • rb_enc_str_buf_cat(parser->delayed, parser->tokp, len, enc);
  • }
  • ripper_dispatch_delayed_token(parser, tSTRING_CONTENT);
  • parser->tokp = lex_p;
  • } +#endif + return tSTRING_CONTENT; }

@@ -7853,6 +7865,7 @@ yylex(void *p)
#ifdef RIPPER
if (!NIL_P(parser->delayed)) {
ripper_dispatch_delayed_token(parser, t);

  • return t; } if (t != 0) ripper_dispatch_scan_event(parser, t); diff --git i/test/ripper/test_scanner_events.rb w/test/ripper/test_scanner_events.rb index 25e4b13..d89e50e 100644 --- i/test/ripper/test_scanner_events.rb +++ w/test/ripper/test_scanner_events.rb @@ -67,10 +67,17 @@ class TestRipper::ScannerEvents < Test::Unit::TestCase [[3, 0], :on_heredoc_end, "EOS"]], Ripper.lex("<<EOS\nheredoc\nEOS") assert_equal [[[1, 0], :on_regexp_beg, "/"],
  • [[1, 1], :on_tstring_content, "foo\n"],
  • [[2, 0], :on_tstring_content, "bar"],
  • [[1, 1], :on_tstring_content, "foo\nbar"], [[2, 3], :on_regexp_end, "/"]], Ripper.lex("/foo\nbar/")
  • assert_equal [[[1, 0], :on_regexp_beg, "/"],
  • [[1, 1], :on_tstring_content, "foo\n\u3020"],
  • [[2, 3], :on_regexp_end, "/"]],
  • Ripper.lex("/foo\n\u3020/")
  • assert_equal [[[1, 0], :on_tstring_beg, "'"],
  • [[1, 1], :on_tstring_content, "foo\n\xe3\x80\xa0"],
  • [[2, 3], :on_tstring_end, "'"]],
  •             Ripper.lex("'foo\n\xe3\x80\xa0'")
    

    end

    def test_location
    @@ -534,6 +541,10 @@ class TestRipper::ScannerEvents < Test::Unit::TestCase
    scan('tstring_content', '"abc#{1}def"')
    assert_equal ['sym'],
    scan('tstring_content', ':"sym"')

  • assert_equal ['a b c'],

  •             scan('tstring_content', ':"a b c"')
    
  • assert_equal ["a\nb\nc"],

  •             scan('tstring_content', ":'a\nb\nc'")
    

    end

    def test_tstring_end

--
--- 僕の前にBugはない。
--- 僕の後ろにBugはできる。
中田 伸悦

#4 Updated by nobu (Nobuyoshi Nakada) over 7 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r32067.
Kazunori, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • parse.y (parser_parse_string): flush delayed token. based on a patch by Masaya Tarui in . Bug #4544
  • parse.y (yylex): revert r24557. delayed token at the end of string should be flushed already by the above change.

#5 [ruby-dev:44004] Updated by exkazuu (Kazunori SAKAMOTO) about 7 years ago

Thank you for solving this issue !

Also available in: Atom PDF