Feature #1493


[patch] lex_state as bit field / IS_lex_state() macro

Added by daz (Dave B) about 13 years ago. Updated over 9 years ago.

Target version:



Changelog:

Represent lex_state as a bit field instead of a serial integer enum
so that single or multiple values can be checked together using a
unifying macro, IS_lex_state(). States remain mutually exclusive.


Each use of current macro IS_BEG() ..
#define IS_BEG() (lex_state == EXPR_BEG || lex_state == EXPR_MID || \
                  lex_state == EXPR_VALUE || lex_state == EXPR_CLASS)

.. results in up to 4 tests which can be reduced to 1.

An extreme use case ..

     if (IS_BEG() ||
         lex_state == EXPR_DOT ||
         IS_ARG()) {

.. requires up to 7 tests on lex_state.

To my knowledge, compilers can't optimize this.
As a bit field, there's no need to:

     if (IS_lex_state( EXPR_BEG_ANY | EXPR_ARG_ANY | EXPR_DOT ))

.. reduces to: if (lex_state & (test_bits)) // TRUE/FALSE
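As a rough sketch of the idea above (not the patch itself): the bit values below are hypothetical and only a subset of the states is listed. The helpers is_single_state(), set_lex_state(), old_style_check() and new_style_check() are names made up for this illustration, and the old-style check assumes IS_ARG() tests EXPR_ARG and EXPR_CMDARG, which matches the "up to 7 tests" count.

```c
#include <assert.h>

/* Hypothetical bit assignments; the actual patch may order them differently. */
enum lex_state_e {
    EXPR_BEG    = 1 << 0,
    EXPR_END    = 1 << 1,
    EXPR_ARG    = 1 << 2,
    EXPR_CMDARG = 1 << 3,
    EXPR_MID    = 1 << 4,
    EXPR_FNAME  = 1 << 5,
    EXPR_DOT    = 1 << 6,
    EXPR_CLASS  = 1 << 7,
    EXPR_VALUE  = 1 << 8,

    /* Composite masks, used only for testing, never stored in lex_state. */
    EXPR_BEG_ANY = EXPR_BEG | EXPR_MID | EXPR_VALUE | EXPR_CLASS,
    EXPR_ARG_ANY = EXPR_ARG | EXPR_CMDARG
};

static enum lex_state_e lex_state = EXPR_BEG;

/* One AND answers "is lex_state any of these states?" */
#define IS_lex_state(mask) ((lex_state & (mask)) != 0)

/* States remain mutually exclusive: exactly one bit may be set. */
static int is_single_state(unsigned s)
{
    return s != 0 && (s & (s - 1)) == 0;
}

static void set_lex_state(enum lex_state_e s)
{
    assert(is_single_state((unsigned)s));
    lex_state = s;
}

/* The "extreme use case" written as up to 7 equality tests. */
static int old_style_check(void)
{
    return lex_state == EXPR_BEG || lex_state == EXPR_MID ||
           lex_state == EXPR_VALUE || lex_state == EXPR_CLASS ||
           lex_state == EXPR_DOT ||
           lex_state == EXPR_ARG || lex_state == EXPR_CMDARG;
}

/* The same question as a single masked test. */
static int new_style_check(void)
{
    return IS_lex_state(EXPR_BEG_ANY | EXPR_ARG_ANY | EXPR_DOT);
}
```

Both checks should agree for every single state; only the number of comparisons differs.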

All changes in this patch were scripted to eliminate the possibility
of typos. Where some replaced sections looked similar or the same,
they were verified with MD5sum before applying!
Multiple state tests were merged using simple, strict logic and any
"surprise" code would have been left unchanged.

Therefore, it should not be necessary to check every line of the patch;
I hope that a check of a random sample will give confidence in the rest.

There were several repeated switches which could have been replaced with:

lex_state = (IS_lex_state( EXPR_FNAME | EXPR_DOT )) ? EXPR_ARG : EXPR_BEG;

but I assumed you would prefer if-else for flexibility and legibility:

if (IS_lex_state( EXPR_FNAME | EXPR_DOT )) {
    lex_state = EXPR_ARG;
}
else {
    lex_state = EXPR_BEG;
}

Say what you don't like about anything - it might be easy to change. :)

Wouldn't it be nice to use some spare bits to combine other state?

     if ((lex_state == EXPR_BEG && !cmd_state) || ...
     if (IS_ARG() && space_seen && ...
I haven't explored those yet.
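Purely as a thought experiment on the spare-bits idea, one possible shape is sketched below. Everything here is an assumption: the flag names LEX_CMD_STATE and LEX_SPACE_SEEN, the LEX_STATE_MASK width, the bit positions, and the two helper functions are invented for illustration and are not part of the patch.

```c
/* Hypothetical layout: low 16 bits hold the one-hot lexer state,
 * higher bits hold extra flags that parse.y keeps in separate variables. */
enum {
    EXPR_BEG       = 1 << 0,
    EXPR_ARG       = 1 << 1,
    EXPR_CMDARG    = 1 << 2,
    LEX_STATE_MASK = 0xFFFF,
    LEX_CMD_STATE  = 1 << 16,   /* stands in for cmd_state  */
    LEX_SPACE_SEEN = 1 << 17    /* stands in for space_seen */
};

/* (lex_state == EXPR_BEG && !cmd_state) as one masked comparison:
 * keep the state bits plus the cmd flag, and require exactly EXPR_BEG. */
static int beg_without_cmd_state(unsigned packed)
{
    return (packed & (LEX_STATE_MASK | LEX_CMD_STATE)) == EXPR_BEG;
}

/* (IS_ARG() && space_seen): any ARG state, and the space flag set. */
static int arg_with_space_seen(unsigned packed)
{
    return (packed & (EXPR_ARG | EXPR_CMDARG)) != 0 &&
           (packed & LEX_SPACE_SEEN) != 0;
}
```

Whether packing like this is worthwhile would depend on how often the flags and the state are tested together.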

My motivation was not to reduce processor heat. ;)
In the future, if we need to break the dependence on Yacc/Bison, these
state transitions might need to become token information and the change
makes them far easier to store.
( e.g. Any of 'THESE states' => 'THIS state' )
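To illustrate what "Any of 'THESE states' => 'THIS state'" could look like as stored data, here is a minimal sketch. The struct lex_transition, the after_op table, the next_state() helper and the bit values are all hypothetical; only the FNAME/DOT => ARG, otherwise BEG rule comes from the switch described above.

```c
#include <stddef.h>

/* Hypothetical bit values for a few states. */
enum lex_state_e {
    EXPR_BEG   = 1 << 0,
    EXPR_END   = 1 << 1,
    EXPR_ARG   = 1 << 2,
    EXPR_FNAME = 1 << 3,
    EXPR_DOT   = 1 << 4
};

/* One rule: if the current state is any of from_mask, move to `to`. */
struct lex_transition {
    unsigned         from_mask;
    enum lex_state_e to;
};

/* The repeated switch, written as data. Rules are tried in order;
 * the last rule matches every state and acts as the "else" branch. */
static const struct lex_transition after_op[] = {
    { EXPR_FNAME | EXPR_DOT, EXPR_ARG },
    { ~0u,                   EXPR_BEG }
};

/* Return the target of the first matching rule, or cur if none match. */
static enum lex_state_e next_state(enum lex_state_e cur,
                                   const struct lex_transition *rules,
                                   size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if ((unsigned)cur & rules[i].from_mask) {
            return rules[i].to;
        }
    }
    return cur;
}
```

With a serial enum, the "from" side of each rule would need a list of values; with a bit field it fits in one integer.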



IS_LEX.patch (13.2 KB) daz (Dave B), 05/20/2009 02:18 AM
Actions #1

Updated by marcandre (Marc-Andre Lafortune) over 12 years ago

  • Category set to core
  • Assignee set to matz (Yukihiro Matsumoto)
  • Priority changed from Normal to 3



Actions #2

Updated by znz (Kazuhiro NISHIYAMA) about 12 years ago

  • Category set to core
  • Status changed from Open to Assigned
  • Target version set to 2.0.0



Actions #3

Updated by mame (Yusuke Endoh) over 10 years ago

  • Assignee changed from matz (Yukihiro Matsumoto) to nobu (Nobuyoshi Nakada)

Nobu, aren't you interested in this? Could you please review and
import the patch if it looks good to you?

This seems to be just a patch for refactoring. Though, the purpose
is not so clear to me. (Who is planning to break the dependence on
bison?) But, the patch seems benign (I read only the ticket, not
the patch itself).

Yusuke Endoh

Actions #4

Updated by nobu (Nobuyoshi Nakada) over 9 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r37338.
Dave, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

parse.y: bit field lex_state

  • parse.y (enum lex_state_e): [EXPERIMENTAL] lex_state as bit field /
    IS_lex_state() macro. based on the patch by Dave B in
    [ruby-core:23503]. [Feature #1493]
