Feature #3845

closed

"in" infix operator

Added by mame (Yusuke Endoh) over 14 years ago. Updated over 12 years ago.

Status:
Rejected
Target version:
-
[ruby-core:32454]

Description

=begin
Hi,

I'd propose "in" infix operator.

(expr in args) yields true when expr is included in args.
Otherwise it yields false.

p "found" if 1 in 1, 2, 3 #=> found
p "not found" if 0 in 1, 2, 3 #=> not found

"in" operator is clearer to the reader than Array#include?:

p "found" if [1, 2, 3].include?(1)
p "not found" if [1, 2, 3].include?(0)

This proposal is similar to Object#in? proposed in [ruby-core:23543].
But there are two differences:

  • "in" operator does not pollute name space of Object class

  • each candidate of "in" is evaluated lazily; for example,

    1 in 1, 2, foo()

    does not call the method "foo" because 1 is found before that.

Note that this proposal ensures the syntax compatibility, since
"in" is already a keyword for "for" statement. But "for" statement
is rarely used. This proposal utilizes the rarely-used keyword.

I wrote an experimental patch. It implements the operator as a
syntactic sugar to "case" statement:

expr in args
=> (case expr; when args; true; else false; end)
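
As a rough illustration of this desugaring (a sketch in present-day Ruby, not the patch itself), the same behavior can be written out by hand with case/when. The names `one_in_candidates?`, `expensive_candidate`, and `CALLS` are hypothetical and exist only to show that later candidates are not evaluated once a match is found:

```ruby
# Records whether the last candidate was ever evaluated (hypothetical helper).
CALLS = []

def expensive_candidate
  CALLS << :called
  3
end

# Hand-written equivalent of the proposed `1 in 1, 2, expensive_candidate`.
def one_in_candidates?
  case 1
  when 1, 2, expensive_candidate then true
  else false
  end
end

p one_in_candidates?  # true
p CALLS.empty?        # true, because 1 matched before expensive_candidate was reached
```

Since `when` tests its candidates from left to right with `===`, the lazy evaluation described above comes for free from the existing case semantics.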

The patch causes no parser conflict.

One more thing. The following expression is rejected:

foo(x in 1, 2, 3)

This is because it is ambiguous; this expression can be interpreted
in three ways:

foo((x in 1), 2, 3)
foo((x in 1, 2), 3)
foo((x in 1, 2, 3))

You need to write parentheses explicitly.

What do you think?

diff --git a/parse.y b/parse.y
index e085088..64318bd 100644
--- a/parse.y
+++ b/parse.y
@@ -745,6 +745,7 @@ static void token_info_pop(struct parser_params*, const char *token);
 %nonassoc modifier_if modifier_unless modifier_while modifier_until
 %left keyword_or keyword_and
 %right keyword_not
+%nonassoc keyword_in
 %nonassoc keyword_defined
 %right '=' tOP_ASGN
 %left modifier_rescue
@@ -1258,6 +1259,14 @@ expr : command_call
             $$ = dispatch2(unary, ripper_id2sym('!'), $2);
             %*/
             }
+        | expr keyword_in args
+            {
+            /*%%%*/
+            $$ = NEW_CASE($1, NEW_WHEN($3, NEW_TRUE(), NEW_FALSE()));
+            /*%
+            $$ = dispatch2(in, $1, $3);
+            %*/
+            }
         | arg
         ;

diff --git a/test/ripper/test_parser_events.rb b/test/ripper/test_parser_events.rb
index 5d76941..6005457 100644
--- a/test/ripper/test_parser_events.rb
+++ b/test/ripper/test_parser_events.rb
@@ -1107,4 +1107,8 @@ class TestRipper::ParserEvents < Test::Unit::TestCase
     parse('/', :compile_error) {|msg| compile_error = msg}
     assert_equal("unterminated regexp meets end of file", compile_error)
   end
+
+  def test_in
+    assert_equal("[in(1,[1,2,3])]", parse('1 in 1, 2, 3'))
+  end
 end if ripper_test

--
Yusuke Endoh
=end


Files

in.expression.diff (698 Bytes), added by adgar (Michael Edgar), 07/10/2011 09:22 AM

Related issues: 1 (0 open, 1 closed)

Has duplicate: Ruby master - Feature #4402: Include an "in" operator (Closed, 02/16/2011)

Actions #1

Updated by matz (Yukihiro Matsumoto) over 14 years ago

=begin
Hi,

In message "Re: [Ruby 1.9-Feature#3845][Open] "in" infix operator"
on Fri, 17 Sep 2010 19:30:27 +0900, Yusuke Endoh writes:

|Hi,
|
|I'd propose "in" infix operator.
|
|(expr in args) yields true when expr is included in args.
|Otherwise it yields false.

I am neutral for this proposal. But the patch uses "case" internally
thus comparison is done by "===". Is this expected behavior?

						matz.

=end

Actions #2

Updated by mame (Yusuke Endoh) over 14 years ago

=begin
Matz,

Thank you for your comment!

|I'd propose "in" infix operator.
|
|(expr in args) yields true when expr is included in args.
|Otherwise it yields false.

I am neutral for this proposal. But the patch uses "case" internally
thus comparison is done by "===". Is this expected behavior?

The patch is just proof of concept.
I was not so particular about the implementation. But, I thought of
the following code:

if n in 5..10
# ...
end

This code works only when "===" is used (of course, unless range is
specially handled). So I prefer "===".

However, I'm happy to rewrite a patch if you say "==" should be used.
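
The difference between the two comparison operators for the range case can be checked directly in Ruby. This is a small sketch; the method name `in_range_via_case?` is hypothetical and simply spells out the case-based desugaring for a single range candidate:

```ruby
n = 7
p((5..10) === n)  # true: Range#=== tests whether n lies within the range
p((5..10) == n)   # false: a Range is never equal to an Integer

# Hand-written form of the proposed `n in 5..10`, which relies on ===.
def in_range_via_case?(n)
  case n
  when 5..10 then true
  else false
  end
end

p in_range_via_case?(7)   # true
p in_range_via_case?(11)  # false
```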

--
Yusuke Endoh
=end

Actions #3

Updated by Eregon (Benoit Daloze) over 14 years ago

=begin
On 17 September 2010 12:30, Yusuke Endoh wrote:

Feature #3845: "in" infix operator

What do you think?

It is indeed more elegant than #include?, and reusing "in" is nice.

I am somehow used to the question mark of #in?, but being a keyword,
it would be weird.

 do_sth if element in collection

 do_sth if collection.include? element

Yep, that is definitely nicer.

| But, I thought of the following code:
|
| if n in 5..10
| # ...
| end
|
| This code works only when "===" is used (of course, unless range is
| specially handled). So I prefer "===".

Maybe if there is only one element to the right of "in",
it should be checked whether it has an #include? method, and if so
that method should be called? (or check whether it is Enumerable)

Or maybe always use #include? when possible, and #== otherwise?

I think #=== is not really appropriate, as it rarely checks for
inclusion and would likely lead to unexpected results.
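
One possible reading of this suggestion can be sketched as an ordinary method. The name `in_candidates?` and the exact dispatch rule are assumptions made for illustration; nothing like this was part of the patch:

```ruby
require "set"

# Hypothetical helper: with a single right-hand operand that responds to
# include?, delegate to it; otherwise compare against each candidate with ==.
def in_candidates?(x, *candidates)
  if candidates.size == 1 && candidates.first.respond_to?(:include?)
    candidates.first.include?(x)
  else
    candidates.any? { |c| c == x }
  end
end

p in_candidates?(7, 5..10)             # true, via Range#include?
p in_candidates?(11, Set[11, 22, 33])  # true, via Set#include?
p in_candidates?(1, 1, 2, 3)           # true, via ==
p in_candidates?(0, 1, 2, 3)           # false
p in_candidates?("b", "abc")           # true, via String#include? (substring search)
```

The last call hints at the complexity of this rule: any object that happens to define include?, such as a String, changes what the single-operand form means.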

Regards,
Benoit Daloze

=end

Actions #4

Updated by now (Nikolai Weibull) over 14 years ago

=begin
On Tue, Sep 21, 2010 at 18:53, Benoit Daloze wrote:

   do_sth if element in collection

   do_sth if collection.include? element

Yep, that is definitely nicer.

do_sth if element in elements

do_sth if elements.include? element

=end

Actions #5

Updated by mame (Yusuke Endoh) over 14 years ago

=begin
Hi,

2010/9/22 Benoit Daloze :

On 17 September 2010 12:30, Yusuke Endoh wrote:

Feature #3845: "in" infix operator

What do you think?

It is indeed more elegant than #include?, and reusing "in" is nice.

Thanks :-)

| But, I thought of the following code:
|
| if n in 5..10
| # ...
| end
|
| This code works only when "===" is used (of course, unless range is
| specially handled). So I prefer "===".

Maybe if there is only one element to the right of "in",
it should be checked if having an #include? method and then call it ?
(or check if it is Enumerable)

Hmm. I received a similar opinion (via Japanese Twitter).
I think that it makes the semantics complex (for me), but it is ok
as long as the behavior of usual cases is intuitive.

Or maybe always use #include? when possible, and #== otherwise ?

I think #=== is not really appropriate as it rarely checks for
inclusion, and would likely lead to unexpected results.

#=== has been used to check for inclusion in "case" statement, I think.
So it is not so inappropriate, imo.

case n
when 0... 5 then # ...
when 5...10 then # ...
when 10...20 then # ...
end

case var
when Integer then # ...
when String then # ...
end

--
Yusuke Endoh

=end

Actions #6

Updated by johan556 (Johan Holmberg) about 14 years ago

=begin
On Wed, Sep 22, 2010 at 1:48 AM, Yusuke ENDOH wrote:

Maybe if there is only one element to the right of "in",
 it should be checked if having an #include? method and then call it ?
(or check if it is Enumerable)

Hmm.  I received the similar opinion (via Japanese twitter).
I think that that makes the semantics complex (for me), but it is ok
as long as the behavior of usual cases is intuitive.

I also like the "x in foo" infix syntax. I would expect it to be a
membership test, for example in the following situations:

 an_arr = [11,22,33]
 a_set = Set.new([11,22,33])

 p (10 in an_arr)            # false
 p (11 in an_arr)            # true
 p (10 in a_set)             # false
 p (11 in a_set)             # true

and probably similarly for other "collection type objects" too (I
believe it works something like that in Python).

Regards,
/Johan Holmberg

=end

Actions #7

Updated by shyouhei (Shyouhei Urabe) about 14 years ago

  • Status changed from Open to Assigned


Actions #8

Updated by duerst (Martin Dürst) about 14 years ago

=begin

On 2010/09/23 4:41, Johan Holmberg wrote:

On Wed, Sep 22, 2010 at 1:48 AM, Yusuke ENDOH wrote:

I also like the "x in foo" infix syntax. I would expect it to be a
membership test, for example in the following situations:

 an_arr = [11,22,33]
 a_set = Set.new([11,22,33])

 p (10 in an_arr)            # false
 p (11 in an_arr)            # true
 p (10 in a_set)             # false
 p (11 in a_set)             # true

and probably similarly for other "collection type objects" too (I
believe it works something like that in Python).

Just my two cents, but I don't see why this case is important enough to
warrant deviating from the usual, object-oriented syntax (which may not
always look optimal, but is easy and straightforward).

Regards, Martin.

--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp

=end

Actions #9

Updated by Eregon (Benoit Daloze) about 14 years ago

=begin
On 24 September 2010 09:28, "Martin J. Dürst" wrote:

On 2010/09/23 4:41, Johan Holmberg wrote:

On Wed, Sep 22, 2010 at 1:48 AM, Yusuke ENDOH  wrote:

I also like the "x in foo" infix syntax. I would expect it to be a
membership test, for example in the following situations:

    an_arr = [11,22,33]
    a_set = Set.new([11,22,33])

    p (10 in an_arr)            # false
    p (11 in an_arr)            # true
    p (10 in a_set)             # false
    p (11 in a_set)             # true

and probably similarly for other "collection type objects" too (I
believe it works something like that in Python).

Just my two cents, but I don't see why this case is important enough to
warrant deviating from the usual, object-oriented syntax (which may not
always look optimal, but is easy and straightforward).

Regards,    Martin.

--

It is indeed some kind of syntactic sugar, and you are probably right
to mention it, because maybe it should be implemented with OO
syntax first.
The main problem here is that it adds a method to Object (which "pollutes"
almost all objects, and might be irrelevant/useless for some).

This proposal avoids that, and I believe for this reason (and the
beauty of it) it is somehow better (but unexpected for OO, and making
a new particular case).

Regards,
B.D.

=end

Actions #10

Updated by jcangas (Jorge L. Cangas) about 14 years ago

=begin
I think it is better to separate both roles:

  • for the membership operator, use 'in?' as

item in? somearray

  • for enumeration, use 'in' as

for(a in [1,2,3,4])

=end

Actions #11

Updated by mame (Yusuke Endoh) about 14 years ago

=begin
Hi,

2010/9/24 "Martin J. Dürst" :

Just my two cents, but I don't see why this case is important enough to
warrant deviating from the usual, object-oriented syntax (which may not
always look optimal, but is easy and straightforward).

Thank you for your comment.

Indeed, an idiom [a, b, c].include?(x) can be used as a substitute.
However, it has three problems:

  1. the word order is weird; x should appear first (at least, many
    people feel so)
  2. the idiom is too long (even though it is often used)
  3. it is inefficient; a new array object is created every time

Also, x == a || x == b || x == c can be used. It has another problem:
it becomes very verbose when "x" is long.

 http_request.http_method == :get  ||
 http_request.http_method == :post ||
 http_request.http_method == :put  ||
 http_request.http_method == :delete

vs.

 http_request.http_method in :get, :post, :put, :delete

What is worse is that we often want to write this kind of code.
If we rarely wrote this kind of code, I would think that new syntax
was not needed.

--
Yusuke Endoh

=end

Actions #12

Updated by mame (Yusuke Endoh) about 14 years ago

=begin
Hi,

2010/9/29 Roger Pack :

I like it.  There's a kind of elegance in it

if a in [1,2,3]
   p ' it is in 1,2,3'
end

It works well with my head.  +1

Thanks, but it is a bit different from my original suggestion.
My suggestion does not require brackets:

if a in 1,2,3
p 'it is in 1,2,3'
end

a in [1,2,3] is slightly verbose (for me) and inefficient because
it creates a new array object each time it is evaluated.
If you want to pass an array there, you can use the splat operator:

ary = [1,2,3]
if a in *ary
p 'it is in ary'
end
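
The splat form has a direct counterpart in today's case/when, which already accepts a splatted array in a `when` clause. A minimal sketch, with the hypothetical method name `in_array_via_splat?`:

```ruby
# Hand-written equivalent of the proposed `a in *ary`.
def in_array_via_splat?(a, ary)
  case a
  when *ary then true   # each element of ary is tried in turn with ===
  else false
  end
end

ary = [1, 2, 3]
p in_array_via_splat?(2, ary)  # true
p in_array_via_splat?(5, ary)  # false
```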

--
Yusuke Endoh

=end

Actions #13

Updated by duerst (Martin Dürst) about 14 years ago

=begin
Hello Yusuke,

On 2010/10/05 23:28, Yusuke ENDOH wrote:

Hi,

2010/9/24 "Martin J. Dürst":

Just my two cents, but I don't see why this case is important enough to
warrant deviating from the usual, object-oriented syntax (which may not
always look optimal, but is easy and straightforward).

Thank you for your comment.

Indeed, an idiom [a, b, c].include?(x) can be used as a substitute.
However, it has three problems:

  1. the word order is weird; x should appear first (at least, many
    people feel so)

I would understand that if it were [a, b, c].included? x
But it's include?, so the order seems just fine. Easy to read as
"does [a, b, c] include x?". Any other order would feel strange,
wouldn't it?

Also, an 'in?' method on Object has been proposed, so that you can write
x.in? [a, b, c]
That's very short, and fully object oriented pure Ruby, no syntactic
sugar necessary.

  2. the idiom is too long (even though it is often used)

I don't think Ruby method names are optimized according to usage
frequency. And where is "too long"? 'include?' is 8 characters. [If it
is really too long, what about maybe 'incl?'? (I don't like that
personally, but just in case.)]

  3. it is inefficient; a new array object is created every time

That's a problem for a good compiler/interpreter. There are many cases
in Ruby where similar things happen, and nevertheless, many people are
using Ruby. If it really needs to be fast, why not use C or so?

Also, x == a || x == b || x == c can be used. It has another problem:
it becomes very verbose when "x" is long.

 http_request.http_method == :get  ||
 http_request.http_method == :post ||
 http_request.http_method == :put  ||
 http_request.http_method == :delete

vs.

 http_request.http_method in :get, :post, :put, :delete

Well, [:get, :post, :put, :delete].include?(http_request.http_method)
is still available. Just use that.

As for the 'new array object created every time' problem, the best thing
here would be to define something like:

HTTP::REST_METHODS = [:get, :post, :put, :delete]

and then later just do:
HTTP::REST_METHODS.include?(http_request.http_method)
or so.

Another way to do it would be:
case http_request.http_method
when :get, :post, :put, :delete
...
end
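
Both alternatives mentioned here run as-is in Ruby. The sketch below uses a top-level constant `REST_METHODS` and the method names `rest_method?` and `rest_method_via_case?`, all of which are hypothetical stand-ins for the HTTP::REST_METHODS example:

```ruby
# Built once and frozen, so no new array is allocated per check.
REST_METHODS = [:get, :post, :put, :delete].freeze

def rest_method?(http_method)
  REST_METHODS.include?(http_method)
end

# The same test written as a case expression.
def rest_method_via_case?(http_method)
  case http_method
  when :get, :post, :put, :delete then true
  else false
  end
end

p rest_method?(:put)            # true
p rest_method_via_case?(:head)  # false
```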

What is worse is that we often want to write this kind of code.
If we rarely wrote this kind of code, I would think that new syntax
was not needed.

There are many other methods in Ruby that are used very often. Do we
want to create special syntax for all of them?

Regards, Martin.

--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp

=end

Actions #14

Updated by mame (Yusuke Endoh) about 14 years ago

=begin
Hi,

2010/10/8 "Martin J. Dürst" :

I would understand that if it were [a, b, c].included? x
But it's include?, so the order seems just fine. Easy to read as
"does [a, b, c] include x?". Any other order would feel strange, wouldn't
it?

I'm not talking about English, but a "subject" of a sentence.
Because I think that the "subject" of this sentence is "x", I want to
write "x" first, such as "is x included in [a, b, c]?".

Consider Python's join: '-'.join(["a", "b", "c"])
I think that it is awkward NOT because it is unnatural English word order,
but because an array (that is the "subject" of this sentence) appears
later.

Also, an 'in?' method on Object has been proposed, so that you can write
x.in? [a, b, c]
That's very short, and fully object oriented pure Ruby, no syntactic sugar
necessary.

Some people say that it is against OO. They say that Object class should
not have "in?" method because "in?" is not a property of Object.
Personally, I'm not against Object#in?, but I can also understand their
opinions.

  2) the idiom is too long (even though it is often used)

I don't think Ruby method names are optimized according to usage frequency.

Though there are many exceptions, Ruby certainly has a design principle
(called "akr theory" in [ruby-dev:33558]) that methods whose use is
encouraged should have short names.

An extreme example is [ruby-dev:33553]. matz once suggested String#sg,
a reformed version of String#gsub, though it was not committed.

Another way to do it would be:
   case http_request.http_method
   when :get, :post, :put, :delete
     ...
   end

I agree that case statement is a good idea. When I informally suggested
"in?" operator (on IRC or twitter), some people also suggested that I use
case statement, and I was satisfied once.

But there are still two problems: case cannot be postpositive, and it cannot
be used in else clauses (like "elsif").

An extreme example again: I heard that Sasada-san even created a patch for
postpositive case statement:

p "foo" case http_request.http_method when :get, :post, :put, :delete

I believe that this shows that many people suffer from the word order
problem, though "in" operator is much better than this syntax :-)

--
Yusuke Endoh

=end

Actions #15

Updated by duerst (Martin Dürst) about 14 years ago

=begin
Hello Yusuke,

On 2010/10/08 21:29, Yusuke ENDOH wrote:

Hi,

2010/10/8 "Martin J. Dürst":

I would understand that if it were [a, b, c].included? x
But it's include?, so the order seems just fine. Easy to read as
"does [a, b, c] include x?". Any other order would feel strange, wouldn't
it?

I'm not talking about English, but a "subject" of a sentence.
Because I think that the "subject" of this sentence is "x", I want to
write "x" first, such as "is x included in [a, b, c]?".

[a, b, c] includes x, not the other way round. Of course, you can change
the verb to passive voice (included) and make the former object (x) a
grammatical subject. But how do you expect people to deduce that you
think about it in the passive voice from the method name 'include'?

Consider Python's join: '-'.join(["a", "b", "c"])
I think that it is awkward NOT because it is unnatural English word order,
but because an array (that is the "subject" of this sentence) appears
later.

I think that you can both say "'-' joins the array" and "the array joins
itself with the '-'", so both Ruby and Python have a point. I think the
awkwardness (which I feel too) is mainly because we are used to Ruby,
not to Python.

Also, an 'in?' method on Object has been proposed, so that you can write
x.in? [a, b, c]
That's very short, and fully object oriented pure Ruby, no syntactic sugar
necessary.

Some people say that it is against OO. They say that Object class should
not have "in?" method because "in?" is not a property of Object.

Given that collections of various kinds are extremely important in
programming, it may not be too far-fetched to say that it's a property
of any Object in Ruby to be (potentially) included in a collection.

  2. the idiom is too long (even though it is often used)

I don't think Ruby method names are optimized according to usage frequency.

Though there are many exceptions, Ruby certainly has a design principle
("akr theory" called in [ruby-dev:33558]) that encouraged methods should
have short names.

Yes. Everything else being equal, that's a good policy to follow. But
Ruby naming doesn't go as far as dropping vowels and such the way Unix
commands do.

An extreme example is [ruby-dev:33553]. matz once suggested String#sg
that is a reformed version of String#gsub. Though it was not committed.

Good programmers know that program readability is important, and 'sg' is
definitely not readable.

Another way to do it would be:
case http_request.http_method
when :get, :post, :put, :delete
...
end

I agree that case statement is a good idea. When I informally suggested
"in?" operator (on IRC or twitter), some people also suggested that I use
case statement, and I was satisfied once.

But there are still two problems: case cannot be postpositive, and it cannot
be used in else clauses (like "elsif").

An extreme example again: I heard that Sasada-san even created a patch for
postpositive case statement:

p "foo" case http_request.http_method when :get, :post, :put, :delete

I believe that this shows that many people suffer from the word order
problem, though "in" operator is much better than this syntax :-)

That case statement would in my view not be as bad as the 'in' proposal.
The case statement just completes the postpositive versions of
if/unless/while. The 'in' is a totally new construction.

Regards, Martin.

--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp

=end

Actions #16

Updated by mame (Yusuke Endoh) about 14 years ago

=begin
Hi,

2010/10/9 "Martin J. Dürst" :

[a, b, c] includes x, not the other way round. Of course, you can change the
verb to passive voice (included) and make the former object (x) a
grammatical subject. But how do you expect people to deduce that you think
about it in the passive voice from the method name 'include'?

More straightforwardly (for me), I will write:

"is x (in) a, b or c?" 「x は a, b, c のいずれか?」

When we want to check whether http_method is :get or :post, is it natural
for an English speaker to say:

"do :get or :post include the http_method ?"

?

I think that you can both say "'-' joins the array" and "the array joins
itself with the '-'", so both Ruby and Python have a point. I think the
awkwardness (which I feel too) is mainly because we are used to Ruby, not to
Python.

I admit it is one of the reasons of awkwardness. But I still think the
biggest reason is the word order. I'm curious to know whether

":get == http_method or :post == http_method"

is more natural (or equal) for you than

"http_method == :get or http_method == :post"

.

An extreme example again: I heard that Sasada-san even created a patch for
postpositive case statement:

  p "foo" case http_request.http_method when :get, :post, :put, :delete

I believe that this shows that many people suffer from the word order
problem, though "in" operator is much better than this syntax :-)

That case statement would in my view not be as bad as the 'in' proposal. The
case statement just completes the postpositive versions of if/unless/while.
The 'in' is a totally new construction.

That approach requires a new keyword "elscase". Also, "if" and "case" are
too exclusive; it is cumbersome to combine a normal condition with a case
condition, like this:

if (x in a, b, c) && y == 1
...
end

I believe that "case" statement is not originally designed for such a use
case. Rather, the intended use case of "case" statement is to jump
execution to multiple "when" clauses.
I think it is better to introduce a new construction than to force to
extend "case" statement.
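
For comparison, this is roughly what combining a case-based membership test with an ordinary condition looks like in Ruby without the proposed operator. It is only a sketch; the method name `matches?` and the candidate symbols are hypothetical:

```ruby
# Rough present-day equivalent of `(x in :a, :b, :c) && y == 1`.
def matches?(x, y)
  in_list = case x
            when :a, :b, :c then true
            else false
            end
  in_list && y == 1
end

p matches?(:b, 1)  # true
p matches?(:b, 2)  # false
p matches?(:z, 1)  # false
```

The case expression has to be evaluated into a value first, which is the extra ceremony the proposed operator is meant to remove.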

--
Yusuke Endoh

=end

Updated by adgar (Michael Edgar) over 13 years ago

I personally believe "in" belongs as an operator: it should match natural, mathematical set-inclusion notation, and it should invoke include?.

Many have discussed how it is just as possible to write "does S include x" as well as "is x in S": especially in English, there are many ways of writing things. As there are in Ruby! This should not distract us from why the idea has been proposed in the first place.

In mathematics, we very rarely write "Set S includes x", let alone as a predicate. Instead, we write (in LaTeX) x \in S. This is because most commonly the focus of this predicate is the element in question: is it in the set or not? This is a property of the set, but the focus of discourse is the (potential) element. The Ruby method belongs on the set, OO-speaking, because it is in charge of the information involved. But it is not unreasonable to note that writing S.include?(x) introduces a mismatch with how computer scientists typically consider such questions.

Ruby's OO syntax does not naturally allow us to express this "foo in S" idea, in that order, without adding a method to all potential elements. I don't believe introducing a new method, .in?, is a good idea. There should be no reason to introduce a misleading method name that suggests an element might know what sets it is in. Instead, I think that introducing "in" as a syntactic construct (much like for loop syntax) is appropriate. Since Ruby already has an idiomatic inclusion method name, include?, I believe it should invoke that, right to left (foo in bar means bar.include?(foo)). Here, the parallel with for loop syntax becomes clearer.

For loops address the same concern: it makes sense to write, in English, "array, each of your elements, x, should do this" just as it makes sense to say "for each x in the array, do this". OO-style invocation supports the former, but not the latter, and so we have a syntax which reverses the order: in for loops, the receiver comes after the variable names it uses. This supports a different, natural way to describe loops. Just like "in", it works by using a convention-based method name. For loops in Ruby are maligned primarily due to potentially-surprising scoping issues, but their syntax itself is subjectively attractive.
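
The parallel with for loops can be seen in a short Ruby sketch. The class `Countdown` and the method names are hypothetical; the point is that `for` works on any object that defines `each`, and that its loop variable stays visible after the loop, unlike a block parameter:

```ruby
# Any object with an each method can be the target of a for loop.
class Countdown
  def each
    yield 2
    yield 1
  end
end

def for_over_custom_each
  seen = []
  for n in Countdown.new
    seen << n
  end
  seen
end

def loop_variable_after_for
  for x in [1, 2, 3]
    # body intentionally empty
  end
  x  # the loop variable is still visible here
end

def loop_variable_after_each
  [1, 2, 3].each { |y| }
  defined?(y)  # nil: the block parameter is local to the block
end

p for_over_custom_each      # [2, 1]
p loop_variable_after_for   # 3
p loop_variable_after_each  # nil
```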

I have attached a patch which incorporates this approach. It includes the appropriate ripper event(s).

Updated by matz (Yukihiro Matsumoto) over 12 years ago

  • Description updated (diff)
  • Status changed from Assigned to Rejected

This proposal is only for cosmetics.
I don't want a new operator that does not introduce something new.

Matz.
