Feature #20770
open
A *new* pipe operator proposal
Description
Hello,
This is my first contribution here. I have seen previous discussions around introducing a pipe operator, but it seems the community didn't reach a consensus. I would like to revisit this idea with a simpler approach, more of a syntactic sugar that aligns with how other languages implement the pipe operator, but without making significant changes to Ruby's syntax.
Currently, we often write code like this:
value = half(square(add(value, 3)))
We can achieve the same result using the `then` method:
value = value.then { add(_1, 3) }.then { square(_1) }.then { half(_1) }
While `then` helps with readability, we can simplify it further using the proposed pipe operator:
value = add(value, 3) |> square(_1) |> half(_1)
Moreover, with the upcoming `it` feature in Ruby 3.4 (#18980), the code could look even cleaner:
value = add(value, 3) |> square(it) |> half(it)
This proposal uses the anonymous block argument (`_1`), and with `it`, it simplifies the code without introducing complex syntax changes. It would allow us to achieve the same results as in other languages that support pipe operators, but in a way that feels natural to Ruby, using existing constructs like `then` underneath.
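As a minimal sketch of the equivalence described above: `add`, `square`, and `half` are hypothetical helpers (the proposal only names them), and the last method shows what the proposed `|>` form is assumed to expand to, written with today's `then`. It uses `_1` so that it runs on Ruby versions without `it`.

```ruby
# Hypothetical helpers; the proposal names them but does not define them.
def add(a, b)
  a + b
end

def square(x)
  x * x
end

def half(x)
  x / 2.0
end

# Nested calls, read inside-out.
def nested_style(value)
  half(square(add(value, 3)))
end

# The same steps with Kernel#then, read left to right.
def then_style(value)
  value.then { add(_1, 3) }.then { square(_1) }.then { half(_1) }
end

# Assumed expansion of `add(value, 3) |> square(_1) |> half(_1)`.
def assumed_pipe_expansion(value)
  add(value, 3).then { square(_1) }.then { half(_1) }
end

p nested_style(5)           # => 32.0
p then_style(5)             # => 32.0
p assumed_pipe_expansion(5) # => 32.0
```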
I believe this operator would enhance code readability and maintainability, especially in cases where multiple operations are chained together.
Thank you for considering this proposal!
Updated by nobu (Nobuyoshi Nakada) about 1 month ago
- Tracker changed from Bug to Feature
- ruby -v deleted (3.3.5)
- Backport deleted (3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN)
In the previous trial syntax, the receiver of RHS was the result of LHS.
In your proposal, the receiver of RHS is the same as LHS, and the LHS result is passed as an implicit argument?
Updated by AlexandreMagro (Alexandre Magro) about 1 month ago
nobu (Nobuyoshi Nakada) wrote in #note-1:
In the previous trial syntax, the receiver of RHS was the result of LHS.
In your proposal, the receiver of RHS is the same as LHS, and the LHS result is passed as an implicit argument?
Exactly, this is the expected behavior of the pipe operator in other functional languages, such as Elixir. In those languages, the left-hand side (LHS) value is passed directly as an argument to the function on the right-hand side (RHS), either as the first or last argument depending on the language. For example, in Elixir, you might write:
value = value |> add(3) |> square() |> half()
My proposal for Ruby offers a more flexible approach. The LHS value can be passed as an explicit argument (using `_1` or `it`), allowing for greater control over how the RHS function handles the received value.
Additionally, this approach simplifies the implementation by treating the RHS as an executable block, just as we already do with `.then`.
Updated by shuber (Sean Huber) about 1 month ago
I would still love to see this type of pipeline functionality implemented with plain expressions instead of new operators.
I have this (old) working proof of concept gem from years ago (basic syntax described below) but it was primarily focused on constant interception. I imagine it can be quite a bit more complex adding support for calling Proc objects and other edge cases.
"https://api.github.com/repos/ruby/ruby".pipe do
  URI.parse
  Net::HTTP.get
  JSON.parse.fetch("stargazers_count")
  yield_self { |n| "Ruby has #{n} stars" }
  Kernel.puts
end
#=> Ruby has 22120 stars
-9.pipe { abs; Math.sqrt; to_i } #=> 3
[9, 64].map(&Math.pipe.sqrt.to_i.to_s) #=> ["3", "8"]
Most of the logic in that proof of concept was related to intercepting method calls to ALL constants which wouldn't be necessary if it was a core part of the language. The actual "pipeline" functionality (`PipeOperator::Pipe` and `PipeOperator::Closure`) is pretty simple - basically just keeping an array of constant+method+args calls and `reduce`ing the result when the pipeline ends.
The proof of concept is basically `prepend`ing a version of every method in every constant with something like the example below in order to support this "pipeline expressions" syntax:
define_method(method) do |*args, &block|
  if Pipe.open
    Pipe.new(self).__send__(method, *args, &block)
  else
    super(*args, &block)
  end
end
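The "array of calls plus `reduce`" idea described above can be sketched roughly as follows. This is not the gem's code: `Step` and `run_pipeline` are hypothetical names, and the sketch assumes every step receives the running value as its first argument.

```ruby
# Hypothetical record of one recorded call (constant + method + extra args).
# The gem's real classes are PipeOperator::Pipe and PipeOperator::Closure;
# their internals are not reproduced here.
Step = Struct.new(:receiver, :method_name, :args)

# Fold the recorded calls over the initial value, passing the running value
# as the first argument of each call.
def run_pipeline(initial, steps)
  steps.reduce(initial) do |value, step|
    step.receiver.public_send(step.method_name, value, *step.args)
  end
end

STEPS = [
  Step.new(Math, :sqrt, []),      # Math.sqrt(9)       => 3.0
  Step.new(Kernel, :Integer, []), # Kernel.Integer(3.0) => 3
  Step.new(Array, :new, ["x"])    # Array.new(3, "x")
].freeze

p run_pipeline(9, STEPS) # => ["x", "x", "x"]
```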
Updated by bkuhlmann (Brooke Kuhlmann) about 1 month ago
For background, this has been discussed before:
- 15799: This was implemented and then reverted.
- 20580: This recently popped up as well.
- There are probably other issues that I'm forgetting about that have been logged on this subject.
Introducing `|>` as an operator that works like `#then` would be interesting and would be similar to how Elixir works, as Alexandre mentioned. This is also how Elm works where you can elegantly use `|>` or `<|` as mentioned in the Operators documentation.
I also use something similar to how Sean uses a `#pipe` method with a block but mostly by refining the `Symbol` class as documented here in my Refinements gem.
Also, similar to what Sean is describing, I provide the ability to pipe commands together without using `|>` by using my Pipeable gem which builds upon native function composition to nice effect. Here's a snippet:
pipe data,
  check(/Book.+Price/, :match?),
  :parse,
  map { |item| "#{item[:book]}: #{item[:price]}" }
In both cases (refining `Symbol` or using Pipeable), the solution works well and implements what is described here, just by different means. All of these solutions are fairly performant, but it would be neat if performance could be improved further by optimizing them natively in Ruby.
Updated by AlexandreMagro (Alexandre Magro) about 1 month ago
bkuhlmann (Brooke Kuhlmann) wrote in #note-4:
For background, this has been discussed before:
- 15799: This was implemented and then reverted.
- 20580: This recently popped up as well.
- There are probably other issues that I'm forgetting about that have been logged on this subject.
Introducing `|>` as an operator that works like `#then` would be interesting and would be similar to how Elixir works, as Alexandre mentioned. This is also how Elm works where you can elegantly use `|>` or `<|` as mentioned in the Operators documentation.
I also use something similar to how Sean uses a `#pipe` method with a block but mostly by refining the `Symbol` class as documented here in my Refinements gem.
Also, similar to what Sean is describing, I provide the ability to pipe commands together without using `|>` by using my Pipeable gem which builds upon native function composition to nice effect. Here's a snippet:
pipe data,
  check(/Book.+Price/, :match?),
  :parse,
  map { |item| "#{item[:book]}: #{item[:price]}" }
In both cases (refining `Symbol` or using Pipeable), the solution works great and provides and implements what is described here using different solutions. All solutions are fairly performant but would be neat if the performance could be improved further if there was a way to optimize these solutions natively in Ruby.
One issue with `.pipe` is that it mixes two approaches: the object method chain (`lhs.rhs`) and passing the result as an argument (`rhs(lhs)`). This inconsistency can be a bit confusing because it shifts between the two styles, making it harder to follow the flow.
In the `.pipe` version:
"https://api.github.com/repos/ruby/ruby".pipe do
  URI.parse
  Net::HTTP.get
  JSON.parse.fetch("stargazers_count")
  yield_self { |n| "Ruby has #{n} stars" }
  Kernel.puts
end
With a pipe operator, we can achieve the same result in a more consistent and readable way:
"https://api.github.com/repos/ruby/ruby"
|> URI.parse(it)
|> Net::HTTP.get(it)
|> JSON.parse(it).fetch("stargazers_count")
|> puts "Ruby has #{_1} stars"
This keeps the flow of passing the result from one step to the next clear and consistent, making the code easier to read and maintain. The pipe operator doesn’t add any extra complexity to method calls and provides more flexibility regarding how the "piped" value is used, making it feel more natural in the Ruby syntax.
Updated by vo.x (Vit Ondruch) about 1 month ago · Edited
Code like `add(value, 3)` is hardly idiomatic Ruby. If it were Ruby, you'd likely use `value.add(3)` or `value + 3`. Other examples of readable code are here. I can't see what is readable about the new operator.
Also, I'd say that the `Math` module is a bad example in general, because it seems to be influenced by commonly used math notation. But arguably, having something like `Math::PI.cos` or `3.14.cos` would be quite natural for Ruby.
Updated by AlexandreMagro (Alexandre Magro) about 1 month ago
vo.x (Vit Ondruch) wrote in #note-6:
Code like `add(value, 3)` is hardly some idiomatic Ruby. If it was Ruby, then you'd likely use `value.add(3)` or `value + 3`. Other examples of readable code are here. I can't see what is readable about the new operator.
Also, I'd say that `Math` module is bad example in general, because it seems to be influenced by commonly used math notation. But arguably, having something like `Math::PI.cos` or `3.14.cos` would be quite natural for Ruby.
I believe there’s a misunderstanding here. The example `add(value, 3)` is not intended to represent an idiomatic Ruby expression, like `value + 3`. Rather, it illustrates how a method call that modifies or processes a value would work within a pipeline.
Using the pipe operator is helpful for showing the order of execution. For example, if you want to execute a function `f` followed by `g`, you could write:
g(f(x))
However, it's easier to follow the order of execution (e.g., `f` and then `g`) when written like this:
x |> f |> g
In real-world scenarios, especially when working with APIs or complex transformations, it's common to prepare data step by step before reaching the final function. Instead of using intermediate variables, which might only be used once, the pipe operator offers a clearer and more efficient solution. For instance, consider fetching and processing data from a client API:
response = URI.parse(client_api_url)
response = Net::HTTP.get(response)
response = JSON.parse(response).fetch("client_data")
puts "Client info: #{response}"
With the pipe operator, the same logic can be simplified and made more readable:
client_api_url
|> URI.parse(it)
|> Net::HTTP.get(it)
|> JSON.parse(it).fetch(important_key)
This approach not only avoids unnecessary variables but also makes the flow of data through the pipeline much clearer. The pipe operator simplifies this pattern and ensures readability, without adding complexity to method calls. It also provides flexibility in how the "passed" value is used throughout the steps.
Again, these are simplified examples of real-world problems, where the pipe operator can help streamline and clarify otherwise convoluted method chains.
Updated by ufuk (Ufuk Kayserilioglu) about 1 month ago
AlexandreMagro (Alexandre Magro) wrote in #note-7:
With the pipe operator, the same logic can be simplified and made more readable:
client_api_url
|> URI.parse(it)
|> Net::HTTP.get(it)
|> JSON.parse(it).fetch(important_key)
I would like to note that this almost works already today:
irb> client_api_url = "https://jsonplaceholder.typicode.com/posts/1"
#=> "https://jsonplaceholder.typicode.com/posts/1"
irb> pipeline = URI.method(:parse) >> Net::HTTP.method(:get) >> JSON.method(:parse)
#=> #<Proc:0x000000012c62b4e8 (lambda)>
irb> pipeline.call(client_api_url)
#=>
{"userId"=>1,
"id"=>1,
"title"=>"sunt aut facere repellat provident occaecati excepturi optio reprehenderit",
"body"=>
"quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"}
irb> pipeline = URI.method(:parse) >> Net::HTTP.method(:get) >> JSON.method(:parse) >> -> { it.fetch("title") }
#=> #<Proc:0x000000012c4c2778 (lambda)>
irb> pipeline.call(client_api_url)
#=> "sunt aut facere repellat provident occaecati excepturi optio reprehenderit"
You can also make the whole pipeline with just using procs:
(-> { URI.parse(it) } >> -> { Net::HTTP.get(it) } >> -> { JSON.parse(it) } >> -> { it.fetch("title") }).call(client_api_url)
#=> "sunt aut facere repellat provident occaecati excepturi optio reprehenderit"
which is much closer to the syntax that you want, except for the lambda wrappers.
I think with `Proc#>>` and `Proc#<<` this need for chaining is mostly in place already. The thing that is really missing is the ability to access a method by name without having to do `.method(:name)`, which was proposed in https://bugs.ruby-lang.org/issues/16264. That proposal would make the first example be:
(URI.:parse >> Net::HTTP.:get >> JSON.:parse >> -> { it.fetch("title") }).call(client_api_url)
#=> "sunt aut facere repellat provident occaecati excepturi optio reprehenderit"
which looks much nicer.
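For readers who want to try the composition operators without a network call, here is a small offline sketch. The JSON payload and the constant names `TITLE_OF` and `SHOUT_TITLE` are made up for illustration; `Method#>>`, `Proc#<<`, and `Symbol#to_proc` are existing core methods.

```ruby
require "json"

# Method#>> composes left to right and returns a lambda:
# parse first, then fetch the "title" key.
TITLE_OF = JSON.method(:parse) >> ->(doc) { doc.fetch("title") }

# Proc#<< composes right to left: the right-hand callable runs first.
SHOUT_TITLE = :upcase.to_proc << TITLE_OF

payload = '{"id": 1, "title": "sunt aut facere"}'
p TITLE_OF.call(payload)    # => "sunt aut facere"
p SHOUT_TITLE.call(payload) # => "SUNT AUT FACERE"
```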
Updated by AlexandreMagro (Alexandre Magro) about 1 month ago
ufuk (Ufuk Kayserilioglu) wrote in #note-8:
You can also make the whole pipeline with just using procs:
(-> { URI.parse(it) } >> -> { Net::HTTP.get(it) } >> -> { JSON.parse(it) } >> -> { it.fetch("title") }).call(client_api_url)
#=> "sunt aut facere repellat provident occaecati excepturi optio reprehenderit"
Yes, and it's also possible to achieve this with a chain of `.then`, which results in a similar structure. The idea of the pipe operator is to be syntactic sugar, bringing functionality from functional languages into Ruby without introducing any complexity, while maintaining Ruby's simplicity.
client_api_url
.then { URI.parse(it) }
.then { Net::HTTP.get(it) }
.then { JSON.parse(it).fetch(important_key) }
Updated by jeremyevans0 (Jeremy Evans) about 1 month ago
AlexandreMagro (Alexandre Magro) wrote in #note-9:
Yes, and it's also possible to achieve this with a chain of `.then`, which results in a similar structure. The idea of the pipe operator is to be syntactic sugar, bringing functionality from functional languages into Ruby without introducing any complexity, while maintaining ruby's simplicity.
client_api_url
.then { URI.parse(it) }
.then { Net::HTTP.get(it) }
.then { JSON.parse(it).fetch(important_key) }
We could expand the syntax to treat `.{}` as `.then{}`, similar to how `.()` is `.call()`. With that, you could do:
client_api_url
.{ URI.parse(it) }
.{ Net::HTTP.get(it) }
.{ JSON.parse(it).fetch(important_key) }
Which is almost as low a syntactic overhead as you would want.
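For reference, the existing `.()` shorthand that this suggestion is modelled on can be tried today. In this sketch `DOUBLE` is a made-up lambda, and the last line shows the current `then` spelling that a hypothetical `.{ ... }` would abbreviate.

```ruby
DOUBLE = ->(x) { x * 2 }

p DOUBLE.call(21) # => 42
p DOUBLE.(21)     # => 42, `.()` is existing shorthand for `.call()`

# `.{ ... }` is NOT valid Ruby today; the closest existing spelling is:
p(21.then { DOUBLE.(_1) }) # => 42
```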
Note that we are still in a syntax moratorium, so it's probably better to wait until after that is over and we have crowned the one true parser before seriously considering new syntax.
Updated by AlexandreMagro (Alexandre Magro) about 1 month ago
jeremyevans0 (Jeremy Evans) wrote in #note-10:
We could expand the syntax to treat `.{}` as `.then{}`, similar to how `.()` is `.call()`. With that, you could do:
client_api_url
.{ URI.parse(it) }
.{ Net::HTTP.get(it) }
.{ JSON.parse(it).fetch(important_key) }
Which is almost as low of a syntactic overhead as you would want.
Note that we are still in a syntax moratorium, so it's probably better to wait until after that is over and we have crowned the one true parser before seriously considering new syntax.
The idea of using `.{}` is really creative, but it feels somewhat unintuitive. On the other hand, the pipe operator is a well-established concept, which would ease adoption.
Updated by ko1 (Koichi Sasada) about 1 month ago
FYI: https://github.com/tc39/proposal-pipeline-operator is a similar idea.
Updated by mame (Yusuke Endoh) about 1 month ago
When a pipeline operator was proposed previously (#15799), we briefly spoke of the idea of a block notation without a closing bracket (the meeting log).
For example,
add(value, 3).then do |x|> square(x)
is interpreted as:
add(value, 3).then {|x| square(x) }
However, this notation is a bit outlandish, so it was never taken very seriously.
Reconsidering it with the notation proposed in this ticket:
add(value, 3).then |> square(it).then |> half(it)
is handled as:
add(value, 3).then { square(it).then { half(it) } } # Or:
add(value, 3).then { square(it) }.then { half(it) } # depending on the associativity of |>. I am not sure which is better
It might be a good idea that we specialize this notation only for a block that is so simple that we don't need to name the parameters.
But personally, I also feel that:
value = add(value, 3)
value = square(value)
value = half(value)
is good enough.
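To make the associativity question above concrete, here is a small sketch of the two candidate expansions written with today's `then`. The helpers are hypothetical, and explicit block parameters are used because `_1` cannot be reused in a nested block. For pure functions both shapes give the same result; the practical difference is that the nested form keeps earlier values in scope.

```ruby
# Hypothetical helpers, as in the original proposal.
def add(a, b); a + b; end
def square(x); x * x; end
def half(x); x / 2.0; end

# Left-associative reading: a flat chain of blocks.
def chained(value)
  add(value, 3).then { |x| square(x) }.then { |y| half(y) }
end

# Right-associative reading: each later step is nested in the previous block.
def nested(value)
  add(value, 3).then { |x| square(x).then { |y| half(y) } }
end

# Only the nested form can still refer to an earlier intermediate value (x).
def nested_using_outer(value)
  add(value, 3).then { |x| square(x).then { |y| [x, half(y)] } }
end

p chained(5)            # => 32.0
p nested(5)             # => 32.0
p nested_using_outer(5) # => [8, 32.0]
```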
Updated by vo.x (Vit Ondruch) about 1 month ago
AlexandreMagro (Alexandre Magro) wrote in #note-7:
To me it just demonstrates that the APIs are likely incomplete and don't provide methods for easy conversion. We have a lot of conversion methods such as `#to_str`, `#to_json`, ... But there is no implicit transition from having e.g. a `String` object to a `URI`. I'd rather see something like `client_api_url.to(URI)`, which could be equivalent to `URI(client_api_url)`.
I also like the example provided by @ufuk (Ufuk Kayserilioglu)
Updated by vo.x (Vit Ondruch) about 1 month ago
Not to mention, the example ignores error handling, which would IMHO be the biggest problem in a real-life example.
Updated by zverok (Victor Shepelev) about 1 month ago
We could expand the syntax to treat `.{}` as `.then{}`, similar to how `.()` is `.call()`.
I really like this idea. Yes, it is not “how it is in other languages” yet it has a deep internal consistency with other language elements and is easy to understand, both for people and for automatic analysis tools, with no ambiguities about what’s allowed as a step of such a “pipeline” and what’s not, what’s the scope of used names, where the expression ends and so on.
This is awesome, actually.
Updated by zverok (Victor Shepelev) about 1 month ago
vo.x (Vit Ondruch) wrote in #note-14:
AlexandreMagro (Alexandre Magro) wrote in #note-7:
To me it just demonstrates that the APIs are likely incomplete and don't provide methods for easy conversion. We have a lot of conversion methods such as `#to_str`, `#to_json`, ... But there is no implicit transition from having e.g. `String` object to `URI`. I'd rather see something like `client_api_url.to(URI)` which could be equivalent of `URI(client_api_url)`.
I don’t think it is realistic, generally. I mean, convert every `f(g(x))` to “`x` should have method `g`, and the result should have method `f`, so you can write `x.g.f` always (or in most widespread situations)”.
Many possible cases can be argued about, but 1) the argument would not necessarily demonstrate that API change is reasonable, and 2) even when reasonable, it is not always possible.
Say, if we take the sequence that is mentioned several times already (string → URL → HTTP get → JSON parse), then both concerns apply:
- `String#to_url` (or `String#to(URL)`) might be reasonable; `HTTPResponse#parse_json` ... maybe too; but `URL#http_get`?.. Not everybody would agree.
- Even if agreeing on adding all those methods in principle, what about using a different HTTP library or a different JSON parser, what would the answer be?.. Something like `URL#http_get(with: Typhoeus)` or `URL#typhoeus_get` added for every library? Adding local refinements to handle that depending on the library? What if the HTTP library used depends on dynamic parameters?..
So, while I agree that many APIs in Ruby have an intuition of the “object at hand has all the methods you need for the next step”, in large realistic codebases, it is not so (both technically and ideologically), and `then { DifferentDomain.handle(it) }` is a very widespread way to mitigate that.
Updated by vo.x (Vit Ondruch) about 1 month ago
zverok (Victor Shepelev) wrote in #note-17:
I don’t think it is realistic, generally. I mean, convert every `f(g(x))` to “`x` should have method `g`, and the result should have method `f`, so you can write `x.g.f` always (or in most widespread situations)”.
Right, this was far-fetched and admittedly would not work. But that is why I proposed `client_api_url.to(URI)`, because after all, this is IMHO mostly about type conversion. Why would I ever want to call something like `URI.parse(it)`? Why would I need to know there is a `parse` method, and why would I need to put `it` / `_1` multiple times everywhere and every time in a different context?
Updated by AlexandreMagro (Alexandre Magro) about 1 month ago
vo.x (Vit Ondruch) wrote in #note-18:
Right, this was far fetched and would not work admittedly. But that is why I proposed the `client_api_url.to(URI)`, because after all, this is IMHO mostly about type conversion. Why would I ever want to call something like `URI.parse(it)`? Why would I need to know there is `parse` method and why would I need to put `it` / `_1` multiple times everywhere and every time in different context.
Zverok was precise in his comment.
I understand your point, but the idea of `to(URI)` introduces an inversion of responsibility, which can lead to dependency inversion issues, a poor practice in software design, especially when working with different libraries.
It's unclear what you mean by `client_api_url` in this context since, in my example, it was simply a string. Having a `.to` method on a string seems generic and nonsensical.
As for the question "Why would I ever want to call something like `URI.parse(it)`?", code is already written this way. The pipe operator doesn’t change the syntax but rather inverts the reading flow.
Lastly, the pipe operator is a well-established concept that aims to streamline existing Ruby syntax, not alter it.
client_api_url
|> URI.parse(it)
|> Net::HTTP.get(it)
|> JSON.parse(it).fetch(important_key)
This is so clean. It's just Ruby.
Updated by Dan0042 (Daniel DeLorme) about 1 month ago
I'm not a big fan of this pipe operator idea, but at least the idea of using `it` is a good one; it solves many problems with previous proposals.
foo = 42
1 |> foo |> BAR
#foo should be localvar but somehow is parsed as method here?
#BAR should be constant but somehow is parsed as method here?
1 |> foo(it) |> BAR(it)
#at least foo and BAR are recognizably methods
1 |> foo(it, 2)
2 |> foo(1, it)
hash |> BAR(**it)
#also, it allows flexibility in how the argument is passed
But that being said, this doesn't seem to be so useful to me. If we compare "before" and "after" the pipe operator:
#current
client_api_url
.then{ URI.parse(it) }
.then{ Net::HTTP.get(it) }
.then{ JSON.parse(it).fetch(important_key) }
#with |> syntax sugar
client_api_url
|> URI.parse(it)
|> Net::HTTP.get(it)
|> JSON.parse(it).fetch(important_key)
It really doesn't seem to me that readability is increased in any meaningful way. The benefit seems way too low to justify adding new syntax.
Languages with the pipe operator all have first-class functions (afaik); the two kinda go together. But Ruby doesn't have first-class functions so the usefulness of the pipe operator will inevitably be very limited.
If the pipe operator is introduced I think it should behave similarly to other languages, where the RHS is a callable object. In fact if we define the pipe operator as invoking #call or #bind_call on the RHS, I could see the beginning of a feature that is more useful than just syntax sugar.
str |> JSON.method(:parse)
1 |> Object.instance_method(:to_s) #=> "#<Integer:0x0000000000000003>"
#and now we just need nice shorthands for Mod.method(:name) and Mod.instance_method(:name) ;-)
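The `#call` / `#bind_call` idea above can be approximated today with a helper. This is only a sketch: `pipe_call` is a hypothetical name, and the rule it applies (unbound methods are bound to the piped value, everything else is called with it) is an assumption about how such an operator might dispatch.

```ruby
require "json"

# Hypothetical stand-in for `value |> callable`.
def pipe_call(value, callable)
  if callable.is_a?(UnboundMethod)
    callable.bind_call(value) # the piped value becomes the receiver
  else
    callable.call(value)      # the piped value becomes the argument
  end
end

p pipe_call('{"a": 1}', JSON.method(:parse)).fetch("a") # => 1
p pipe_call(-9, Integer.instance_method(:abs))          # => 9

# Prints the generic Object-style representation, e.g. "#<Integer:0x...>".
p pipe_call(1, Object.instance_method(:to_s))
```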
Updated by ufuk (Ufuk Kayserilioglu) about 1 month ago
I tend to agree with @Dan0042 (Daniel DeLorme) on this one, this seems to go against the nature of Ruby. In Ruby, an expression like `URI.parse(it)` is always eagerly evaluated, except when it is inside a block. This is not true in other languages; ones that make a distinction between `Foo.bar` and `Foo.bar()`, for example. This proposal, however, is adding a new conceptual context in which the evaluation would be delayed, which would be in a sequence of pipeline operators. I am not sure if I like that, to be honest.
In contrast, I like @jeremyevans0 (Jeremy Evans)'s suggestion to add syntactic sugar for the `.then` method in the form of `.{}`, which still keeps the block as the only construct that would delay the evaluation of methods, and it allows the use of numbered block parameters and/or `it` inside such blocks without any other changes to the language.
Updated by austin (Austin Ziegler) about 1 month ago
I think that this is one of the more interesting approaches to a pipeline operator in Ruby as it is just syntax sugar. As I am understanding it:
foo
|> bar(_1, baz)
|> hoge(_1, quux)
would be treated by the parser to be the same as:
foo
.then { bar(_1, baz) }
.then { hoge(_1, quux) }
It would be nice (given that there is syntax sugaring happening here) if, when `it` or `_1` is missing, it were implicitly inserted as the first parameter:
foo
|> bar(baz)
|> hoge(quux)
==
foo
.then { bar(_1, baz) }
.then { hoge(_1, quux) }
This would enable the use of callables (procs and un/bound methods) as suggested by @Dan0042 (Daniel DeLorme) in #note-20.
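The implicit-first-parameter behaviour suggested above can be imitated with a helper, purely as an illustration. `pipe_into`, `bar`, and `hoge` are hypothetical; each step is written as `[method_name, *extra_args]` and the piped value is inserted as the first argument.

```ruby
# Hypothetical step functions.
def bar(x, y); x + y; end
def hoge(x, y); x * y; end

# Hypothetical helper: apply each [name, *args] step, inserting the running
# value as the first argument, Elixir-style.
def pipe_into(value, *steps)
  steps.reduce(value) do |acc, (name, *args)|
    send(name, acc, *args)
  end
end

p pipe_into(1, [:bar, 2], [:hoge, 10])           # => 30
p(1.then { bar(_1, 2) }.then { hoge(_1, 10) })   # => 30, the explicit form
```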
I am not sure that, without that implicit first parameter, the potential confusion introduced by the differently-shaped blocks is worthwhile. Regardless, as someone who maintains libraries with deep compatibility, I won't be able to use this in those for another decade at least (I still haven't released versions of my most used libraries that are 3.x only), by which time I am hoping to have found someone else to maintain them.
vo.x (Vit Ondruch) wrote in #note-18:
[the pipe operator] is IMHO mostly about type conversion
Having used Elixir heavily for the last seven years, I do not agree with this description. It can be, and the examples in question might be, but it's used equally in transformation (type conversion) and in context passing. `Plug` (more or less the Elixir equivalent to Rack) is composable because the first parameter to every plug function (whether a `function/2` or a module with `init/1` and `call/2`) is a `Plug.Conn` struct, allowing code like this:
def call(conn, %Config{} = config) do
  {metadata, span_context} =
    start_span(:plug, %{conn: conn, options: Config.telemetry_context(config)})
  conn =
    register_before_send(conn, fn conn ->
      stop_span(span_context, Map.put(metadata, :conn, conn))
      conn
    end)
  results =
    conn
    |> verify_request_headers(config)
    |> Map.new()
  conn
  |> put_private(config.name, results)
  |> dispatch_results(config)
  |> dispatch_on_resolution(config.on_resolution)
end
This is no different than:
def call(conn, %Config{} = config) do
  {metadata, span_context} =
    start_span(:plug, %{conn: conn, options: Config.telemetry_context(config)})
  conn =
    register_before_send(conn, fn conn ->
      stop_span(span_context, Map.put(metadata, :conn, conn))
      conn
    end)
  results = verify_request_headers(conn, config)
  results = Map.new(results)
  conn = put_private(conn, config.name, results)
  conn = dispatch_results(conn, config)
  dispatch_on_resolution(conn, config.on_resolution)
end
I find the former much more readable, because it's more data oriented and indicates that the data flows through the pipe, where it might be transformed (`conn |> verify_request_headers(…) |> Map.new()`) or it might just be modifying the input parameter (`conn |> put_private(…) |> dispatch_results(…) |> dispatch_on_resolution(…)`).
jeremyevans0 (Jeremy Evans) wrote in #note-10:
We could expand the syntax to treat `.{}` as `.then{}`, similar to how `.()` is `.call()`. With that, you could do:
client_api_url
.{ URI.parse(it) }
.{ Net::HTTP.get(it) }
.{ JSON.parse(it).fetch(important_key) }
Which is almost as low of a syntactic overhead as you would want.
Note that we are still in a syntax moratorium, so it's probably better to wait until after that is over and we have crowned the one true parser before seriously considering new syntax.
This is … interesting. The biggest problem with it (from my perspective) is that it would privilege `{}` blocks with this form, because `do` is a valid method name, so `.do URI.parse(it) end` would likely be a syntax error. That and the fact that it would be nearly a decade before it could be used by my libraries.
Updated by AlexandreMagro (Alexandre Magro) about 1 month ago
ufuk (Ufuk Kayserilioglu) wrote in #note-21:
I tend to agree with @Dan0042 (Daniel DeLorme) on this one, this seems to go against the nature of Ruby. In Ruby, an expression like `URI.parse(it)` is always eagerly evaluated, except when it is inside a block. This is not true in other languages; ones that make a distinction between `Foo.bar` and `Foo.bar()`, for example. This proposal, however, is adding a new conceptual context in which the evaluation would be delayed, which would be in a sequence of pipeline operators. I am not sure if I like that, to be honest.
Actually, with the pipe operator, `URI.parse(it)` is also inside a block, but the block is implicit.
The block spans from the pipe operator itself to the next pipe operator or a new line, making it simpler and more concise without changing the evaluation flow.
Updated by Eregon (Benoit Daloze) about 1 month ago
One concern with so many `then {}` is that it's a non-trivial overhead for execution (2 method calls + 1 block call for `then { foo(it) }` vs 1 method call for `foo(var)`).
So if it's added I think it should translate to the same as using local variables and not `then {}` blocks.
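A rough way to observe the overhead mentioned above is a simple timing loop. This is only a sketch: `foo`, `measure`, and `compare_overhead` are hypothetical names, the iteration count is arbitrary, and the numbers will vary by Ruby version, implementation, and machine, so no expected timings are given.

```ruby
# Hypothetical trivial method standing in for a pipeline step.
def foo(x)
  x + 1
end

# Wall-clock seconds spent running the block `iterations` times.
def measure(iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Time the direct call against the same call wrapped in `then`.
def compare_overhead(iterations = 200_000)
  var = 1
  {
    direct: measure(iterations) { foo(var) },
    with_then: measure(iterations) { var.then { foo(_1) } }
  }
end

result = compare_overhead
puts format("foo(var):             %.4fs", result[:direct])
puts format("var.then { foo(_1) }: %.4fs", result[:with_then])
```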
I would write that snippet like this:
json = Net::HTTP.get(URI.parse(client_api_url))
JSON.parse(json).fetch(important_key)
2 lines of code vs 4, and IMO just as readable if not better.
So in my opinion there is no need for a pipeline operator for this.
Also I would think in real code one would probably want to `rescue` some exceptions there, and so the pipeline wouldn't gain much visually and might need to be broken down in several parts anyway.
Updated by zverok (Victor Shepelev) about 1 month ago
@Eregon (Benoit Daloze) this example (at least for me) is just an easy target for discussion (because it uses standard libraries, is easily reproducible, and demonstrates the multi-step realistic process that uses several libraries at once).
I believe the point here is not “how it could be rewritten in non-flow-style,” but rather “many people in many codebases find flow-style useful, should we have a syntax sugar for it?”
I can confirm that for me (and many colleagues who were exposed to this style), it seems a more convenient way, especially to structure business code or quick sketching. It also might have a positive effect on overall algorithm structuring: the code author starts to think in “sequence of steps” terms, and (again, especially in complicated business code developed rapidly) it provides some protection against messy methods, where many local variables are calculated, and soon it is hard to tell which of them related to which of the next steps and how many flows are there.
I think it is also very natural to Ruby, considering one of the things we have different than many other languages is Enumerable as the center cycle structure, which supports chains of sequence transformations... So, `then` is just a chain of singular value transformations.
But I think it is not necessary to prefer this style yourself to acknowledge others find it useful. (Well, alternatively, it could be a discussion like “nobody should do that, it shouldn’t be preferred/supported style,” but that’s another discussion.)
Updated by eightbitraptor (Matt V-H) about 1 month ago · Edited
The Ruby-lang homepage states that Ruby has
a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write.
And on the about page:
Ruby often uses very limited punctuation and usually prefers English keywords, some punctuation is used to decorate Ruby.
In my opinion this proposal conflicts with this description because:
- `|>` is less natural to read than the English word `then`. `then` has a clear and unambiguous meaning, `|>` is an arbitrary combination of symbols that developers need to learn.
- `|>` masks complexity - requiring users to learn and remember knowledge that could be easily read from the source code.
I don't understand, from reading this discussion, what benefit we would gain from writing the proposed:
client_api_url
|> URI.parse(it)
|> Net::HTTP.get(it)
|> JSON.parse(it).fetch(important_key)
especially when, as has already been pointed out, we can do this in the current version:
client_api_url
.then { URI.parse(it) }
.then { Net::HTTP.get(it) }
.then { JSON.parse(it).fetch(important_key) }
which is arguably more readable, and more intention revealing (for those of us unfamiliar with Elixir).
Lastly
bringing functionality from functional languages into Ruby without introducing any complexity, while maintaining ruby's simplicity.
This isn't importing functionality from other languages, merely syntax. I'm against adopting syntax if there isn't a clear (and preferably measurable) benefit to the Ruby ecosystem.
Updated by AlexandreMagro (Alexandre Magro) about 1 month ago
I strongly agree that new additions should be thoroughly evaluated and aligned with the philosophy of the language ("A programmer's best friend"). I've found the discussion so far to be very productive, and my opinion is that:
I don't see `|>` as "an arbitrary combination of symbols". I believe the pipe operator is a well-established concept, predating Ruby itself, and symbolic usage to express certain expressions is already present in the language, such as `&:method_name` instead of `{ |x| x.method_name }`.
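The equivalence between the two spellings mentioned above can be checked in current Ruby. A small sketch (the `words` array is made up for illustration):

```ruby
words = %w[pipe then tap]

# Symbolic shorthand (Symbol#to_proc)
short_form = words.map(&:upcase)

# Explicit block form
long_form = words.map { |x| x.upcase }

puts short_form == long_form
```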
Updated by zverok (Victor Shepelev) about 1 month ago
A couple of my counterpoints to `|>` (and towards `.{}`, if we do need syntax sugar in this place at all):
While `|>` sure exists in other languages, we need to look into how it plays with the rest of the code/semantics of our language (because in languages where it exists, it is typically supported by many small and large semantical facts).
Say, in Elixir, one might write this (not precise code, writing kind-of pseudocode from the top of my head):
row
|> String.split('|')
|> Enumerable.map(fn x -> parse(x) end)
|> Enumerable.filter(&Number.odd?)
|> MyModule.process_numbers
|> String.join('-')
In Ruby, the equivalent would be mostly with “current object’s methods”, as @vo.x (Vit Ondruch) notes, with `.then` occasionally compensating when you need to use another module:
row
.split('|')
.map { parse(it) }
.filter(&:odd?)
.then { MyModule.process_numbers(it) }
.join('-')
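To make the chain above concrete, here is a runnable sketch. `MyModule.parse` and `MyModule.process_numbers` are hypothetical stand-ins (the thread does not define them), and explicit block parameters are used instead of `it` so the code also runs on Ruby versions before 3.4:

```ruby
# Hypothetical helpers, invented only to make the chain executable.
module MyModule
  def self.parse(text)
    Integer(text)
  end

  def self.process_numbers(numbers)
    numbers.map { |n| n * 10 }
  end
end

row = "1|2|3|4|5"

result = row
  .split("|")
  .map { |x| MyModule.parse(x) }
  .filter(&:odd?)
  .then { |numbers| MyModule.process_numbers(numbers) } # hand the whole array to another module
  .join("-")

puts result
```

Here `.then` is the only step that passes the intermediate value to a method on a different receiver; every other step is a method of the value itself.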
What would `|>` bring here?
row
.split('|')
.map { parse(it) }
.filter(&:odd?)
|> MyModule.process_numbers(it)
.join('-')
In my view, only syntactical/semantic confusion (what’s the scope in the `|>` line? is `join` attached to its result, or is it inside the „invisible block”?.. Why do we have a fancy symbol for `.then`, but not for `map` or `filter`, which are arguably even more widespread?..)
Every time the topic arises, I am confused about it the same way. It seems like just chasing “what others have,” without much strong argument other than “but others do it this way.” But I might really miss something here.
Updated by shuber (Sean Huber) about 1 month ago · Edited
I agree with @zverok (Victor Shepelev) and am not quite sold on the value of `|>` over the existing `.then{}` if we still have to explicitly specify implicit args like `it`/`_1`/etc (unlike Elixir).
I am intrigued by the `.{}` syntax though but wish it did more than behave as an alias for `.then{}`.
What if `.{}` behaved more like this Elixir-style syntax without implicit args?
# existing ruby syntax
url
.then { URI.parse(it) }
.then { Net::HTTP.get(it) }
.then { JSON.parse(it).fetch_values("some", "keys") }
.then { JSON.pretty_generate(it, allow_nan: false) }
.then { Example.with_non_default_arg_positioning(other_object, it) }
# proposed ruby syntax
url
.{ URI.parse }
.{ Net::HTTP.get }
.{ JSON.parse.fetch_values("some", "keys") }
.{ JSON.pretty_generate(allow_nan: false) }
.{ Example.with_non_default_arg_positioning(other_object, self) }
# one line chaining example
-9.abs.{Math.sqrt}.to_i.to_s #=> "3"
# maybe support to_proc as well
[9].map(&{Math.sqrt.to_i.to_s}) #=> ["3"]
Updated by AlexandreMagro (Alexandre Magro) about 1 month ago · Edited
zverok (Victor Shepelev) wrote in #note-28:
What would `|>` bring here?
row
.split('|')
.map { parse(it) }
.filter(&:odd?)
|> MyModule.process_numbers(it)
.join('-')
In my view, only syntactical/semantic confusion (what’s the scope in `|>` line? is `join` attached to its result, or is it inside the „invisible block”?.. Why do we have a fancy symbol for `.then`, but not for `map` or `filter`, which are arguably even more widespread?..)
I’d like to turn the question around and ask what would be returned from the following code?
array_a = [{ name: 'A', points: 30 }, { name: 'B', points: 20 }, { name: 'C', points: 10 }]
array_b = [{ name: 'D', points: 0 }, { name: 'E', points: 0 }]
array_c = array_a
.sort { |a, b| b[:points] <=> a[:points] }
+ array_b
.map { |el| el[:name] }
This highlights that mixing operators and methods within a chain can indeed create confusion. The example is tricky because it's not clear if the `.map` will apply to array_b or to array_a after it has been sorted and concatenated with array_b.
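For reference, a sketch of how current Ruby resolves this when the expression is written on a single line (an assumption made here so that `+` is unambiguously a binary operator). Method calls bind tighter than `+`, so `.map` applies to `array_b` only:

```ruby
array_a = [{ name: 'A', points: 30 }, { name: 'B', points: 20 }, { name: 'C', points: 10 }]
array_b = [{ name: 'D', points: 0 }, { name: 'E', points: 0 }]

# Parsed as: (array_a.sort { ... }) + (array_b.map { ... })
array_c = array_a.sort { |a, b| b[:points] <=> a[:points] } + array_b.map { |el| el[:name] }

p array_c.first(3) == array_a # the sorted hashes are kept as hashes
p array_c.last(2)             # only array_b was mapped to names
```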
In the same way, the `|>` operator might introduce confusion if it's mixed in with method chains without proper context. However, just like `+`, `|>` is simply another operator. It can be understood like:
- `a |> b` translates to something like `->(a) { b }`.
- Similarly, `a + b` is `->(a, b) { a + b }`.
In both your example and mine, the operators (`|>` and `+`) could simply be replaced with appropriate methods (`then` and `concat`, respectively), depending on the context and desired functionality.
Updated by zverok (Victor Shepelev) about 1 month ago
@AlexandreMagro (Alexandre Magro) I don’t think this analogy is suitable here.
Of course, there are operators that aren’t convenient to use in chaining (though, I should admit to the sin of sometimes just using `the.chain.with.+(argument).like.that`, and it works and follows the existing Ruby semantics and intuitions, even if not to everybody’s liking).
But my point was that the proposed construct is specifically for easier chaining but doesn’t fall in line with any of Ruby’s other tools for that. I think a comparison with Elixir demonstrates that.
In Elixir, you’ll say, “see, whatever you need to do with the value, just do with more `|>`, it is all the same.”
In Ruby, you say “when you work with collections, you do `.method` and blocks; when you work with methods the object already has, you do `.method`; when you need a debug print in the middle of the chain, you can `.tap { p _1 }` just like that... But oh, there is also this one nice operator which you can’t mix with anything but it is there too... And it also creates an invisible block like nowhere else, but it is just there for convenience and looking like Elixir, sometimes!”
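The existing chaining tools mentioned here already compose with one another in current Ruby. A small sketch (the `numbers` array is made up for illustration) mixing a collection method, a `.tap` debug print, an operator called with method syntax, and `.then`:

```ruby
numbers = [3, 1, 2]

result = numbers
  .sort
  .tap { |sorted| p sorted }      # debug print; tap returns the receiver unchanged
  .+([4])                         # operator invoked with method-call syntax
  .map { |n| n * 2 }
  .then { |doubled| doubled.sum } # pass the whole array on as a single value

puts result
```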
That’s the major drawback of the proposal in my eyes, and I fail to see a comparably major gain.
Updated by lpogic (Łukasz Pomietło) about 1 month ago · Edited
Has "then" as a keyword been considered?
In the basic version it could appear as a "begin..then..end" block:
value = begin add value, 3 then square it then half it end
It looks like syntax highlighting is ready. "begin" can be replaced with something else, but then it would be harder to prove such forms:
value = begin value
then add it, 3
then square it
then |v| # optional 'it' name?
half v
rescue # optional error handling?
puts "Error"
0
end
def foo(value)
add value, 3
then
square it
then
half it
end
The endless (and beginless) version may be more controversial, but if used with caution it could make sense:
value = add value, 3 then square it then half it
Going further, why couldn't "then" be a LHS result? It has the potential to be a cure for parenthesis headaches:
(1..5).to_a.join("-").then{ puts it } # => 1-2-3-4-5
# ^ == v
1..5 then.to_a.join "-" then puts it # => 1-2-3-4-5
puts (2 + 2 then * 2 - 2 then ** 2) == (((2 + 2) * 2 - 2) ** 2) # => true
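For comparison, a sketch of what these keyword forms would correspond to in current Ruby using `.then`. The helpers `add`, `square` and `half` are hypothetical (the thread never defines them) and are written as lambdas here; the starting value `5` is also made up:

```ruby
# Hypothetical helpers standing in for add/square/half from the proposal.
add    = ->(a, b) { a + b }
square = ->(n) { n * n }
half   = ->(n) { n / 2 }

value = 5
value = add.call(value, 3)
  .then { |v| square.call(v) }
  .then { |v| half.call(v) }
puts value

# The arithmetic example, with each "then" step made explicit.
arithmetic = (2 + 2).then { |v| v * 2 - 2 }.then { |v| v ** 2 }
puts arithmetic == (((2 + 2) * 2 - 2) ** 2)
```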
Updated by nevans (Nicholas Evans) about 1 month ago
I think there are good reasons to want a `|>` operator in addition to (or instead of) `.{}`, but `foo.{ bar it }` is intriguing syntactic sugar. I think I like it. I just noticed that it was rejected by Matz when `#yield_self` was introduced. But perhaps (when the syntax moratorium has ended) time will have changed his mind? It does seem to have a natural connection to `foo.()`.
But, I would strongly prefer for it to be an alias for `#yield_self`; not for `#then`. Maybe that's a subtle distinction. Many rubyists seem to treat `#then` as a pure alias for `#yield_self`. But they are not perfect synonyms. When `#then` was first proposed, Matz specifically mentioned that they have different semantics:
It is introduced that a normal object can behave like promises.
So the name conflict is intentional.
If you really wanted a non-unwrapping method for promises, use `yield_self`.
In other words, we should not assume that every object implements `#then` the exact same way. I have a lot of async code that predates `Object#then`. From a purely linguistic viewpoint, when we're dealing with an object that represents a completable process, the English word "then" strongly implies that the block will only run after the process has completed.
So I treat `#yield_self` and `#then` the same way that I treat `equal?`, `eql?`, `==`, and `#===`. The fact that all of these behave more-or-less identically on Object is not determinative: classes should override `#eql?`, `#==`, and `#===` to properly represent the different forms of equality. Likewise, `#then` should be overridden for any object that represents a completable process. On the other hand, just like `#equal?`, `#yield_self` should never be overridden, and it should only occasionally even be used.
I will use `#equal?` or `#yield_self` when the semantics fit, even if that particular object doesn't override `#==` and `#then`. E.g:
# runs immediately: so "then" is not appropriate
Thread.new do do_stuff end
.yield_self { register_task_from_thread it }
# waits for `Thread#value`: so "then" is appropriate
Thread.new do do_stuff end
.then { handle_result it.value }
async { get_result } # returns a promise
.then {|result| use result } # probably _also_ returns a promise
.value # unwrap the promise
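The distinction can be sketched with a small promise-like class. `Deferred` is hypothetical, invented here for illustration, and is not taken from any real async library: it overrides `#then` so the block sees the completed result, while `#yield_self` keeps its default behaviour and yields the wrapper itself.

```ruby
# Hypothetical promise-like wrapper, invented for illustration only.
class Deferred
  def initialize(&work)
    @thread = Thread.new(&work)
  end

  # Blocks until the work has finished, then returns its result.
  def value
    @thread.value
  end

  # Overridden: the block receives the completed result, not the wrapper,
  # and the return value is wrapped again so calls can be chained.
  def then(&block)
    Deferred.new { block.call(value) }
  end
end

deferred = Deferred.new { 6 * 7 }

chained = deferred.then { |result| result + 1 }.value
same_object = deferred.yield_self { |obj| obj }

puts chained
puts same_object.equal?(deferred)
```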
I do think there is room for a `|>` operator that is yet another version of this, with slightly different semantics from both `#yield_self` and `#then`. But (concerning this proposal) I share @zverok's concern about creating "an invisible block like nowhere else". We should be very careful about adding unique syntax for a single operator.
Updated by AlexandreMagro (Alexandre Magro) 21 days ago
Reflecting on the opposing points raised, I believe the pipe operator could work differently, avoiding the issue of "implicit blocks" mentioned by @zverok (Victor Shepelev).
As suggested by @Eregon (Benoit Daloze), translating the operator to local variables reduces the overhead associated with chaining `.then`.
What I (re)propose is to define the pipe operator as a statement separator, similar to `;`, where the `LHS` expression is evaluated first and its result is stored in the variable `_`, which we can call the "last expression result", and then the `RHS` is executed.
For instance, this:
expr_a |> expr_b
would conceptually translate to:
expr_a => _; expr_b
This way, we could write:
"https://api.github.com/repos/ruby/ruby"
|> URI.parse(_)
|> Net::HTTP.get(_)
|> JSON.parse(_)
|> _.fetch("stargazers_count")
|> puts "Ruby has #{_} stars"
This approach maintains clarity, avoids the overhead of multiple `.then` calls, and introduces the `_` variable as the last expression result, similar to the "ANS" button on a calculator.
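A rough sketch of what such a translation could look like in current Ruby, with plain assignment to `_` standing in for `=> _`. The `fake_http_get` lambda and its star count are made up here to avoid network access; it stands in for `Net::HTTP.get`:

```ruby
require "json"
require "uri"

# Stub standing in for Net::HTTP.get so the sketch needs no network access.
# The returned payload (including the star count) is invented.
fake_http_get = ->(uri) { JSON.generate("host" => uri.host, "stargazers_count" => 123) }

# Each line plays the role of one "|>" step: evaluate, store in `_`, continue.
_ = "https://api.github.com/repos/ruby/ruby"
_ = URI.parse(_)
_ = fake_http_get.call(_)
_ = JSON.parse(_)
_ = _.fetch("stargazers_count")
message = "Ruby has #{_} stars"

puts message
```

No blocks are created in this form, which is the overhead difference compared with a chain of `.then` calls.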