Feature #16494

Allow hash unpacking in non-lambda Proc

Added by zverok (Victor Shepelev) 7 months ago. Updated 4 months ago.

Status: Rejected
Priority: Normal
Assignee: -
Target version: -
[ruby-core:96731]

Description

First of all, I fully understand the value of separating "real" keyword arguments and disallowing implicit and unexpected conversions to/from hashes.

There is, though, one convenient style which is now broken:

# words is array of hashes:
words
  .map { |text:, paragraph_id:, **rest| 
    {text: text.strip, paragraph_id: paragraph_id.to_i, **rest}
  }
  .reject { |text:, is_punctuation: false, **| text.end_with?('!') || is_punctuation }
  .chunk { |paragraph_id:, timestamp: 0, **| [paragraph_id, timestamp % 60] }
  # ...and so on

There are several important elements to this style, making it hard to replace:

  • informative errors on unexpected data structure ("missing keyword: text")
  • ability to provide default values
  • clear separation of declaration "what this block expects" / "what it does with expected data", especially valuable in data processing pipelines
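
The three properties listed above can be seen in a small sketch. Note the assumptions: it uses a lambda with keyword arguments and an explicit `**` splat rather than the implicit unpacking proposed here, so it should behave the same whether or not implicit conversion is available. `format_word` and the sample hashes are hypothetical names used only for illustration.

```ruby
# Hypothetical helper: the signature declares what data is expected,
# supplies a default for timestamp, and ignores any other keys.
format_word = lambda do |text:, paragraph_id:, timestamp: 0, **|
  "#{paragraph_id}@#{timestamp}: #{text.strip}"
end

complete = { text: ' hello ', paragraph_id: 1, extra: true }
broken   = { paragraph_id: 2 } # :text is missing

# The default value for timestamp is applied here.
ok = format_word.call(**complete)

# Missing data is reported by the language itself, naming the keyword.
error_message =
  begin
    format_word.call(**broken)
  rescue ArgumentError => e
    e.message
  end
```

The cost of this workaround is the extra named lambda and the explicit `**` at every call site, which is exactly the ceremony the block-parameter style avoids.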

One may argue that in some Big Hairy Very Architectured Application you should instead wrap everything in objects, extract every processing step into a method or service, extract validation as a separate concern, etc. But in smaller utility scripts, or deep inside complicated algorithmic libraries, the ability to write short and clear code whose arguments are explicitly declared and checked by the language is pretty valuable.

This style has no clean alternative: all possible alternatives are either less powerful or much less readable. Compare:

# Try to rewrite this:
words.map { |text:, paragraph_id:, timestamp: 0, is_punctuation: false|
  log.info "Processing #{timestamp / 60} minute"
  full_text = is_punctuation ? text : text + ' '
  "<span class='word paragraph-#{paragraph_id}' data-time=#{timestamp} data-original-text=#{text}>#{full_text}</span>"
}

# Alternative with just hashes:
words.map { |word|
  # these two are used several times
  text = word.fetch(:text)
  timestamp = word.fetch(:timestamp, 0)
  log.info "Processing #{timestamp / 60} minute"
  # Absent is_punctuation is ok, it defaults to false
  full_text = word[:is_punctuation] ? text : text + ' '
  "<span class='word paragraph-#{word.fetch(:paragraph_id)}' data-time=#{timestamp} data-original-text=#{text}>#{full_text}</span>"
}

# Alternative with pattern matching: to unpack variables and handle default values, it would be something like...
case word
in text:, paragraph_id:, timestamp:
  # skip, just unpacked
in text:, paragraph_id: # no timestamp:
  timestamp = 0
end
# I am not even trying to handle TWO default values

As shown above, the Hash#fetch/Hash#[] style makes it much harder to understand what the block expects the hash to contain and how it uses the hash's components, and it simply makes the code longer and less pleasant to write and read. Pattern matching (at least for now) is just not powerful enough for this particular case (it also has uninformative error messages, but that can obviously be improved).

My proposal is to allow implicit hash unpacking into keyword arguments in non-lambda procs. It would be consistent with implicit array unpacking, which is an important property of non-lambda procs and is useful for reasons very similar to those described above.
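
For reference, the implicit array unpacking that the proposal compares itself to can be sketched as follows. This only illustrates the existing proc/lambda difference for arrays, not the proposed hash behavior; the variable names are made up for the example.

```ruby
pair = [1, 2]

# A non-lambda proc spreads a single array argument across its parameters.
loose = proc { |a, b| [a, b] }
loose_result = loose.call(pair)

# A lambda checks arity strictly and does not unpack the array.
strict = lambda { |a, b| [a, b] }
strict_error =
  begin
    strict.call(pair)
  rescue ArgumentError => e
    e.class
  end

# Blocks passed to iterators follow the non-lambda rule, which is what
# makes pipelines over arrays of pairs convenient to write.
sums = [[1, 2], [3, 4]].map { |a, b| a + b }
```

The proposal asks for the hash analogue of the `loose` case: a single hash argument being spread across the keyword parameters of a non-lambda proc.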
