Feature #15563

#dig that throws an exception if a key doesn't exist

Added by 3limin4t0r (Johan Wentholt) 11 months ago. Updated 12 days ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
[ruby-core:91265]

Description

Ruby 2.3.0 introduced #dig for Array, Hash and Struct. Both Array and Hash have #fetch, which does the same as #[] but raises an exception instead of returning the default value when the key or index is missing (unless a second argument or block is given). Hash also has #fetch_values, which does the same as #values_at but raises an exception if a key is missing. For #dig there is no such option.
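For readers less familiar with these methods, the difference can be sketched as follows. The `settings` hash and its keys are made-up sample data, not taken from any real project.

```ruby
# Hypothetical sample data, used only for illustration.
settings = { 'foo' => { 'bar' => 1 } }

via_brackets = settings['missing']            # nil: #[] returns the default value
via_dig      = settings.dig('missing', 'bar') # nil: #dig stops at the first nil

# #fetch raises KeyError for a missing key...
fetch_error =
  begin
    settings.fetch('missing')
  rescue KeyError => e
    e.message
  end

# ...unless a second argument or a block supplies a fallback.
with_argument = settings.fetch('missing', 'fallback')
with_block    = settings.fetch('missing') { |key| key.upcase }

# #values_at fills gaps with nil, while #fetch_values raises KeyError.
lenient = settings.values_at('foo', 'missing')
strict_error =
  begin
    settings.fetch_values('foo', 'missing')
  rescue KeyError => e
    e.class
  end
```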

My proposal is to add a method which does the same as #dig, but instead of using the #[] accessor it uses #fetch.

This method would look something like this:

module Traversable
  def traverse(key, *others)
    value = fetch(key)
    return value if others.empty?

    if value.respond_to?(__method__, true)
      value.send(__method__, *others)
    else
      raise TypeError, "#{value.class} does not have ##{__method__} method"
    end
  end
end

Array.include(Traversable)
Hash.include(Traversable)

The exception raised is taken from #dig ([1].dig(0, 1) #=> TypeError: Integer does not have #dig method).

yaml = YAML.load_file('some_file.yml')

# this change is meant to make chaining #fetch calls easier
yaml.fetch('foo').fetch('bar').fetch('baz')

# would be replaced with
yaml.traverse('foo', 'bar', 'baz')

I'm curious to hear what you guys think about the idea as a whole and the method name.


Related issues

Is duplicate of Ruby master - Feature #12282: Hash#dig! for repeated applications of Hash#fetch (Open)
Is duplicate of Ruby master - Feature #14602: Version of dig that raises error if a key is not present (Open)

History

Updated by 3limin4t0r (Johan Wentholt) 11 months ago

  • Description updated (diff)

I just discovered that #dig also calls private methods. I updated the provided examples to do the same.

hash = { b: 'b' }
hash.singleton_class.send(:private, :dig)
{ a: hash }.dig(:a, :b)
#=> 'b'

Updated by shevegen (Robert A. Heiler) 11 months ago

I have no particular pro or con against the feature itself as such; I myself do not use or need .dig so I
can not speak much about it. But I believe one problem with the proposal here is the name.

I think a name such as "dig_e" would be very, very rare to see in ruby. Of course I have no idea how
matz thinks about it, but I would recommend to you to also consider alternative names; or perhaps
let it handle just through arguments, whatever may seem to fit better.

Short names are sometimes really, really great, such as p and pp; but I think one overall concern may
be to not lose too much of the meaning. Off the top of my head, I can only think of FileUtils having
odd/very short method names, and this is mostly because it sort of "simulates" how coreutils utilities
such as "mkdir -p" and similar work.

If you look at recent changes in ruby, you may notice the :exception key - :e would be shorter than
that too, but I think it may not be a primary goal at all times to be too overly succinct, so if that is
a valid reasoning then I think this may explain why :exception would be used, and no shorter
variant. A similar reasoning could apply to the case here - but again, ultimately you have to see what
matz thinks about it not how others may think about it. :)

Updated by jwmittag (Jörg W Mittag) 11 months ago

shevegen (Robert A. Heiler) wrote:

I have no particular pro or con against the feature itself as such; I myself do not use or need .dig so I
can not speak much about it. But I believe one problem with the proposal here is the name.

I think a name such as "dig_e" would be very, very rare to see in ruby. Of course I have no idea how
matz thinks about it, but I would recommend to you to also consider alternative names; or perhaps
let it handle just through arguments, whatever may seem to fit better.

There is a well-established convention in Ruby, when you have a pair of methods that do similar things in different ways, to name them foo and foo!. For example, select and select!, Process::exit and Process::exit!, and so on.

So, one possibility would be dig!.

Updated by matz (Yukihiro Matsumoto) 11 months ago

I am against dig! for this purpose. When we have two versions of a method (foo and foo!), the bang version should be more dangerous than the non-bang version. dig! is not the case.

And with whatever name, we need the real-world use-case for a new method. "We don't have fetch counterpart of dig" is not good enough.

Matz.


Updated by k0kubun (Takashi Kokubun) 11 months ago

  • Is duplicate of Feature #12282: Hash#dig! for repeated applications of Hash#fetch added

Updated by k0kubun (Takashi Kokubun) 11 months ago

Personally I've hit a real-world use-case of this feature many times.

I often manage structured configs with nested YAML files and load them from Ruby. With current Ruby, to avoid an unhelpful NoMethodError, I assert the existence of the deep keys using a Hash#fetch chain like this:

config = YAML.load_file('config.yml')
config.fetch('production').fetch('environment').fetch('SECRET_KEY_BASE') #=> an exception like: KeyError: key not found: "SECRET_KEY_BASE"

If we had such a method, we would be able to easily write (let's say it's named Hash#deep_fetch instead of #dig!):

config.deep_fetch('production', 'environment', 'SECRET_KEY_BASE')

and the best part is that we could get a more helpful error message like "key not found: production.environment.SECRET_KEY_BASE" whose nested information isn't available with a Hash#fetch method chain.


By the way, if we had this, I would like to have a keyword argument default: like the second optional argument of Hash#fetch:

config.deep_fetch('production', 'environment', 'SECRET_KEY_BASE', default: '002bbfb0a35d0fd05b136ab6333dc459')

We want to manage credentials safely only for production, so sometimes we don't want to keep development credentials in the (safely managed, originally encrypted) YAML file and just want to return an unsafe value as the default.
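A minimal sketch of what such a method could look like, assuming the name #deep_fetch and the default: keyword discussed above. Neither exists in Ruby core. The `DeepFetch` module, the `NOT_GIVEN` sentinel and the dot-joined error path are illustrative choices only.

```ruby
# Hypothetical sketch; #deep_fetch is not part of Ruby core.
module DeepFetch
  # Sentinel that lets callers pass nil or false as an explicit default.
  NOT_GIVEN = Object.new.freeze

  def deep_fetch(*keys, default: NOT_GIVEN)
    current = self
    keys.each_with_index do |key, index|
      begin
        current = current.fetch(key)
      rescue IndexError # KeyError is a subclass of IndexError
        return default unless NOT_GIVEN.equal?(default)

        # Report the whole path walked so far, not just the last key.
        raise KeyError, "key not found: #{keys[0..index].join('.')}"
      end
    end
    current
  end
end

Hash.include(DeepFetch)

config = { 'production' => { 'environment' => {} } }

message =
  begin
    config.deep_fetch('production', 'environment', 'SECRET_KEY_BASE')
  rescue KeyError => e
    e.message
  end

fallback = config.deep_fetch('production', 'environment', 'SECRET_KEY_BASE',
                             default: 'unsafe-dev-value')
```

In this sketch an intermediate value that does not respond to #fetch (a String, for example) surfaces as a NoMethodError. A real implementation would probably want a clearer error for that case.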

(edit: s/fetch_keys/deep_fetch/, like I proposed 2 years ago)

Updated by 3limin4t0r (Johan Wentholt) 11 months ago

My scenario would be similar to k0kubun's scenario.

# The connection translates the request to JSON and parses the response
# from JSON into the correct objects. In this case a nested hash structure.
response = connection.send(request)

# assign shortcuts
report = response
         .fetch('Era.Common.NetworkMessage.ConsoleApi.Reports.RpcGenerateReportResponse')
         .fetch('report')

column_data   = report.fetch('data').fetch('columns')
column_labels = report.fetch('rendering').fetch('table').fetch('columns')

# build report
report_data = column_data.each_with_object({}) do |column, data|
  column_id       = column.fetch('header').fetch('column_id')
  data[column_id] = column.fetch('values')
end

report = column_labels.each_with_object({}) do |column, data|
  label       = column.fetch('label').fetch('literal')
  column_id   = column.fetch('column_id')
  data[label] = report_data.fetch(column_id)
end

From the above scenario you can see that having this new functionality would help clean things up.

The reason I use #fetch here is that the API I'm talking to might change its structure. Getting an error as soon as possible reduces debug time. If #dig were used, nil would be returned when the structure is invalid. Most of the time this would raise an exception somewhere else, which then needs to be traced back to its source (the changed response structure).

My preference is to drop the "returning nil if any intermediate step is nil" behaviour (as described in point 2 of the feature proposal). Otherwise, when a key is present but its value is nil, the method short-circuits. Dropping this part of the #dig description would ensure the full path is traversed.
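To illustrate the short-circuit concern with methods that exist today, here is a small comparison. The `response` hash is made-up sample data in which a key is present but its value is nil.

```ruby
# Hypothetical response in which 'report' is present but set to nil.
response = { 'report' => nil }

# #dig short-circuits at the nil and hides the problem.
dug = response.dig('report', 'data', 'columns')

# A #fetch chain fails at the exact step that is broken.
chain_error =
  begin
    response.fetch('report').fetch('data').fetch('columns')
  rescue NoMethodError => e
    e.class
  end
```

Here `dug` ends up as nil, while the #fetch chain raises as soon as it tries to call #fetch on nil.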

I also had a look at the linked feature proposal. I find the name #deep_fetch the most descriptive. #fetch_keys sounds like it will fetch multiple keys on a single hash (basically what #fetch_values does). #fetch_all suffers from the same problem. If the eventual version always traverses the full path (see point 2 of the feature proposal) #traverse could be an option.

Updated by walerian (Walerian Sobczak) 9 months ago

I would suggest #retrieve. It's just a stronger #fetch, and the dictionary definition reflects its meaning:

retrieve (verb)

  1. get or bring (something) back from somewhere
  2. find or extract

The name is still short and simple, but also idiomatic and meaningful at the same time.

config = YAML.load_file('config.yml')

# so instead of this:
config.fetch('production').fetch('environment').fetch('SECRET_KEY_BASE')

# we would have:
config.retrieve('production', 'environment', 'SECRET_KEY_BASE')

Updated by k0kubun (Takashi Kokubun) 6 months ago

  • Is duplicate of Feature #14602: Version of dig that raises error if a key is not present added

Updated by knu (Akinori MUSHA) 6 months ago

I thought adding an optional block to dig could be an idea.

obj.dig(*keys) { |dug_keys, rest_keys|
  # raise yourself or return something
}

But the costs required for extending the dig method as such in Array or Hash-like classes would be a bit too high.
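One way to picture the block idea without touching #dig itself is a standalone helper. This is only a sketch: `dig_with_block` is a hypothetical name, and it assumes that dug_keys means the keys that resolved to a non-nil value and that rest_keys starts with the key that failed.

```ruby
# Hypothetical helper; Ruby's built-in #dig does not accept a block.
def dig_with_block(obj, *keys)
  current = obj
  keys.each_with_index do |key, index|
    current = current.dig(key)
    next unless current.nil?

    # Mirror plain #dig when no block is given.
    return nil unless block_given?

    # Hand the block the keys resolved so far and the keys still pending.
    return yield(keys.take(index), keys.drop(index))
  end
  current
end

data = { a: { b: 1 } }

found  = dig_with_block(data, :a, :b) { |_dug, _rest| :unused }
missed = dig_with_block(data, :a, :x, :y) { |dug, rest| [dug, rest] }
```

With this data, `found` is 1 and `missed` is [[:a], [:x, :y]], so the block could raise with the full path or return a fallback.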

Updated by 3limin4t0r (Johan Wentholt) 24 days ago

  • Description updated (diff)

Changed naming from #dig_e to #traverse. Removed alternative options, and kept the strict version since #fetch is also strict. Expanding on this with a block or a :default keyword argument is possible, but should be its own feature request.

Updated by 3limin4t0r (Johan Wentholt) 24 days ago

  • Description updated (diff)

Removed the two discussion points in the previous edit. This edit adjusts the last sentence to reflect this.

Updated by amcaplan (Ariel Caplan) 12 days ago

matz (Yukihiro Matsumoto) wrote:

I am against dig! for this purpose. When we have two versions of a method (foo and foo!), the bang version should be more dangerous than the non-bang version. dig! is not the case.

And with whatever name, we need the real-world use-case for a new method. "We don't have fetch counterpart of dig" is not good enough.

Matz.

I think that I, along with others, have gotten used to bang methods being subject to more errors rather than mutating state, due to how ActiveSupport uses them. Ruby is not Rails, though, so this is a good reminder :)

I like #deep_fetch, which also happens to be a gem that already does this (https://rubygems.org/gems/deep_fetch). It was designed for implicitly validating that API formats have not changed, which says something about the need for this. I think having an easy way to dig while doing that constant validation would encourage writing code that notifies us right away when our assumptions are wrong.

As an alternative, maybe #fetch_strict or (though I like this less) a named parameter passed to dig, like my_array.dig(:foo, :bar, :baz, strict: true). The latter option currently isn't possible, since it's unclear whether strict: true is a key or a named parameter, but that problem will go away in Ruby 3.
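The ambiguity mentioned above can be seen with the existing Hash#dig: a trailing strict: true is passed along as an ordinary hash and used as a key. The `nested` hash below is contrived sample data built to make that visible.

```ruby
# Contrived data: the inner hash uses { strict: true } as one of its keys.
nested = { foo: { { strict: true } => 'reached via a hash key' } }

# The braceless trailing hash is treated as the last key to dig for.
result = nested.dig(:foo, strict: true)
```

`result` is 'reached via a hash key', which shows that #dig already gives this call a meaning.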
