Project

General

Profile

Actions

Feature #16897

open

General purpose memoizer in Ruby 3 with Ruby 2 performance

Added by sam.saffron (Sam Saffron) over 4 years ago. Updated over 4 years ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:98396]

Description

require 'benchmark/ips'

module Memoizer
def memoize_26(method_name)
  cache = {}

  uncached = "#{method_name}_without_cache"
  alias_method uncached, method_name

  define_method(method_name) do |*arguments|
    found = true
    data = cache.fetch(arguments) { found = false }
    unless found
      cache[arguments] = data = public_send(uncached, *arguments)
    end
    data
  end
end

  def memoize_27(method_name)
    cache = {}

    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name

    define_method(method_name) do |*args, **kwargs|
      found = true
      all_args = [args, kwargs]
      data = cache.fetch(all_args) { found = false }
      unless found
        cache[all_args] = data = public_send(uncached, *args, **kwargs)
      end
      data
    end
  end

  def memoize_27_v2(method_name)
    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name

    cache = "MEMOIZE_#{method_name}"

    params = instance_method(method_name).parameters
    has_kwargs = params.any? {|t, name| "#{t}".start_with? "key"}
    has_args = params.any? {|t, name| !"#{t}".start_with? "key"}

    args = []

    args << "args" if has_args
    args << "kwargs" if has_kwargs

    args_text = args.map do |n|
      n == "args" ? "*args" : "**kwargs"
    end.join(",")

    class_eval <<~RUBY
      #{cache} = {}
      def #{method_name}(#{args_text})
        found = true
        all_args = #{args.length === 2 ? "[args, kwargs]" : args[0]}
        data = #{cache}.fetch(all_args) { found = false }
        unless found
          #{cache}[all_args] = data = public_send(:#{uncached} #{args.empty? ? "" : ", #{args_text}"})
        end
        data
      end
    RUBY

  end

end

module Methods
  def args_only(a, b)
    sleep 0.1
    "#{a} #{b}"
  end

  def kwargs_only(a:, b: nil)
    sleep 0.1
    "#{a} #{b}"
  end

  def args_and_kwargs(a, b:)
    sleep 0.1
    "#{a} #{b}"
  end
end

class OldMethod
  extend Memoizer
  include Methods

  memoize_26 :args_and_kwargs
  memoize_26 :args_only
  memoize_26 :kwargs_only
end

class NewMethod
  extend Memoizer
  include Methods

  memoize_27 :args_and_kwargs
  memoize_27 :args_only
  memoize_27 :kwargs_only
end

class OptimizedMethod
  extend Memoizer
  include Methods

  memoize_27_v2 :args_and_kwargs
  memoize_27_v2 :args_only
  memoize_27_v2 :kwargs_only
end

OptimizedMethod.new.args_only(1,2)


methods = [
  OldMethod.new,
  NewMethod.new,
  OptimizedMethod.new
]

Benchmark.ips do |x|
  x.warmup = 1
  x.time = 2

  methods.each do |m|
    x.report("#{m.class} args only") do |times|
      while times > 0
        m.args_only(10, b: 10)
        times -= 1
      end
    end

    x.report("#{m.class} kwargs only") do |times|
      while times > 0
        m.kwargs_only(a: 10, b: 10)
        times -= 1
      end
    end

    x.report("#{m.class} args and kwargs") do |times|
      while times > 0
        m.args_and_kwargs(10, b: 10)
        times -= 1
      end
    end
  end

  x.compare!
end


# # Ruby 2.6.5
# #
# OptimizedMethod args only:   974266.9 i/s
#  OldMethod args only:   949344.9 i/s - 1.03x  slower
# OldMethod args and kwargs:   945951.5 i/s - 1.03x  slower
# OptimizedMethod kwargs only:   939160.2 i/s - 1.04x  slower
# OldMethod kwargs only:   868229.3 i/s - 1.12x  slower
# OptimizedMethod args and kwargs:   751797.0 i/s - 1.30x  slower
#  NewMethod args only:   730594.4 i/s - 1.33x  slower
# NewMethod args and kwargs:   727300.5 i/s - 1.34x  slower
# NewMethod kwargs only:   665003.8 i/s - 1.47x  slower
#
# #
# # Ruby 2.7.1
#
# OptimizedMethod kwargs only:  1021707.6 i/s
# OptimizedMethod args only:   955694.6 i/s - 1.07x  (± 0.00) slower
# OldMethod args and kwargs:   940911.3 i/s - 1.09x  (± 0.00) slower
#  OldMethod args only:   930446.1 i/s - 1.10x  (± 0.00) slower
# OldMethod kwargs only:   858238.5 i/s - 1.19x  (± 0.00) slower
# OptimizedMethod args and kwargs:   773773.5 i/s - 1.32x  (± 0.00) slower
# NewMethod args and kwargs:   772653.3 i/s - 1.32x  (± 0.00) slower
#  NewMethod args only:   771253.2 i/s - 1.32x  (± 0.00) slower
# NewMethod kwargs only:   700604.1 i/s - 1.46x  (± 0.00) slower

The bottom line is that a generic delegator often needs to make use of all the arguments provided to a method.

def count(*args, **kwargs)
  counter[[args, kwargs]] += 1
  orig_count(*args, **kwargs)
end

The old pattern meant we could get away with one less array allocation per:

def count(*args)
  counter[args] += 1
  orig_count(*args, **kwargs)
end

I would like to propose some changes to Ruby 3 to allow to recover this performance.

Perhaps:

def count(...)
  args = ...
  counter[args] += 1
  orig_count(...)
end

Or:

def count(***args)

  counter[args] += 1
  orig_count(***args)
end

Thoughts?


Related issues 3 (2 open1 closed)

Related to Ruby master - Misc #16188: What are the performance implications of the new keyword arguments in 2.7 and 3.0?Assignedjeremyevans0 (Jeremy Evans)Actions
Related to Ruby master - Misc #16157: What is the correct and *portable* way to do generic delegation?OpenActions
Related to Ruby master - Feature #16463: Fixing *args-delegation in Ruby 2.7: ruby2_keywords semantics by default in 2.7.1Closedmatz (Yukihiro Matsumoto)Actions

Updated by jeremyevans0 (Jeremy Evans) over 4 years ago

You should be able to have similar performance by using ruby2_keywords and keeping the method definition the same. Have you tried that?

Note that with such an approach, you would not be able to differentiate between providing keywords and providing a positional hash, and potentially the memoized results could be different if keywords are fully separated (implemented in master, but possibly will be reverted before Ruby 3).

Updated by sam.saffron (Sam Saffron) over 4 years ago

Yes I can get this to work with hacks like this:

def memoize_26(method_name)
  cache = {}

  uncached = "#{method_name}_without_cache"
  alias_method uncached, method_name

  m = (define_method(method_name) do |*arguments|
    found = true
    data = cache.fetch(arguments) { found = false }
    unless found
      cache[arguments] = data = public_send(uncached, *arguments)
    end
    data
  end)

  if Module.respond_to?(:ruby2_keywords, true)
    ruby2_keywords(m)
  end
end


It is kind of ugly but certainly works for cases like this. My concern though is more around the future "best practices"

Can you expand on how full separation will break a cache if we implement either of my proposals?

def x(...)
  args = ...
end

My assumption here is that this special delegator extractor would have enough sugar to allow for this problem. Maybe this thing is not an Array but a subclass or special class that supports to_a

x(a: 1)

args.to_a == [{a: 1}]
args.has_args? == false
args.has_kwargs? == true


x(1)
args.to_a == [1]
args.has_args? == true
args.has_kwargs? == false


x(1, {a: 1}, a: 1, b: 1)
args.to_a === [1, {a: 1}, {a: 1, b: 1}]
args.has_args? == true
args.has_kwargs? == true
Actions #3

Updated by Eregon (Benoit Daloze) over 4 years ago

  • Description updated (diff)

Updated by sam.saffron (Sam Saffron) over 4 years ago

Expanded proposal:

def foo(...args)
   bar(...)
   args
end

args = foo(1, 2, {1 => 2}, a: 7, b: 9)

args.class
=> Arguments

args.length
=> 5

args[2] 
=> {1 => 2}

args[:x]
=> nil

args[:b]
=> 9

args.kwargs
=> {a: 7, b: 9}


args.args
=> [1, 2, {1 => 2}]

args.to_a
=> [1, 2, {1 => 2}, {a: 7}, {b: 9}]

args.each do |value, key|
  # key is nil for no kwargs
end


args2 = foo(1, 2, {1 => 2}, a: 7, b: 9)

args == args2
=> true

Updated by sam.saffron (Sam Saffron) over 4 years ago

Alternative proposal


def bar(*args, **kwargs)
end

def foo(*args)
   bar(*args)
   args
end

args = foo(1, 2, {1 => 2}, a: 7, b: 9)

args.class
=> Arguments

args.length
=> 5

args[2] 
=> {1 => 2}

args[:x]
=> nil

args[:b]
=> 9

args.kwargs
=> {a: 7, b: 9}


args.args
=> [1, 2, {1 => 2}]

args.to_a
=> [1, 2, {1 => 2}, {a: 7, b: 9}] # for backwards compat

args.each do |value, key|
  # key is nil for no kwargs
end


args2 = foo(1, 2, {1 => 2}, a: 7, b: 9)

args == args2
=> true

args3 = foo(1, 2, {1 => 2}, {a: 7, b: 9})

args == args3
=> false

Updated by ioquatix (Samuel Williams) over 4 years ago

I know this is a bit of a hack but... it's possible to get some kind of key from ... args:

KEY = ->(*arguments, **options) {[arguments, options]}

CACHE = {}

def memoize(name)
	method = self.method(name)
	self.define_method(name)
end

def fibonacci_original(x)
	if x < 2
		x
	else
		fibonacci(x-1) + fibonacci(x-2)
	end
end

def fibonacci(...)
	CACHE[KEY.call(...)] ||= fibonacci_original(...)
end

puts fibonacci(32)
pp CACHE

Updated by Eregon (Benoit Daloze) over 4 years ago

Could you clarify which of the numbers above you think is too expensive for performance?
Looking at the numbers I don't see a big difference but it's also hard to compare because they are not in the same order.

Updated by Eregon (Benoit Daloze) over 4 years ago

FWIW this works :

H = Hash.new { |h,k| h[k] = k ** 2 }
def sq(...); H.[](...); end
sq(44) # => 1936
sq(42) # => 1764
H # => {44=>1936, 42=>1764}

Can such an approach be used for your case?

Updated by sam.saffron (Sam Saffron) over 4 years ago

@Eregon (Benoit Daloze): to summarize the one point of performance that I want to address here

Memoizing a method that has both kwargs and args:

(1) Using the fastest pattern available on Ruby 2.7.1 REQUIRING usage of ruby2_keywords

OldMethod args and kwargs: 944008.6 i/s

(2) Using a standard pattern on 2.7.1, simply adding **kwargs

NewMethod args and kwargs: 766935.9 i/s

(3) Using a heavily optimized and verbose approach:

OptimizedMethod args and kwargs: 771978.2 i/s


I tried this which is a hybrid of your and Samuels suggestion:

module Memoizer
  def self.KEY(*args, **kwargs)
    [args, kwargs]
  end
  
  def memoize(method_name)
    cache = "MEMOIZE2_#{method_name}"

    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name

    class_eval <<~RUBY
      #{cache} = {}
      def #{method_name}(...)
        found = true
        args = Memoizer.KEY(...)
        data = #{cache}.fetch(args) { found = false }
        unless found
          #{cache}[args] = data = #{uncached}(...)
        end
        data
      end
    RUBY
  end
end

Sadly it appears to be slowest: 696435.9 i/s

I can not seem to dispatch ... directly into fetch for arbitrary arguments:

class Foo
  HASH = {}
  def bar(...)
    HASH.fetch(...) { rand }
  end
end

foo = Foo.new

puts foo.bar(1)

/home/sam/Source/performance/memoize/memoize.rb:8: both block arg and actual block given

Memoizer needs fetch cause you may be memoizing nil.

@jeremyevans0 (Jeremy Evans) would though argue that this is the only correct generic memoizer, but as implemented on 2.7.1 ... is implemented in a buggy way:

class Foo
  def key(*args, **kwargs)
    {args: args, kwargs: kwargs}
  end

  def test(...)
    key(...)
  end
end

puts Foo.new.test({a: 1})

/home/sam/Source/performance/memoize/memoize.rb:11: warning: Using the last argument as keyword parameters is deprecated; maybe ** should be added to the call
/home/sam/Source/performance/memoize/memoize.rb:6: warning: The called method `key' is defined here

{:args=>[], :kwargs=>{:a=>1}}

puts Foo.new.test(a: 1)

{:args=>[], :kwargs=>{:a=>1}}

I am not following what the reticence here is for introducing an Arguments proper object, it solves this performance very cleanly. Plus it lets us dispatch around a list of arguments cleanly maintaining args / kwargs separation without introducing a new class per library to communicate this concept.

def foo(a = 1, b: 2)
  puts "a: #{a} b: #{b}"
end

def delay(...x)
  if @delayed
    foo(...@delayed)
  end
  @delayed = x
end

delay({b: 7})
# prints nothing

delay(9999)
"a: {b: 7} b: 2"
 

To be entirely honest I prefer breaking *args and having it be of class Arguments vs Array, cause now it is a trap we are leaving forever, given it captures kwargs now anyway.

# my pref ...

def foo(a = 1, b: 2)
  puts "a: #{a} b: #{b}"
end

def delay(*x)
  # @delayed.class = Arguments
  if @delayed
    foo(*@delayed)
  end
  @delayed = x
end

delay({b: 7})
# prints nothing

delay(9999)
"a: {b: 7} b: 2"
 

cc @ko1 (Koichi Sasada) , @mame (Yusuke Endoh)

Updated by jeremyevans0 (Jeremy Evans) over 4 years ago

I'm able to get a correct 2.7 general purpose memoization method within 6% of the basic ruby2_keywords approach. By correct, I mean it handles a positional hash differently than keywords.

Code:

require 'benchmark/ips'

class A
  def self.memoize_26(method_name)
    cache = {}

    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name

    m = (define_method(method_name) do |*arguments|
      found = true
      data = cache.fetch(arguments) { found = false }
      unless found
        cache[arguments] = data = public_send(uncached, *arguments)
      end
      data
    end)

    if Module.respond_to?(:ruby2_keywords, true)
      ruby2_keywords(m)
    end
  end

  def self.memoize_26o(method_name)
    cache = {}

    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name

    m = (define_method(method_name) do |*arguments|
      cache.fetch(arguments) { cache[arguments] = send(uncached, *arguments) }
    end)

    if Module.respond_to?(:ruby2_keywords, true)
      ruby2_keywords(m)
    end
  end

  def self.memoize_27(method_name)
    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name

    cache = {}
    kache = {}
    ruby2_keywords(define_method(method_name) do |*arguments|
      last_arg = arguments.last
      if Hash === last_arg && Hash.ruby2_keywords_hash?(last_arg)
        kache.fetch(arguments) { kache[arguments] = send(uncached, *arguments) }
      else
        cache.fetch(arguments) { cache[arguments] = send(uncached, *arguments) }
      end
    end)
  end

  memoize_26 def a26(*args)
    args
  end
  memoize_26 def kw26(*args, **kw)
    [args, kw]
  end
  class_eval "def x26; #{'a26(1); a26({k: 1}); a26(k: 1); kw26(1); kw26(1, k: 1); kw26(k: 1);'*100} end"

  memoize_26o def a26o(*args)
    args
  end
  memoize_26o def kw26o(*args, **kw)
    [args, kw]
  end
  class_eval "def x26o; #{'a26o(1); a26o({k: 1}); a26o(k: 1); kw26o(1); kw26o(1, k: 1); kw26o(k: 1);'*100} end"

  memoize_27 def a27(*args)
    args
  end
  memoize_27 def kw27(*args, **kw)
    [args, kw]
  end
  class_eval "def x27; #{'a27(1); a27({k: 1}); a27(k: 1); kw27(1); kw27(1, k: 1); kw27(k: 1);'*100} end"
end

return unless __FILE__ == $0

a = A.new

Benchmark.ips do |x|
  x.report("26"){a.x26}
  x.report("26o"){a.x26o}
  x.report("27"){a.x27}
  x.compare!
end

Output:

Calculating -------------------------------------
                  26    376.379  (_ 0.3%) i/s -      1.887k in   5.013620s
                 26o    380.271  (_ 0.3%) i/s -      1.938k in   5.096377s
                  27    358.914  (_ 0.6%) i/s -      1.820k in   5.071030s

Comparison:
                 26o:      380.3 i/s
                  26:      376.4 i/s - 1.01x  (_ 0.00) slower
                  27:      358.9 i/s - 1.06x  (_ 0.00) slower

I think a 6% difference in a microbenchmark is close enough that we need not consider changes to the language or interpreter. Especially considering we can probably get the difference less than 6% if #16697 is accepted.

Updated by sam.saffron (Sam Saffron) over 4 years ago

That is relying on Hash.ruby2_keywords_hash? surely a long term solution for Ruby 3 can not rely on that?

At a minimum if this is the direction Ruby is taking (not expanding behavior of ... and not fixing *args to be of type Arguments) there should be a clean interface without the word ruby2 in it.

def foo(*args)
  puts args.kwargs?
end

foo(a: 1)
=> true

foo({a: 1})
=> false

My concern here is long term, what is the long term solution to this problem without using various ruby2_XYZ methods?

Updated by jeremyevans0 (Jeremy Evans) over 4 years ago

sam.saffron (Sam Saffron) wrote in #note-11:

That is relying on Hash.ruby2_keywords_hash? surely a long term solution for Ruby 3 can not rely on that?

Hash.ruby2_keywords_hash? will exist as long as ruby2_keywords does. Personally, I don't think ruby2_keywords should be removed until an approach with equal or better performance is found.

At a minimum if this is the direction Ruby is taking (not expanding behavior of ... and not fixing *args to be of type Arguments) there should be a clean interface without the word ruby2 in it.

Expanding behavior of ... to support leading arguments has already been approved by matz and should be part of Ruby 3. See #16378.

FWIW, the original name of ruby2_keywords was pass_keywords. It was only renamed to ruby2_keywords precisely because the idea was it would be removed.

def foo(*args)
  puts args.kwargs?
end

foo(a: 1)
=> true

foo({a: 1})
=> false

Already possible, assuming ruby_keywords :foo:

class Array
  def kwargs?
    Hash === last && hash.ruby2_keywords_hash?(last)    
  end
end

My concern here is long term, what is the long term solution to this problem without using various ruby2_XYZ methods?

In Ruby 3, you can use (*args, **kw) delegation and it will work in all cases. It may not perform as well, though I have optimized **kw when calling so that it never allocates more than 1 hash, and in some cases can be allocation less. See d2c41b1bff1f3102544bb0d03d4e82356d034d33. You should try benchmarking on the master branch if you are concerned about long term solutions and not about 2.7.

Updated by sam.saffron (Sam Saffron) over 4 years ago

@jeremyevans0 (Jeremy Evans) yeah I can confirm ... delegation works as expected and correctly in master.

Unfortunately the problem in the OP remains (running the same benchmarks)

Fastest solution for memoizing a method with args and kwargs on master is 936016.3 i/s using ruby2_XYZ hacks

Fastest solution for memoizing a method with args and kwargs on master without ruby2_hacks is messy and 773916.4 i/s

To be clear, I am not concerned about ruby 2.7 here, I just want a clean and blessed way of memoizing a general method in Ruby 3 that does not involve hacks and hoping we can have it for Ruby 3 release.

Exposing the value of ... as a class called Arguments would give us the performance and correctness we are after. Arguments is 1 RVALUE, Array is 1 RVALUE.

def count_calls(***a)
   counter[a] += 1
   countee(***a)
end

...a is not going to work cause countee(...a) will have the param confused with a Range.

I also think this can work:

def count_calls(...)
   a = ...
   counter[a] += 1
   countee(...)
end

especially if this works correctly cause we implement splatting of Arguments to understand kwargs:

def count_calls(...)
   a = ...
   counter[a] += 1
   countee(*a)
end

cc @mame (Yusuke Endoh)

Actions #14

Updated by Eregon (Benoit Daloze) over 4 years ago

  • Related to Misc #16188: What are the performance implications of the new keyword arguments in 2.7 and 3.0? added
Actions #15

Updated by Eregon (Benoit Daloze) over 4 years ago

  • Related to Misc #16157: What is the correct and *portable* way to do generic delegation? added
Actions #16

Updated by Eregon (Benoit Daloze) over 4 years ago

  • Related to Feature #16463: Fixing *args-delegation in Ruby 2.7: ruby2_keywords semantics by default in 2.7.1 added

Updated by Eregon (Benoit Daloze) over 4 years ago

Thanks for the summary.
I agree ruby2_* cannot be a long-term solution.
And IMHO ruby2_* should be removed as soon as possible for migration, because it affects performance of all *rest calls and makes semantics more complicated (*args might magically pass kwargs even in Ruby 3+).

I am not following what the reticence here is for introducing an Arguments proper object,

I don't have a full opinion on that yet, but it seems to me that only increases or keep the same number of allocations as *args, **kwargs.
I.e., there will still be a Hash to hold the keyword arguments (because they "escape" into the arguments object, not passed directly to another call),
and I think the Hash allocation is a good part of what's slow currently in CRuby.

Updated by jeremyevans0 (Jeremy Evans) over 4 years ago

As @Eregon (Benoit Daloze) mentioned, the ***a approach is likely to be the same speed or slower than *args, **kw approach on CRuby as it has to allocate at least as many objects. It could be theoretically possible to increase performance by reducing allocations in the following cases:

  • 0-3 arguments with no keywords
  • 0-1 arguments with 1 keyword

You could do this by storing the arguments inside the object, similar to how arrays and hashes are optimized internally. Those cases would allow you to get by with a single object allocation instead of allocating two objects (array+hash), assuming that the caller side is not doing any object allocation. All other cases would be as slow or slower.

This approach would only be faster if you never needed to access the arguments or keywords passed. As soon as you need access to the arguments or keywords, it would likely be slower as it would have to allocate an array or hash for them. This limits the usefulness of the approach to specific cases.

When you compare ***a to ruby2_keywords, which is currently the fastest approach, the cases where it could theoretically be faster I believe are limited to 0-1 arguments with 1 keyword.

This approach will increase complexity in an already complex system. It would be significant undertaking to implement, and it's not clear it would provide a net performance improvement.

It is true that supporting ruby2_keywords makes *args calls without keywords slower. I think the maximum slowdown was around 10%, and that was when the callee did not accept a splat or keywords. When the callee accepted a splat or keywords, I think the slowdown was around 1%. However, as ruby2_keywords greatly speeds up delegation (see below), ruby2_keywords results in a net increase in performance in the majority of cases. Until ruby2_keywords no longer results in a net increase in performance in the majority of cases, I believe it should stay.

Here's a benchmark showing a 160% improvement in delegation performance in master by using ruby2_keywords instead of *args, **kw:

def m1(arg) end
def m2(*args) end
def m3(arg, k: 1) end
def m4(*args, k: 1) end
def m5(arg, **kw) end
def m6(*args, **kw) end

ruby2_keywords def d1(*args)
  m2(*args);m2(*args);m2(*args);m2(*args);m2(*args);
  m3(*args);m3(*args);m3(*args);m3(*args);m3(*args);
  m4(*args);m4(*args);m4(*args);m4(*args);m4(*args);
  m5(*args);m5(*args);m5(*args);m5(*args);m5(*args);
  m6(*args);m6(*args);m6(*args);m6(*args);m6(*args);
end
ruby2_keywords def d1a(*args)
  m1(*args);m1(*args);m1(*args);m1(*args);m1(*args);
end

def d2(*args, **kw)
  m2(*args, **kw);m2(*args, **kw);m2(*args, **kw);m2(*args, **kw);m2(*args, **kw);
  m3(*args, **kw);m3(*args, **kw);m3(*args, **kw);m3(*args, **kw);m3(*args, **kw);
  m4(*args, **kw);m4(*args, **kw);m4(*args, **kw);m4(*args, **kw);m4(*args, **kw);
  m5(*args, **kw);m5(*args, **kw);m5(*args, **kw);m5(*args, **kw);m5(*args, **kw);
  m6(*args, **kw);m6(*args, **kw);m6(*args, **kw);m6(*args, **kw);m6(*args, **kw);
end
def d2a(*args, **kw)
  m1(*args, **kw);m1(*args, **kw);m1(*args, **kw);m1(*args, **kw);m1(*args, **kw);
end

require 'benchmark'

print "ruby2_keywords: "
puts(Benchmark.measure do
  100000.times do
    d1a(1)
    d1(1, k: 1)
  end
end)

print "   *args, **kw: "
puts(Benchmark.measure do
  100000.times do
    d2a(1)
    d2(1, k: 1)
  end
end)

Results:

ruby2_keywords:   1.350000   0.000000   1.350000 (  1.395517)
   *args, **kw:   3.630000   0.000000   3.630000 (  3.693702)
Actions #19

Updated by sawa (Tsuyoshi Sawada) over 4 years ago

  • Subject changed from Can a Ruby 3.0 compatible general purpose memoizer be written in such a way that it matches Ruby 2 performance? to General purpose memoizer in Ruby 3 with Ruby 2 performance

Updated by Eregon (Benoit Daloze) over 4 years ago

I added this ticket to the next dev-meeting agenda: #16933

Updated by shevegen (Robert A. Heiler) over 4 years ago

Matz makes the final decisions, but I would like to point out that adding ***
all of a sudden may be confusing:

  • Ruby users may have to distinguish between *, **, and ***. And ..., too.

I of course can not speak for newcomers to ruby, since I am not a newcomer
myself, but I am somewhat worried that there is an increase in syntax
complexity with marginal gains at best here. The other syntax elements,
that is *, ** and ... are already available in ruby. I think it may be
best to wait whether matz considers *** acceptable or not before going
on that path without knowing whether it is.

Updated by sam.saffron (Sam Saffron) over 4 years ago

I started reading through the code and it is certainly tricky, I wonder if we simply make ruby2_keywords_hash? official and long term aka rename it to keywords_hash?, maybe even allow for {a: 1}.keywords_hash! to set the flag.

From what I can tell we can't do RARRAY(array)->basic.flags ... we do not have a place for another flag on an array.

But ... if we made this pattern official we could certainly have this code just work in Ruby 3, force a check in c code for the flag on the last param:

def bar(a:)
  puts a
end

def foo(*x)
  bar(*x)
end

foo(a: 1)

Updated by ko1 (Koichi Sasada) over 4 years ago

Quoted from https://bugs.ruby-lang.org/issues/16933

[Feature #16897] General purpose memoizer in Ruby 3 with Ruby 2 performance (eregon)

Thoughts about the ***args / Arguments proposal?
Ideas to optimize delegation with *args, **kwargs better, which it seems most people expect (it's intuitive, same as Python) is the Ruby 3 way to do delegation?
Some other ideas to optimize memoization, maybe using ...?

IMO Arguments object should be discussed for usability first, not performance first.
To discuss the usability, the Arguments class interface is needed.

mame-san summarized possible interface:

  • Arguments class interface face
    • getter
      • Arguments#positionals #=> an Array of positional args
      • Arguments#keywords #=> a Hash of keys and values
      • Arguments#block #=> a Proc of block
      • Arguments#block_given? #=> true/false
      • Arguments#hash, Arguments#eql? (for memoization)
      • Arguments#to_a or to_ary is a bad idea
      • Arguments#[] for shorthand of Arguments#positionals[n]
    • setter/modifier (needed?)
      • Arguments#prepend and append?
      • Arguments#add_key and remove_key?
      • Arguments#strip_block and add_block?
    • Arguments.new?
    • Destrictive versions of the operations
    • Marshal?
    • Binding#arguments? # ko1: it is difficult to support because of performance
    • super with an Arguments?

And we need to consider the syntax to handle it.
Now we are discussing keyword arguments and ... arguments. We can consider with them.

Maybe it should be separated ticket...

Updated by sam.saffron (Sam Saffron) over 4 years ago

Arguments#[] for shorthand of Arguments#positionals[n]

I know it is not a "purely" clean interface. But I like:

def foo(*args)
  puts args[0]
  # 1
  puts args[1]
  # 10
  puts args[:foo]
  # 10
  puts args[:bar]
  # raises an exception, or nil if we do not want to explode
end

foo(1, foo: 10)

I would be more than happy to split off a ticket for the new interface, shall I create it or maybe name-san can?

Updated by mame (Yusuke Endoh) over 4 years ago

I talked with matz and committers about my rough idea yesterday, but matz didn't like it.

In terms of usability, matz seems to like handling *args, **kwargs because it is explicit and not so complex.
In terms of performance, "just slower than ruby2" seems not convincing to matz. If the performance degradation is a bottleneck in a real-world application, he may change his mind.

Updated by sam.saffron (Sam Saffron) over 4 years ago

In terms of usability, matz seems to like handling *args, **kwargs because it is explicit and not so complex.

To me the design we arrived at is very very non-intuitive sadly, @matz (Yukihiro Matsumoto)

def bar(a: 1)
end

def foo(*x)
  puts x
  bar(*x)
end

*x captures both kwargs and args, yet is is not allowed to pass kwargs along.

foo(a: 1) will print {a: 1} and then error out. This is very bad usability in the language.

If there is strong insistence on separation we should do so properly.

foo(a: 1) should throw an exception cause it only captures args an not kwargs. At least that would guide people at the right direction.

My preference remains :

  1. Best... fix *x so it is able to delegate kwargs properly 100% of the time, foo(a: 1) should work, foo({a: 1}) should exception. This means we codify and make official {}.kwargs? or something like that.

  2. Second best is introducing ***x and Arguments

But the current status quo is a huge trap we are leaving for future Ruby generations.

Updated by jeremyevans0 (Jeremy Evans) over 4 years ago

sam.saffron (Sam Saffron) wrote in #note-26:

In terms of usability, matz seems to like handling *args, **kwargs because it is explicit and not so complex.

To me the design we arrived at is very very non-intuitive sadly, @matz (Yukihiro Matsumoto)

def bar(a: 1)
end

def foo(*x)
  puts x
  bar(*x)
end

*x captures both kwargs and args, yet is is not allowed to pass kwargs along.

foo(a: 1) will print {a: 1} and then error out. This is very bad usability in the language.

The alternative is breaking compatibility with code that never used keyword arguments. See #14183 for the quite long discussion regarding this.

foo(a: 1) should throw an exception cause it only captures args an not kwargs. At least that would guide people at the right direction.

This breaks backwards compatibility. If people think migrating is hard with the changes we made in 2.7/3.0, I think they don't really understand how much harder it would have been if we stopped caller-side keywords from being converted to positional hashes (which dates back at least to Ruby 1.6, and probably earlier).

My preference remains :

  1. Best... fix *x so it is able to delegate kwargs properly 100% of the time, foo(a: 1) should work, foo({a: 1}) should exception. This means we codify and make official {}.kwargs? or something like that.

You can already get that behavior if you want, using def foo(*args, **nil). That is new syntax added in Ruby 2.7, specifically for people that do not want keyword to positional argument conversion.

You can also implement automatic keyword passing by default if you don't want to call ruby2_keywords for each method using *args where you want to pass keywords implicitly:

class Module
  def method_added(method_name)
    meth = instance_method(method_name)
    return unless meth.source_location
    has_rest = false
    meth.parameters.each do |param|
      case param[0]
      when :key, :key_req, :keyrest, :no_key
        return
      when :rest
        has_rest = true
      end
    end

    if has_rest
      ruby2_keywords method_name
    end
  end
end

But the current status quo is a huge trap we are leaving for future Ruby generations.

I disagree. In my opinion, it's at most a minor issue. It's far better not to break tons of code now.

The breakage for methods that accept keywords was necessary to fix the issues with keywords. There is no need to change methods that do not accept keywords, since they didn't experience any of those issues.

Even if you consider the current design a mistake, in your example, with the master branch, you get an ArgumentError, which is straightforward to fix. I'm not sure why you consider that a "huge trap".

Updated by sam.saffron (Sam Saffron) over 4 years ago

Understood Jeremy, there are always compromises.

def bar(a:); end
def foo(*args); bar(*args); end; 

There was a deliberate decision made to break foo(a: 1) here (by default) which has both its upsides and downsides.

If we keep foo(a: 1) working a developer may get confused (despite everything operating 100% correctly):

Why does foo(a: 1) work and bar(*[{a: 1}]) not work?

I still think that wider compatibility is better, teaching the developer about this trick for the huge edge case imo is better than the alternative of dropping support for kwarg tracking in *args:

h = {a: 1}
h.kwargs! 
bar(*[h])

Advantage of long term kwarg tracking is wider compatibility and we can have 1 entity for all the arguments (a single Array).

Disadvantage is that there are "nooks" inside Hash that we need to carry long term and bar(**h) works anyway.

I stand by it being a trap, it is hard to explain why you can capture one thing but can not delegate it without hacks and traps. I also admit this is not the biggest issue in the world, but it is strange. I guess the problem is that the solution I am proposing (and that has been proposed before) is also strange, so it we are replacing strange with strange.

Updated by Dan0042 (Daniel DeLorme) over 4 years ago

sam.saffron (Sam Saffron) wrote in #note-28:

If we keep foo(a: 1) working a developer may get confused (despite everything operating 100% correctly):

Why does foo(a: 1) work and bar(*[{a: 1}]) not work?

I still think that wider compatibility is better, teaching the developer about this trick for the huge edge case imo is better than the alternative of dropping support for kwarg tracking in *args:

h = {a: 1}
h.kwargs! 
bar(*[h])

I feel compelled to make a quick note here that what Sam is saying above is very similar to what I proposed in #16511, with a few differences:

foo(a: 1)      #keyword 'a'
bar(*[{a: 1}]) #positional hash {a: 1}
bar(*[a: 1])   #keyword 'a'
h = {a: 1}
bar(*[**h])    #keyword 'a'

Although I disagree this has to be carried long term; mid term is good enough to facilitate migration greatly, and may even give enough time to come up with a good memoizer solution.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0