Feature #18285
openNoMethodError#message uses a lot of CPU/is really expensive to call
Description
Hello there! I'm working at Datadog on the ddtrace gem -- https://github.com/DataDog/dd-trace-rb and we ran into this issue on one of our internal testing applications. I also blogged about this issue in https://ivoanjo.me/blog/2021/11/01/nomethoderror-ruby-cost/.
Background¶
While testing an application that threw a lot of NoMethodError
s in a Rails controller (this was used for validation), we discovered that service performance was very much impacted when we were logging these exceptions. While investigating with a profiler, the performance impact was caused by calls to NoMethodError#message
, because this Rails controller had a quite complex #inspect
method, that was getting called every time we tried to get the #message
from the exception.
How to reproduce¶
require 'bundler/inline'
gemfile do
source 'https://rubygems.org'
gem 'benchmark-ips'
end
puts RUBY_DESCRIPTION
class GemInformation
# ...
def get_no_method_error
method_does_not_exist
rescue => e
e
end
def get_runtime_error
raise 'Another Error'
rescue => e
e
end
def inspect # <-- expensive method gets called when calling NoMethodError#message
Gem::Specification._all.inspect
end
end
NO_METHOD_ERROR_INSTANCE = GemInformation.new.get_no_method_error
RUNTIME_ERROR_INSTANCE = GemInformation.new.get_runtime_error
Benchmark.ips do |x|
x.config(:time => 5, :warmup => 2)
x.report("no method error message cost") { NO_METHOD_ERROR_INSTANCE.message }
x.report("runtime error message cost") { RUNTIME_ERROR_INSTANCE.message }
x.compare!
end
Expectation and result¶
Getting the #message
from a NoMethodError
should be no costly than getting it from any other exception.
In reality:
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
no method error message cost
115.390 (± 1.7%) i/s - 580.000 in 5.027822s
runtime error message cost
6.938M (± 0.5%) i/s - 35.334M in 5.092617s
Comparison:
runtime error message cost: 6938381.6 i/s
no method error message cost: 115.4 i/s - 60130.02x (± 0.00) slower
Suggested solutions¶
- Do not call
#inspect
on the object on which the method was not found (see https://github.com/ruby/ruby/blob/e0915ba67964d843832148aeca29a1f8244ca7b1/error.c#L1962) - Cache result of calling
#message
after the first call. Ideally this should be done together with suggestion 1.
Updated by Eregon (Benoit Daloze) 7 months ago
I think we've had this report multiple times, it's a trade-off between efficiency and useful information being shown.
Ideally #inspect
wouldn't be so expensive.
NoMethodError is typically because the method isn't defined on the class of that instance, and very rarely because it's not defined on the singleton class of that instance (except if it's a Module/Class).
Hence showing data of that specific instance seems not so valuable, might as well just show the class.
So I think one good fix here would be to use the basic Kernel#to_s to format the receiver, i.e., rb_any_to_s() and with special handling for true/false/nil, and also for modules (show the module name).
Then we wouldn't lose too much information from the exception (the class of the object would still be obvious) and avoid the drawbacks.
Updated by ivoanjo (Ivo Anjo) 7 months ago
Yeah, I think it's useful to include the object class and possibly its id, but with the default #inspect
behavior of "print the object and all of its instance variables, and do it recursively", it's a rather sharp edge to have it being called when getting the exception message here.
Updated by byroot (Jean Boussier) 7 months ago
If applicable, I suggest to do what we do in Rails: https://github.com/rails/rails/blob/f95c0b7e96eb36bc3efc0c5beffbb9e84ea664e4/activesupport/lib/active_support/core_ext/name_error.rb#L13-L18
When dealing with NameError
and its subclasses, first check for original_message
.
Updated by mame (Yusuke Endoh) 7 months ago
we ran into this issue on one of our internal testing applications.
Let me confirm: you are facing this issue when testing dd-trace-rb, not in production, right? I cannot imagine how NameError#message
is called so many times in production.
If this is the case only when testing, how about adding a manual cache mechanism into the test suite of dd-trace-rb? I think this is almost the same as what you are proposing in this ticket at the Ruby level.
module CacheMessage
def message
@__message ||= super
end
end
NameError.prepend(CacheMessage)
byroot (Jean Boussier) wrote in #note-4:
If applicable, I suggest to do what we do in Rails:
In this case, the main overhead is due to receiver#inspect
that returns a long string, not due to did_you_mean (and error_highlight). Actually I can see no significant improvement by running the given benchmark script with --disable-did_you_mean
.
Updated by ivoanjo (Ivo Anjo) 7 months ago
Important note! I'm actually talking about NoMethodError
, not NameError
, so I don't think this issue is caused by the did you mean gem behavior (which I really like).¶
This may have been confusing because both of them share the NameError::message
class where the actual call to #inspect
lives.
Let me confirm: you are facing this issue when testing dd-trace-rb, not in production, right? I cannot imagine how NameError#message is called so many times in production.
Apologies for the confusion. I meant that we were simulating what if a production Rails application had a bug (in this case, mistakenly called a method that had been deleted), and how dd-trace-rb would help customers debug such an issue.
I think it's reasonable assertion to say this can happen -- people will do these kinds of mistakes, and if they're not paying attention to their logs/alarms/error tracking data, their app can be impacted by such an issue for a while until they realize what's going on and have the chance to deploy a fix.
In my view, part of the problem here is that on classes with a big #inspect
, the call to #inspect
ends up being completely useless on Ruby 2.7 and below:
> puts RUBY_DESCRIPTION
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x86_64-darwin19]
> class Foo; def inspect; puts 'inspect called!'; 'A' * 65; end; end
> Foo.new.foo
inspect called!
# => NoMethodError: undefined method `foo' for AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA:Foo
> class Foo; def inspect; puts 'inspect called!'; 'A' * 66; end; end
> Foo.new.foo
inspect called!
# => NoMethodError: undefined method `foo' for #<Foo:0x00007f875719b140>
When I was testing, I wasn't seeing the result of the #inspect
, so it wasn't even occurring to me what was going on (and this is what confused me even further).
I realize now that the limit was removed in 3.0 (https://bugs.ruby-lang.org/issues/16832 and https://github.com/ruby/ruby/pull/3090) so in 3.0 it always shows the whole string, which can be a whole another problem as well, because people may not be expecting a NoMethodError
to contain a small novel in its #message
.
So what I'm asking upstream to consider is:
Could we simplify NoMethodError
to not include the #inspect
? According to the PR above, the previous limit was in place for about 24 years and it looks like nobody really complained that they couldn't see their #inspect
on objects which had a complex one.
Updated by byroot (Jean Boussier) 7 months ago
I'm actually talking about NoMethodError, not NameError, so I don't think this issue is caused by the did you mean gem behavior
No the confusion is mine, I assumed it was a did_you_mean
perf issue without fully digging into your repro because I ran into those previously.
Could we simplify NoMethodError to not include the #inspect?
I'm personally not against it, e.g:
>> {foo: 42}.bar
(irb):2:in `<main>': undefined method `bar' for {:foo=>42}:Hash (NoMethodError)
The content of the hash here isn't really helpful, in the context of a NoMethodError
all I really care about is the type of the object, so why not. I might be missing some use cases though.
the previous limit was in place for about 24 years and it looks like nobody really complained that they couldn't see their #inspect on objects which had a complex one.
Well I for one complained about that limit, hence why I opened the ticket.
Also note that the limit would still have called #inspect
, just not used the result, so unless I'm missing something, you could truncate Error#message
yourself and have roughly the same performance.
And in the context of a library reporting error, you might have to deal with arbitrary errors raised by the application code, so you can't assume all errors will have a reasonably small message
anyway.
Updated by ivoanjo (Ivo Anjo) 7 months ago
Well I for one complained about that limit, hence why I opened the ticket.
Right! Thanks for pointing it out.
Also note that the limit would still have called #inspect, just not used the result, so unless I'm missing something, you could truncate Error#message yourself and have roughly the same performance.
Good point. What I meant was that for a long time it seemed OK to not get the #inspect
information some of the time, so I was trying to use that to make a case that perhaps the default should not be to use #inspect
, not to use the inspect always :)
One additional note is that if one needs it, it's possible to get the #inspect
information because the exception exposes the reference to the original object:
[1] pry(main)> exception = begin; "hello".foo; rescue => e; e; end
=> #<NoMethodError: undefined method `foo' for "hello":String>
[3] pry(main)> exception.receiver
=> "hello"
[4] pry(main)> exception.receiver.inspect
=> "\"hello\""
Updated by byroot (Jean Boussier) 7 months ago
I was trying to use that to make a case that perhaps the default should not be to use #inspect, not to use the inspect always :)
Not sure if I got you right, but in my opinion the old behavior of only using it if it was smaller than 65 chars long was quite horrible and confusing, so I don't think simply going back to this is a good idea.
Again, I can get behind the idea of not including receiver.inspect
in NoMethodError#message
, but the question now is what to include exactly? The class with some special casing for true/false/nil like @Eregon (Benoit Daloze) suggested?
What about arrays, strings, hash etc? Is there any point including their representation?
Updated by ivoanjo (Ivo Anjo) 7 months ago
My suggestion is indeed to only include the class + perhaps the pointer id or the object id for regular user classes, and then special case true/false/nil and perhaps String and symbols.
Updated by Eregon (Benoit Daloze) 7 months ago
Re special casing, true/false/nil are typically already special-cased for exception messages (because #<TrueClass:0x0000000000000014>
is too confusing, everyone expects true
or true:TrueClass
, which is BTW the current formatting: undefined method 'foo' for true:TrueClass (NoMethodError)
).
So it's like Kernel.instance_method(:to_s).bind_call(receiver)
, except for true/false/nil, and except for modules, where we get the name (String
instead of #<Class:0x000000000246a828>
).
I would not special-case anything else, the smaller that list the better. The only reason to special case true/false/nil/Module is because otherwise it's too confusing.
(Showing a String's contents is a good counter example, it can be really long and is not useful for a NoMethodError)
Code is likely clearer:
# Already the case, nothing to change for those :)
true.foo # => undefined method `foo' for true:TrueClass (NoMethodError)
false.foo # => undefined method `foo' for false:FalseClass (NoMethodError)
nil.foo # => undefined method `foo' for nil:NilClass (NoMethodError)
Object.new.foo # => undefined method `foo' for #<Object:0x0000000002f3afc8> (NoMethodError)
String.foo # => undefined method `foo' for String:Class (NoMethodError)
# Suggested change:
"foobar".foo
# old => undefined method `foo' for "foobar":String (NoMethodError)
# new => undefined method `foo' for #<String:0x0000000002ffaaa8> (NoMethodError)
(the String could be very long, same for other object with long or expensive #inspect)
Object.new.instance_exec { @a = 42; self.foo }
# old => undefined method `foo' for #<Object:0x00000000025e6030 @a=42> (NoMethodError)
# new => undefined method `foo' for #<Object:0x00000000025e6030> (NoMethodError)
(printing ivars could lead to very long and slow output due to recursive #inspect)
Updated by ivoanjo (Ivo Anjo) 7 months ago
Thanks Benoit for stating in very clear and exact terms what I was trying to express :)
Updated by mame (Yusuke Endoh) 6 months ago
- Related to Feature #6733: New inspect framework added
Updated by mame (Yusuke Endoh) 6 months ago
@matz (Yukihiro Matsumoto) and @naruse (Yui NARUSE) said let's try any_to_s
approach after Ruby 3.1 is released, towards Ruby 3.2.
We need to handle some basic classes specially rather than true/false/nil. @akr (Akira Tanaka) pointed showing #<Class:0xXXXXXXXXXXXXXXXX>
for a class is rather confusing.
class Foo
privatee
end
#=> current: undefined local variable or method `privvate' for Foo:Class (NameError)
#=> proposal: undefined local variable or method `privvate' for #<Class:0xXXXXXXXXXXXXXXXX> (NameError)
Also, @nobu (Nobuyoshi Nakada) said #<Integer:0xXXXXXXXXXXXXXXXX>
would be annoying. Maybe we need to use the new behavior actually for a while, feel how stressful it is, and adjust the representation.
Updated by duerst (Martin Dürst) 6 months ago
mame (Yusuke Endoh) wrote in #note-14:
#=> current: undefined local variable or method
privvate' for Foo:Class (NameError) #=> proposal: undefined local variable or method
privvate' for #Class:0xXXXXXXXXXXXXXXXX (NameError)Also, @nobu said `#<Integer:0xXXXXXXXXXXXXXXXX>` would be annoying. Maybe we need to use the new behavior actually for a while, feel how stressful it is, and adjust the representation.
I'm very sure we do not need a trial period to know that #<Class:0xXXXXXXXXXXXXXXXX>
or #<Integer:0xXXXXXXXXXXXXXXXX>
are a bad idea. The trial period may be needed to find all the other cases.
Updated by Eregon (Benoit Daloze) 6 months ago
I think #<Integer:0xXXXXXXXXXXXXXXXX>
is actually not really problematic in this context, because the missing method is a method not defined on the Integer class (or above), so the useful information is there.
But I agree it's confusing/unfamiliar output.
Maybe we should call #inspect
if it's a "basic=core & not redefined" method, except for Kernel#inspect, or just hardcode key core classes.
Or another way to say it is it's fine to call a known #inspect
if it doesn't recurse and print other objects.
We should use Module#inspect/name, that was already said in https://bugs.ruby-lang.org/issues/18285#note-11.
modules can and often have a singleton class, so there it is important to show instance-specific information.
Modules are already a special case in name error message IIRC.
Updated by Eregon (Benoit Daloze) 6 months ago
- Related to Bug #6291: Backtrace printout halts for an extremely long time when large amounts of data are allocated added
Updated by Eregon (Benoit Daloze) 6 months ago
- Related to Feature #6783: Infinite loop in inspect, not overriding inspect, to_s, and no known circular references. Stepping into inspect in debugger locks it up with 100% CPU. added
Updated by Dan0042 (Daniel DeLorme) 6 months ago
The simplifications are all super good ideas, but I'd like to know why no one seems to be considering suggestion #2 "Cache result of calling #message after the first call."
Memoizing the message seems a very low-hanging fruit for performance improvement.
the previous limit was in place for about 24 years and it looks like nobody really complained that they couldn't see their #inspect on objects which had a complex one.
Well I for one complained about that limit, hence why I opened the ticket.
But #16832 was about having a meaningful error message for missing constants. No one considered having a megabyte-sized message because of a NoMethodError on megabyte-sized strings. I think this is a bug in 3.0 that should be fixed and backported.
Updated by mame (Yusuke Endoh) 4 months ago
- Tracker changed from Misc to Feature
Discussed at the dev-meeting but no conclusion.
@akr (Akira Tanaka) said the part 0xXXXXXXXXXXXXXXXX
is almost useless, and roughly counter-proposed the following.
42.time #=> undefined method `time' for object Integer (NoMethodError)
class Foo
privatee #=> undefined local variable or method 'privatee' for class Foo (NoMethodError)
end
def (o=Object.new).foo
end
o.bar #=> undefined method `bar' for extended object Object (NoMethodError)
The phrase object Foo
might be improved to, say, an instance of Foo
.
Updated by Dan0042 (Daniel DeLorme) 4 months ago
"undefined method 'time' for object Integer" is grammatically incorrect.
It would need to be "undefined method 'time' for Integer object".
Or "instance of Integer" or "object of type Integer".
Updated by Eregon (Benoit Daloze) 4 months ago
I think Ruby users already know well what #<MyObject:0xXXXXXXXXXXXXXXXX>
means, and using another naming is likely more confusing.
Either way it will need some getting used to as it's new output, and then it will just be natural.
There are already places using rb_any_to_s()
(e.g., when inspect/to_s don't return a String, etc) and this would just be another.
That said, for MyObject instance
/for module Foo
/for class Bar
seems fine and simple enough if that's deemed better.
"Integer object" or so seems confusing, so the instance terminology seems better.