Feature #20335
closed`Thread.each_caller_location` should accept the same arguments as `caller` and `caller_locations`
Description
Thread.each_caller_location
was added to Ruby 3.2 as part of [Feature #16663] and is a very useful API for emitting warnings with a proper source location and similar use cases.
However in many of the cases where I used it, or seen it used, it was needed to skip the first, or a couple frames:
Examples:
def self.find_caller
skiped_first = false
Thread.each_caller_location do |loc|
unless skiped_first
skiped_first = true
next
end
next if loc.path&.start_with?("<internal:")
return loc if yield(loc)
end
nil
end
@fxn (Xavier Noria) 's PR: https://github.com/ruby/ruby/blob/9c2e686719a5a4df5ea0b8a3b6a373ca6003c229/lib/bundled_gems.rb#L140-L146
frames_to_skip = 2
location = nil
Thread.each_caller_location do |cl|
if frames_to_skip >= 1
frames_to_skip -= 1
next
end
# snipp...
Proposal¶
I think it would be very useful if Thread.each_caller_location
accepted the same arguments as caller
and caller_locations
:
#each_caller_location(start = 1, length = nil)
#each_caller_location(range)
@jeremyevans0 (Jeremy Evans) what do you think?
Updated by byroot (Jean Boussier) 8 months ago
- Related to Feature #16663: Add block or filtered forms of Kernel#caller to allow early bail-out added
Updated by jeremyevans0 (Jeremy Evans) 8 months ago
Seems fine to me. Looking at the code, I don't think it would be difficult to implement.
Updated by Eregon (Benoit Daloze) 8 months ago ยท Edited
I think omitting the first N frames is useful.
However I see little reason to have a length
or Range
, one can just break/return
out of the Thread.each_caller_location
, and it seems pretty rare to need that.
So I think it's best to only add start = 1
for now.
In fact the Range
argument is already an issue for caller_locations
: it requires iterating all the stack to find how many frames there are to know which frames 0..-2
should use.
That implies it needs to count how many frames are available, which is inefficient.
So we certainly wouldn't want that behavior for each_caller_location
as it would ruin a significant part of its performance advantage.
(we should maybe forbid this form for caller_locations
, but that should be a separate ticket)
Updated by Eregon (Benoit Daloze) 8 months ago
Ah I mistyped, caller_locations(20)
is of course just setting start, caller_locations(0, 20)
has no problem, I'll edit my post on Redmine to use the Range form, which is a problem.
Updated by byroot (Jean Boussier) 8 months ago
I think omitting the first N frames is useful.
Agreed, that's the only one I want. The reason I started the issue by asking for the same parameters is purely consistency.
Now, if there are valid reasons not to support some (like you exposed), I'm totally fine with deliberately not supporting them. But what I'm trying to say is we should start with the same feature set and then selectively remove the one that may be problematic.
So I suppose at that stage the proposal would be:
Thread.each_caller_location(1) # skip first frame
Thread.each_caller_location(1, 20) # skip first frame, yield the next 20 frames
Thread.each_caller_location(1..20) # ArgumentError
Is that correct?
Updated by Eregon (Benoit Daloze) 8 months ago
Right, Range is problematic, and only Range with positive indices seems an weird restriction so I think better no Range.
Thread.each_caller_location(1, 20)
I think there are basically no use cases for this, so I wouldn't add it to keep things simple.
If one wants to e.g. detect the boundary between an app and a gem they should look at locations, not hardcode some frame depth limit.
It's kind of the purpose of Thread.each_caller_location
to be able to dynamically detect the limit, and then an extra Integer limit feels redundant.
Could you change your proposal to only add start = 1
, unless you have/find some good use cases for Thread.each_caller_location(1, 20)
?
Updated by nobu (Nobuyoshi Nakada) 7 months ago
Updated by matz (Yukihiro Matsumoto) 7 months ago
Accepted. Thank you.
Matz.
Updated by nobu (Nobuyoshi Nakada) 7 months ago
- Status changed from Open to Closed
Applied in changeset git|09638741ba4d9547a0e48af8c767744fb1d7f68d.
[Feature #20335] Thread.each_caller_location
arguments
Accecpt the same arguments as caller
and caller_locations
.
Updated by mame (Yusuke Endoh) 7 months ago
A little addition: the dicussion about length and range was considered (not ignored), but @matz (Yukihiro Matsumoto) said he didn't see the need to make it inconsistent with Kernel#caller and Kernel#caller_locations in this case.
Updated by Eregon (Benoit Daloze) 7 months ago
mame (Yusuke Endoh) wrote in #note-11:
A little addition: the dicussion about length and range was considered (not ignored), but @matz (Yukihiro Matsumoto) said he didn't see the need to make it inconsistent with Kernel#caller and Kernel#caller_locations in this case.
Thank you for the precision.
I think it is likely other Ruby implementations raise ArgumentError if the Range begin or end is negative, because it makes no sense (IOW wrong usage) to use that with Thread.each_caller_location
:
it forces to walk the entire stack to count the number of frames when the point of Thread.each_caller_location
is to not walk the entire stack but just the parts being yielded to the block.
I would suggest CRuby does the same for consistency.
Updated by byroot (Jean Boussier) 7 months ago
the point of Thread.each_caller_location is to not walk the entire stack
Is it to not walk the entire stack, or simply not to generate the Backtrace::Location
objects for the entire stack? My guess is that the later is most of the overhead, and the walking, while unfortunate, isn't that bad. But I haven't measured.
Updated by Eregon (Benoit Daloze) 7 months ago
@byroot (Jean Boussier) It's both, but for TruffleRuby I'm pretty sure the first matters much more.
These few allocations are no big deal on the JVM, so I would expect similar on JRuby.
Walking the stack is also obviously slower with a deeper stack (e.g. Rails), while the object allocations are pretty obvious and simply the number of times the block yielded before the user break
s/return
s.
I also don't see any valid use case for Thread.each_caller_location
with a Range with negative begin or end, so that's why I think it's better to prevent it altogether instead of making it a performance trap (i.e. it's O(stack depth)
instead of the expected O(number of locations yielded)
).
(the only way to make that fast would be to maintain a stack depth field in every frame, but that's overhead in time and memory for every call just for this feature, which seems clearly unreasonable).
Updated by byroot (Jean Boussier) 7 months ago
I also don't see any valid use case for Thread.each_caller_location with a Range with negative begin or end
It's still a slightly cheaper version of caller_locations(range).each
. I agree that it's not super useful, but I don't think it hurts being there either, and I prefer consistency.