Feature #16476

Socket.getaddrinfo cannot be interrupted by Timeout.timeout

Added by kirs (Kir Shatrov) 10 months ago. Updated 29 days ago.

Status:
Closed
Priority:
Normal
Target version:
[ruby-core:96642]

Description

It seems that the blocking syscall made by Socket.getaddrinfo blocks the Ruby VM in such a way that Timeout.timeout has no effect.
See reproduction steps in getaddrinfo_interrupt.rb (https://gist.github.com/kirs/00c02ef92e0418578135fe0a6cbd3d7d). This affects all modern Ruby versions, including the latest 2.7.0.

Combined with the default 10s resolv timeout on many Linux systems, this has a very noticeable effect on production Ruby apps: they are not resilient to slow DNS resolution and cannot fail fast even with Timeout.timeout.

While https://bugs.ruby-lang.org/issues/15553 improves the situation for Addrinfo.getaddrinfo, Socket.getaddrinfo is still blocking the VM and Timeout has no effect.

I'd like to discuss what could be done to make that call non-blocking for threads in Ruby VM.

UPD: looking closer, I can see that Socket.getaddrinfo("www.ruby-lang.org", "http") and Addrinfo.getaddrinfo("www.ruby-lang.org", "http") call non-interruptible getaddrinfo, while Addrinfo.getaddrinfo("www.ruby-lang.org", "http", timeout: 10) calls getaddrinfo_a, which is interruptible:

# interrupts as expected
Timeout.timeout(1) do
  Addrinfo.getaddrinfo("www.ruby-lang.org", "http", timeout: 10)
end

I'd suggest that we always try to use getaddrinfo_a when it's available, including in Socket.getaddrinfo. What downsides would that have?
I'd be happy to work on a patch.
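
One way to check which of the calls above honor Timeout.timeout is to measure how long the attempt really takes. The following is a minimal sketch, not the contents of the linked gist; `probe_interrupt` is a hypothetical helper name, and a visible difference between calls assumes a resolver slow enough to exceed the limit.

```ruby
require "socket"
require "timeout"

# Hypothetical helper (not part of Ruby): runs the block under
# Timeout.timeout and reports the outcome plus the real elapsed time.
def probe_interrupt(limit)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  outcome = begin
    Timeout.timeout(limit) { yield }
    :completed
  rescue Timeout::Error
    :interrupted
  end
  [outcome, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Example calls (need network access, so they are left commented out):
#   probe_interrupt(1) { Socket.getaddrinfo("www.ruby-lang.org", "http") }
#   probe_interrupt(1) { Addrinfo.getaddrinfo("www.ruby-lang.org", "http", timeout: 10) }
```

The elapsed time is the number to watch: a call that cannot be interrupted would be expected to show an elapsed time well above the limit, whatever outcome is reported.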


Related issues

Related to Ruby master - Feature #16381: Accept resolv_timeout in Net::HTTPOpenActions
#1

Updated by kirs (Kir Shatrov) 10 months ago

  • Description updated (diff)
#2

Updated by kirs (Kir Shatrov) 10 months ago

  • Description updated (diff)

Updated by Dan0042 (Daniel DeLorme) 10 months ago

+1

This has been an issue for a very long time, and it's often been handled by installing an asynchronous DNS resolver gem, but it would be nice if it "just worked". If it's really as simple as using getaddrinfo_a, that sounds great.

Updated by kirs (Kir Shatrov) 10 months ago

Dan0042 (Daniel DeLorme) wrote:

> +1
>
> This has been an issue for a very long time, and it's often been handled by installing an asynchronous DNS resolver gem, but it would be nice if it "just worked". If it's really as simple as using getaddrinfo_a, that sounds great.

Thanks for the feedback, Daniel!

I've opened a PR with the suggested fix: https://github.com/ruby/ruby/pull/2827

#5

Updated by kirs (Kir Shatrov) 9 months ago

  • File deleted (getaddrinfo_interrupt.rb)
#6

Updated by kirs (Kir Shatrov) 9 months ago

  • Description updated (diff)

Updated by mame (Yusuke Endoh) 8 months ago

  • Backport deleted (2.5: UNKNOWN, 2.6: UNKNOWN)
  • ruby -v deleted (ruby 2.7.0p0 (2019-12-25 revision 647ee6f091) [x86_64-linux])
  • Assignee set to Glass_saga (Masaki Matsushita)
  • Status changed from Open to Assigned
  • Tracker changed from Bug to Feature

We discussed this issue at the dev-meeting, and it requires Glass_saga (Masaki Matsushita)'s review.

Note:

  • The call remains uninterruptible on platforms where getaddrinfo_a is unavailable, but this limitation applies not only to this proposal but also to the timeout: option of Addrinfo.getaddrinfo().
  • An interruptible version can be implemented without getaddrinfo_a: create a pthread for the getaddrinfo function and call pthread_cancel when interrupted. Contributions are welcome.
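
The worker-thread idea in the second note can be approximated at the Ruby level. The following is only a sketch, not the C change proposed for the socket extension: `resolve_with_deadline` and its `resolver:` parameter are hypothetical names, and it assumes the blocking lookup releases the GVL so that the caller can stop waiting. Unlike pthread_cancel, it does not stop the lookup itself; the worker is merely abandoned.

```ruby
require "socket"
require "timeout"

# Hypothetical helper (not part of Ruby's socket library): runs a blocking
# lookup in a worker thread and stops waiting for it after `seconds`.
def resolve_with_deadline(host, service, seconds, resolver: Socket.method(:getaddrinfo))
  worker = Thread.new do
    # Errors are re-raised in the caller by #join / #value below,
    # so the default per-thread error report is switched off.
    Thread.current.report_on_exception = false
    resolver.call(host, service)
  end

  if worker.join(seconds)
    worker.value
  else
    # Requests termination; a thread stuck inside a blocking C call may
    # only notice this once that call returns.
    worker.kill
    raise Timeout::Error, "name resolution for #{host} exceeded #{seconds}s"
  end
end
```

A call such as `resolve_with_deadline("www.ruby-lang.org", "http", 1)` would then raise Timeout::Error after roughly one second if the resolver is slow. The injectable `resolver:` argument exists only so the helper can be exercised without network access.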

Updated by kirs (Kir Shatrov) 5 months ago

mame (Yusuke Endoh) wrote in #note-7:

> We discussed this issue at the dev-meeting, and it requires Glass_saga (Masaki Matsushita)'s review.
>
> Note:
>
>   • The call remains uninterruptible on platforms where getaddrinfo_a is unavailable, but this limitation applies not only to this proposal but also to the timeout: option of Addrinfo.getaddrinfo().
>   • An interruptible version can be implemented without getaddrinfo_a: create a pthread for the getaddrinfo function and call pthread_cancel when interrupted. Contributions are welcome.

Thanks for the feedback!
I've opened https://github.com/ruby/ruby/pull/3171 with the approach you've described. Please let me know if I've missed anything.

Updated by Glass_saga (Masaki Matsushita) 2 months ago

  • Target version set to 36
  • Status changed from Assigned to Closed
#10

Updated by Glass_saga (Masaki Matsushita) 2 months ago

#11

Updated by hsbt (Hiroshi SHIBATA) 29 days ago

  • Target version changed from 36 to 3.0
