Feature #16381

Accept resolv_timeout in Net::HTTP

Added by kirs (Kir Shatrov) 4 months ago. Updated 4 months ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
[ruby-core:96027]

Description

This is a follow-up to https://bugs.ruby-lang.org/issues/15553 and a successor of https://github.com/ruby/ruby/pull/1806 (credit to Carl Hörberg).

Unlike https://github.com/ruby/ruby/pull/1806, this patch introduces a separate resolv_timeout that Net::HTTP would pass to Socket.tcp.
The idea to have it as a separate value (vs. reusing open_timeout) was suggested by Alan Wu. It's helpful when a user specifies open_timeout: 1, DNS resolution takes 0.9s and opening the TCP connection takes another 0.9s: if the same value were reused for both phases, the total wait time would be 1.8s even though the allowed timeout was 1s.
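To make the arithmetic concrete, here is a minimal sketch. The helper names (open_with_split_timeouts, wait_with_per_phase_limit) are hypothetical and are not taken from resolv.patch; the first one only shows how the two limits could be handed to Socket.tcp, assuming a Ruby version whose Socket.tcp accepts resolv_timeout: (2.7 or later).

```ruby
require "socket"

# Hypothetical helper, not taken from resolv.patch: gives name resolution and
# the TCP handshake separate limits via Socket.tcp keyword arguments.
# Assumption: Ruby 2.7+, where Socket.tcp accepts resolv_timeout:.
def open_with_split_timeouts(host, port, resolv_timeout:, open_timeout:)
  Socket.tcp(host, port, resolv_timeout: resolv_timeout, connect_timeout: open_timeout)
end

# Models the scenario from the description: one limit applied to each phase
# separately. Returns the total wait in seconds, or a symbol naming the phase
# that would exceed the limit first.
def wait_with_per_phase_limit(dns:, connect:, limit:)
  return :resolv_timed_out if dns > limit
  return :connect_timed_out if connect > limit
  dns + connect
end

total = wait_with_per_phase_limit(dns: 0.9, connect: 0.9, limit: 1)
puts "total wait: #{total.round(1)}s"
```

Neither phase exceeds the 1s limit on its own, so nothing times out, yet the caller waits for the sum of both phases.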

This patch not only makes the DNS timeout customizable, but also fixes a bug: wrapping TCPSocket.open in Timeout.timeout, with any number of seconds, would still take 10 seconds because name resolution is a blocking operation on many systems (here's a gist to reproduce on Linux: https://gist.github.com/kirs/5f711099b23ddae7a87ebb082ce43f59).
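One way to check whether a timeout is actually honored is to time the wrapped call. The sketch below is not the linked gist; measure_with_timeout is a hypothetical helper that reports how the block ended and how long it really took, using a monotonic clock.

```ruby
require "timeout"

# Hypothetical measurement helper (not from the gist): runs the block under
# Timeout.timeout and returns [outcome, elapsed_seconds].
# Only Timeout::Error is rescued; other errors (e.g. SocketError from a
# failed lookup) propagate to the caller.
def measure_with_timeout(seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  outcome =
    begin
      Timeout.timeout(seconds) { yield }
      :completed
    rescue Timeout::Error
      :timed_out
    end
  [outcome, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# An interruptible call such as sleep is cut off close to the limit.
outcome, elapsed = measure_with_timeout(0.1) { sleep 1 }
puts "#{outcome} after #{elapsed.round(2)}s"
```

Wrapping TCPSocket.open for a host whose lookup hangs in the same helper would, per the report above, show an elapsed time near the system resolver timeout rather than the requested limit on affected systems.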

This problem is not hypothetical; it's something we've been seeing in production fairly often: even with open/read timeouts on Net::HTTP as low as a second, the Ruby process would still be blocked for 10s (the system's resolver timeout) in case of DNS issues. On web servers with blocking IO (e.g. Unicorn) this causes a loss of capacity.


Files

resolv.patch (2.9 KB), kirs (Kir Shatrov), 11/29/2019 01:24 PM
#1

Updated by kirs (Kir Shatrov) 4 months ago

  • Description updated (diff)
#2

Updated by kirs (Kir Shatrov) 4 months ago

  • Description updated (diff)

Updated by alanwu (Alan Wu) 4 months ago

On second thought, I'm not thrilled about adding a new config option like this.
I think name resolution is logically part of opening a socket, so I would expect a Net::OpenTimeout if name resolution takes longer than the specified amount.

On the other hand, it seems that effectively cancelling name resolution is hard to do currently, especially on systems that don't have getaddrinfo_a.
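The alternative described in this comment, treating open_timeout as one budget that covers both name resolution and the TCP handshake, could be sketched roughly as follows. remaining_connect_budget is a hypothetical helper, not an existing Net::HTTP method, and the sketch assumes the time spent in name resolution can be measured and bounded in the first place, which is the hard part noted above.

```ruby
require "net/http" # defines Net::OpenTimeout

# Hypothetical helper: open_timeout is a single budget shared by both phases.
# Given the seconds already spent on name resolution, returns what is left
# for the TCP handshake, or raises Net::OpenTimeout if nothing is left.
def remaining_connect_budget(open_timeout, dns_elapsed)
  remaining = open_timeout - dns_elapsed
  if remaining <= 0
    raise Net::OpenTimeout, "name resolution used the whole open_timeout"
  end
  remaining
end

# With open_timeout: 1 and a 0.9s lookup, about 0.1s remains for connecting.
puts remaining_connect_budget(1.0, 0.9).round(2)
```

Under this model the total wait never exceeds open_timeout, at the cost of giving the connect phase less time after a slow lookup.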
