Bug #14997

closed

Socket connect timeout exceeds the timeout value for

Added by maciej.mensfeld (Maciej Mensfeld) over 5 years ago. Updated over 3 years ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:88500]

Description

Consider a domain that resolves to multiple IP addresses (4 in the following example):

dig debug-xyz.elb.us-east-1.amazonaws.com a

; <<>> DiG 9.10.3-P4-Ubuntu <<>> debug-xyz.elb.us-east-1.amazonaws.com a
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54375
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;debug-xyz.elb.us-east-1.amazonaws.com. IN A

;; ANSWER SECTION:
debug-xyz.elb.us-east-1.amazonaws.com. 60 IN A 172.31.86.79
debug-xyz.elb.us-east-1.amazonaws.com. 60 IN A 172.31.109.24
debug-xyz.elb.us-east-1.amazonaws.com. 60 IN A 172.31.119.55
debug-xyz.elb.us-east-1.amazonaws.com. 60 IN A 172.31.71.167

;; Query time: 4 msec
;; SERVER: 172.31.0.2#53(172.31.0.2)
;; WHEN: Tue Aug 14 13:46:18 UTC 2018
;; MSG SIZE  rcvd: 132

When connect_timeout is set to a certain value (N) for such a domain, the timeout is applied to each resolved address separately. If the endpoints are unresponsive and don't immediately raise an exception, the overall time spent connecting can reach N * 4.
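The arithmetic can be illustrated with a small simulation. This is only a sketch: `simulate_sequential_connect` is a hypothetical name, no real sockets are opened, and each attempt is modeled as blocking for exactly the per-address timeout. It is not the actual `Socket.tcp` implementation.

```ruby
# Simulation only: addresses are tried one after another, and each attempt
# gets its own full timeout budget. Hypothetical helper, no network access.
def simulate_sequential_connect(addresses, connect_timeout)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  addresses.each do |_ip|
    # Stand-in for an unresponsive endpoint: the attempt blocks until the
    # per-address timeout fires, then the next address is tried.
    sleep(connect_timeout)
  end
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

ips = %w[172.31.86.79 172.31.109.24 172.31.119.55 172.31.71.167]
elapsed = simulate_sequential_connect(ips, 0.1)
puts format("waited %.2fs across %d addresses", elapsed, ips.size)
```

With four addresses and a 100ms per-address timeout, the total wait is roughly four times the configured value.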

This can disrupt some time-sensitive systems.

We've experienced it with the following setup:

  • TCP server (event machine) behind an AWS NLB
  • TCP server process goes down behind NLB but NLB is still responsive
  • Socket connect_timeout is set to 100ms
  • AWS NLB keeps the connection in the waiting state hoping that the service behind it will get back to normal (but it doesn't)
  • Ruby times out after 100ms
  • Ruby tries to connect to the next IP from the pool (AWS NLB again)
  • Due to 4 hosts resolving, the overall timeout is 400ms.
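A way to observe this from the client side is to time a single `Socket.tcp` call. The sketch below assumes a hypothetical helper named `timed_tcp_connect`; the host and port in the commented usage line are placeholders, not values from our setup.

```ruby
require "socket"

# Hypothetical helper: performs one Socket.tcp call and returns
# [socket_or_exception, elapsed_seconds].
def timed_tcp_connect(host, port, connect_timeout)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result =
    begin
      Socket.tcp(host, port, connect_timeout: connect_timeout)
    rescue SystemCallError, SocketError => e
      e # return the error so the caller can inspect it alongside the timing
    end
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Usage (placeholder port, shown as an assumption):
# result, elapsed = timed_tcp_connect("debug-xyz.elb.us-east-1.amazonaws.com", 9000, 0.1)
# puts "#{result.class} after #{(elapsed * 1000).round}ms"
```

Against a host that resolves to four unresponsive addresses, the elapsed time would be expected to approach 400ms rather than the configured 100ms.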

I'm not sure whether this should be classified as a bug or a feature, but I believe it should definitely be documented, or there should be an option to enforce a "hard" overall limit on the connect time.
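Until such an option exists, a "hard" limit can be approximated in application code. The following is a minimal sketch, not part of Ruby's socket library: `tcp_connect_with_deadline` and its `per_attempt:` and `total:` parameters are hypothetical. It walks the resolved addresses with `Addrinfo.foreach` and caps every `Addrinfo#connect` attempt by whatever remains of the overall budget. Name resolution itself is not covered by the deadline.

```ruby
require "socket"

# Hypothetical helper: tries each resolved address in turn, but never lets
# the sum of all connect attempts exceed `total` seconds.
def tcp_connect_with_deadline(host, port, per_attempt:, total:)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + total
  last_error = nil

  Addrinfo.foreach(host, port, nil, :STREAM) do |ai|
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    break if remaining <= 0 # overall budget used up, stop trying addresses

    begin
      # Each attempt gets the smaller of its own timeout and the time left.
      return ai.connect(timeout: [per_attempt, remaining].min)
    rescue SystemCallError => e
      last_error = e # remember the failure and move on to the next address
    end
  end

  raise(last_error || Errno::ETIMEDOUT.new("overall connect deadline exceeded"))
end

# Usage (placeholder port, shown as an assumption):
# sock = tcp_connect_with_deadline("debug-xyz.elb.us-east-1.amazonaws.com", 9000,
#                                  per_attempt: 0.1, total: 0.1)
```

Setting `total` equal to `per_attempt` means an unresponsive first address consumes the whole budget and the remaining addresses are never tried; a larger `total` trades latency for more attempts.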

Here's the code actually responsible for this behavior: https://github.com/ruby/ruby/blob/trunk/ext/socket/lib/socket.rb#L631-L664


Related issues: 1 (0 open, 1 closed)

Related to Ruby master - Feature #15553: Addrinfo.getaddrinfo supports timeout (Closed, Glass_saga (Masaki Matsushita))