Feature #20108 (closed)
Introduction of Happy Eyeballs Version 2 (RFC8305) in Socket.tcp
Description
This is an implementation of Happy Eyeballs version 2 (RFC 8305) in Socket.tcp.
Background
Currently, Socket.tcp synchronously resolves names and makes connection attempts with Addrinfo::foreach.
This implementation has the following two problems.
- In hostname resolution, the program blocks until the DNS server has responded to all DNS queries.
- In a connection attempt, while a connection to one IP address is pending and taking time, the program blocks, and the other resolved IP addresses are not tried.
Proposal
"Happy Eyeballs" (RFC 8305) is an algorithm to solve this kind of problem. It avoids delays to the user whenever possible and also uses IPv6 preferentially.
I implemented it in Socket.tcp by using Addrinfo.getaddrinfo in a thread spawned per address family to resolve the hostname asynchronously, and by using Socket#connect_nonblock to try to connect to multiple addrinfos in parallel.
See https://github.com/ruby/ruby/pull/9374
Outcome
This change eliminates a fatal defect in the following cases.
Case 1. One of the A or AAAA DNS queries does not return
require 'socket'

class Addrinfo
  class << self
    # Current Socket.tcp depends on foreach
    def foreach(nodename, service, family=nil, socktype=nil, protocol=nil, flags=nil, timeout: nil, &block)
      getaddrinfo(nodename, service, Socket::AF_INET6, socktype, protocol, flags, timeout: timeout)
        .concat(getaddrinfo(nodename, service, Socket::AF_INET, socktype, protocol, flags, timeout: timeout))
        .each(&block)
    end

    # Stub: the AAAA lookup never returns, the A lookup answers immediately
    def getaddrinfo(_, _, family, *_)
      case family
      when Socket::AF_INET6 then sleep
      when Socket::AF_INET then [Addrinfo.tcp("127.0.0.1", 4567)]
      end
    end
  end
end

Socket.tcp("localhost", 4567)
The current Socket.tcp blocks while waiting for the IPv6 name resolution, so the program stops in this case and never starts connecting to the IPv4 address.
Socket.tcp with HEv2, by contrast, can promptly start a connection attempt to the IPv4 address.
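The per-family resolution idea can be illustrated with a small sketch. This is not the code from the pull request: `resolve_per_family` and the `stalled_aaaa` stub are hypothetical names, and the resolver is injected so that the stalled AAAA lookup from Case 1 can be simulated without touching DNS.

```ruby
require 'socket'

# Hypothetical helper, not the code from the pull request: resolve each address
# family in its own thread and push [family, result] onto a queue on arrival.
def resolve_per_family(host, port, resolver:, families: [Socket::AF_INET6, Socket::AF_INET])
  results = Queue.new
  threads = families.map do |family|
    Thread.new do
      results << [family, resolver.call(host, port, family)]
    rescue StandardError => e
      results << [family, e]   # report failures instead of losing them
    end
  end
  [results, threads]
end

# Stub resolver mirroring Case 1: the AAAA lookup never returns.
stalled_aaaa = lambda do |_host, port, family|
  family == Socket::AF_INET6 ? sleep : [Addrinfo.tcp("127.0.0.1", port)]
end

results, threads = resolve_per_family("localhost", 4567, resolver: stalled_aaaa)
family, addrinfos = results.pop   # returns as soon as the A lookup finishes
threads.each(&:kill)              # abandon the stalled lookup

puts family == Socket::AF_INET
puts addrinfos.first.inspect_sockaddr
```

With a real resolver one might pass something like `->(h, p, f) { Addrinfo.getaddrinfo(h, p, f, :STREAM) }`. The sketch also ignores the Resolution Delay that RFC 8305 applies before falling back to IPv4.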
Case 2. Server does not promptly return an ACK for the SYN on one of the IPv4 / IPv6 address families
require 'socket'

fork do
  socket = Socket.new(Socket::AF_INET6, :STREAM)
  socket.setsockopt(:SOCKET, :REUSEADDR, true)
  socket.bind(Socket.pack_sockaddr_in(4567, '::1'))
  sleep
  socket.listen(1)
  connection, _ = socket.accept
  connection.close
  socket.close
end

fork do
  socket = Socket.new(Socket::AF_INET, :STREAM)
  socket.setsockopt(:SOCKET, :REUSEADDR, true)
  socket.bind(Socket.pack_sockaddr_in(4567, '127.0.0.1'))
  socket.listen(1)
  connection, _ = socket.accept
  connection.close
  socket.close
end

Socket.tcp("localhost", 4567)
The current Socket.tcp tries to connect serially, so when name resolution returns an IPv6 address first and it initiates a connection to the IPv6 server, this server does not return an ACK, and the program stops.
Socket.tcp with HEv2, by contrast, staggers its connection attempts and runs them in parallel, so a connection can be established promptly on the socket that attempted to connect to the IPv4 server.
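A rough sketch of the parallel connection attempt follows. It is a simplification under stated assumptions, not the pull request's implementation: `connect_first` is a hypothetical helper, and it starts every attempt at once rather than staggering them by the Connection Attempt Delay that RFC 8305 describes.

```ruby
require 'socket'

# Hypothetical helper, not the code from the pull request: start a non-blocking
# connect to every address, then return the first socket that connects.
def connect_first(addrinfos, timeout: 5)
  pending = []
  addrinfos.each do |ai|
    begin
      sock = Socket.new(ai.afamily, :STREAM)
      # Returns 0 when connected at once, :wait_writable while in progress.
      if sock.connect_nonblock(ai, exception: false) == 0
        pending.each(&:close)
        return sock
      end
      pending << sock
    rescue SystemCallError
      sock&.close   # e.g. ECONNREFUSED or ENETUNREACH: move on to the next one
    end
  end

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until pending.empty?
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    ready = remaining.positive? && IO.select(nil, pending, nil, remaining)
    break unless ready

    ready[1].each do |sock|
      pending.delete(sock)
      # SO_ERROR is 0 once the handshake has completed successfully.
      if sock.getsockopt(:SOCKET, :ERROR).int.zero?
        pending.each(&:close)
        return sock
      end
      sock.close
    end
  end
  pending.each(&:close)
  nil
end

# Usage against a throwaway local server (port 0 lets the OS pick a free port).
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
sock = connect_first([Addrinfo.tcp("127.0.0.1", port)])
puts sock.remote_address.inspect_sockaddr
sock.close
server.close
```

The helper returns nil when no attempt succeeds before the timeout, so a caller would need to turn that into an error of its choosing.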
In exchange, the performance of Socket.tcp with HEv2 will be degraded.
100.times { Socket.tcp("www.ruby-lang.org", 80) }
# Socket.tcp (Before) 0.123809
# Socket.tcp (After) 0.224684
This is due to the addition of the creation of IO objects, Thread objects, etc., and calls to IO::select in the implementation.
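The post does not show the timing harness behind these numbers. One way to take a comparable measurement without generating traffic to a public host might look like the sketch below; the local throwaway server and the iteration count are assumptions made for illustration, so the absolute numbers will not match the ones above.

```ruby
require 'socket'
require 'benchmark'

# Throwaway local server so the measurement does not hit a public host.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
acceptor = Thread.new { loop { server.accept.close } }

# Wall-clock time for a batch of connect/close cycles.
elapsed = Benchmark.realtime do
  20.times { Socket.tcp("127.0.0.1", port).close }
end
puts format("Socket.tcp x20: %.6f s", elapsed)

acceptor.kill
server.close
```

Because the target here is an IP address on the loopback interface, this mostly measures per-call overhead rather than name resolution or network latency.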
Updated by shugo (Shugo Maeda) 10 months ago
shioimm (Misaki Shioi) wrote:
In exchange, the performance of Socket.tcp with HEv2 will be degraded.
100.times { Socket.tcp("www.ruby-lang.org", 80) }
# Socket.tcp (Before) 0.123809
# Socket.tcp (After) 0.224684
This is due to the addition of the creation of IO objects, Thread objects, etc., and calls to IO::select in the implementation.
Is there no way to disable Happy Eyeballs?
I'm not sure, but it may have more impact in local networks.
Updated by shioimm (Misaki Shioi) 10 months ago
shugo (Shugo Maeda) wrote in #note-1:
Is there no way to disable Happy Eyeballs?
I'm not sure, but it may have more impact in local networks.
There is no way to disable it; HE is intended to avoid fatal delays. Introducing a way to disable it for performance seems to me to add complexity.
This is the result of 100 runs on the local network.
Before
      user     system      total        real
  0.002695   0.010630   0.013325 (  0.026457)
After
      user     system      total        real
  0.009211   0.024623   0.033834 (  0.034990)
However, when an already-resolved IP address is passed as the first argument of Socket.tcp, the overhead could be reduced.
I will try to implement this. Thank you.
Updated by jeremyevans0 (Jeremy Evans) 10 months ago
If you are running a dual-stack network setup, it seems reasonable to have this enabled by default. However, if you are running a pure-IPv4 or pure-IPv6 network, this adds a lot of overhead for no benefit. I think three changes should be made if you want to enable this by default:
- If only one address family is used, Socket.tcp should use the previous implementation (the pull request uses the Happy Eyeballs implementation even if only one address family is used).
- Currently in the pull request, if you provide a local_host and local_port, it can determine which address families are in use, which would allow it to determine if there is only one. However, the pull request currently assumes both IPv4 and IPv6 if local_host and local_port are not provided. As few calls to Socket.tcp set local_host and local_port, on non-dual-stack hosts, this assumption is going to be wrong most of the time. You should add a way to determine which address families could be used even if local_host and local_port are not set. Potentially, using Socket.ip_address_list and filtering out loopback and link-local addresses could be used to determine if the machine is actually dual-stack.
- There should be a way to disable this and use the previous implementation, for users who do not want it.
Until those changes are made, I think this should be opt-in.
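The Socket.ip_address_list idea could be sketched roughly as follows. `routable_families` and `dual_stack?` are hypothetical helper names, and the string check for IPv4 link-local addresses (169.254.0.0/16) is an assumption added here for illustration.

```ruby
require 'socket'

# Hypothetical helper: address families that have at least one address which is
# neither loopback nor link-local.
def routable_families(addrinfos = Socket.ip_address_list)
  addrinfos.reject { |ai|
    ai.ipv4_loopback? || ai.ipv6_loopback? || ai.ipv6_linklocal? ||
      (ai.ipv4? && ai.ip_address.start_with?("169.254."))   # IPv4 link-local
  }.map(&:afamily).uniq
end

# Hypothetical helper: true only when both families have a usable address.
def dual_stack?(addrinfos = Socket.ip_address_list)
  families = routable_families(addrinfos)
  families.include?(Socket::AF_INET) && families.include?(Socket::AF_INET6)
end

puts dual_stack?   # result depends on the machine running it
```

Having an address in both families does not guarantee working routes for both, so a check like this can only be a heuristic.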
Updated by shugo (Shugo Maeda) 10 months ago
shioimm (Misaki Shioi) wrote in #note-2:
shugo (Shugo Maeda) wrote in #note-1:
Is there no way to disable Happy Eyeballs?
I'm not sure, but it may have more impact in local networks.
There is no way to disable it; HE is intended to avoid fatal delays. Introducing a way to disable it for performance seems to me to add complexity.
Thanks for your reply.
I agree with Jeremy; at least, there should be a way to disable it.
Updated by shioimm (Misaki Shioi) 10 months ago
Thank you for your comments.
I agree that I need to reconsider the current implementation, as resolving the hostname in another thread is unneeded when there is only one address family in use.
However, even when there is only one address family, there may still be multiple IP addresses available. In such cases, it still makes sense to attempt connections in parallel.
Therefore, I would like to modify the current implementation as follows:
- If there is only one destination address family (namely, when a string representing an IP address, rather than a hostname, is passed as an argument to Socket.tcp), resolve the name on the main thread and start the connection immediately.
- If it is clear that the host machine is single-stack (namely, if Socket.ip_address_list returns Addrinfo objects of only one address family), resolve the name on the main thread and start the connection immediately.
In addition to this, if only one of the address families is used and there is only one IP address as a result of name resolution, I would like to use blocking connect to make the connection.
I will measure again how performance changes under certain conditions as a result of doing this, but even then, do you think we need to provide a way to disable it?
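The first condition, telling an IP address string apart from a hostname, could be checked along these lines. `ip_literal?` is a hypothetical helper and not necessarily how the pull request does it; it relies on the AI_NUMERICHOST flag, which makes getaddrinfo reject anything that is not a numeric address instead of querying DNS.

```ruby
require 'socket'

# Hypothetical helper: true when host is a numeric IP address rather than a
# hostname. AI_NUMERICHOST prevents any DNS lookup from being made.
def ip_literal?(host)
  Addrinfo.getaddrinfo(host, nil, nil, :STREAM, nil, Socket::AI_NUMERICHOST)
  true
rescue SocketError
  false
end

puts ip_literal?("127.0.0.1")
puts ip_literal?("::1")
puts ip_literal?("www.ruby-lang.org")
```

When the check succeeds there is only one destination address family, so name resolution can stay on the calling thread.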
(Added on January 19)
In addition to this, TCPSocket.new has a performance advantage over Socket.tcp, and I think the latter may be preferred, especially for use cases where speed at runtime is important.
I would like to work on introducing Happy Eyeballs to TCPSocket.new in the future, but since TCPSocket.new is implemented in C, I expect it to have less overhead compared to changes made to Socket.tcp.
Updated by shugo (Shugo Maeda) 10 months ago
shioimm (Misaki Shioi) wrote in #note-5:
Thank you for your comments.
I agree that I need to reconsider the current implementation, as resolving the hostname in another thread is unneeded when there is only one address family in use.
However, even when there is only one address family, there may still be multiple IP addresses available. In such cases, it still makes sense to attempt connections in parallel.
Maybe, but HE may be slower in such cases as your benchmark shows.
So I think it's better to provide a way to disable HE such as Socket.fast_fallback = false and Socket.tcp(..., fast_fallback: false) (the name fast_fallback came from okhttp).
It would also be useful when HE has an implementation problem.
(Added on January 19)
In addition to this, TCPSocket.new has a performance advantage over Socket.tcp, and I think the latter may be preferred, especially for use cases where speed at runtime is important.
I would like to work on introducing Happy Eyeballs to TCPSocket.new in the future, but since TCPSocket.new is implemented in C, I expect it to have less overhead compared to changes made to Socket.tcp.
Users of libraries such as Net::HTTP, Net::FTP, and Net::IMAP can't use TCPSocket.new easily.
Updated by shugo (Shugo Maeda) 10 months ago
shioimm (Misaki Shioi) wrote in #note-2:
There is no way to disable it; HE is intended to avoid fatal delays. Introducing a way to disable it for performance seems to me to add complexity.
Speaking of complexity, how about leaving the original implementation of Socket.tcp as Socket.tcp_without_fast_fallback and calling it when HE is disabled?
@tcp_fast_fallback = true

class <<self
  attr_accessor :tcp_fast_fallback
end

def self.tcp(host, port, local_host = nil, local_port = nil, connect_timeout: nil, resolv_timeout: nil, fast_fallback: tcp_fast_fallback, &block) # :yield: socket
  unless fast_fallback
    return tcp_without_fast_fallback(host, port, local_host, local_port, connect_timeout:, resolv_timeout:, &block)
  end
  ...
end
It's not DRY, but it is a low-risk approach.
My concern is not only performance, but implementation stability.
The HE implementation uses threads, so there may be a problem that rarely occurs.
Updated by shioimm (Misaki Shioi) 10 months ago
shugo (Shugo Maeda) wrote in #note-6:
Users of libraries such as Net::HTTP, Net::FTP, and Net::IMAP can't use TCPSocket.new easily.
Net::HTTP uses TCPSocket.open (new), not Socket.tcp.
https://github.com/ruby/net-http/blob/edc99a54b2c2888759068e2627f2f26c5f505352/lib/net/http.rb#L1607
But I was not aware that Net::FTP and Net::IMAP use Socket.tcp.
Maybe, but HE may be slower in such cases as your benchmark shows.
I would like to measure the performance of the implementation first to see how much slower it is when connected in parallel with a single stack compared to the current implementation (sorry, this version is not implemented yet).
But I share your concerns about the stability of using threads.
For now, I would like to try implementing a version that uses the tcp_fast_fallback option.
Updated by shugo (Shugo Maeda) 10 months ago
shioimm (Misaki Shioi) wrote in #note-8:
shugo (Shugo Maeda) wrote in #note-6:
Users of libraries such as Net::HTTP, Net::FTP, and Net::IMAP can't use TCPSocket.new easily.
Net::HTTP uses TCPSocket.open (new), not Socket.tcp.
https://github.com/ruby/net-http/blob/edc99a54b2c2888759068e2627f2f26c5f505352/lib/net/http.rb#L1607
But I was not aware that Net::FTP and Net::IMAP use Socket.tcp.
Oh, I missed it.
But why not use Socket.tcp in Net::HTTP?
HTTP is one of the most important use cases, isn't it?
Maybe, but HE may be slower in such cases as your benchmark shows.
I would like to measure the performance of the implementation first to see how much slower it is when connected in parallel with a single stack compared to the current implementation (sorry, this version is not implemented yet).
But I share your concerns about the stability of using threads.
For now, I would like to try implementing a version that uses the tcp_fast_fallback option.
Thanks for your consideration.
However, it's OK with me to merge the PR without the option for now.
Let's experiment with it until the release.
Updated by shioimm (Misaki Shioi) 9 months ago
Regarding the content mentioned above, I have made the following changes to the commit:
https://github.com/ruby/ruby/pull/9374/commits/461b75830599408feca086f7f6719b8426008802
- Improved performance in the case where there is only one address family to be name resolved
- Improved performance in the case where there is only one name-resolved IP address.
- Added fast_fallback option.
Here is the commit.
As a result, the performance changed as follows (please forgive the reduced number of attempts, as I do not intend a DoS attack on ruby-lang.org)
require 'socket'
require 'benchmark'

HOSTNAME = "www.ruby-lang.org"
PORT = 80

ai = Addrinfo.tcp(HOSTNAME, PORT)

Benchmark.bmbm do |x|
  x.report("Domain name") do
    30.times { Socket.tcp(HOSTNAME, PORT).close }
  end

  x.report("IP Address") do
    30.times { Socket.tcp(ai.ip_address, PORT).close }
  end

  x.report("fast_fallback: false") do
    30.times { Socket.tcp(HOSTNAME, PORT, fast_fallback: false).close }
  end
end
                           user     system      total        real
Domain name            0.015567   0.032511   0.048078 (  0.325284)
IP Address             0.004458   0.014219   0.018677 (  0.284361)
fast_fallback: false   0.005869   0.021511   0.027380 (  0.321891)
And this is the measurement result when executed in a single stack environment.
                           user     system      total        real
Domain name            0.007062   0.019276   0.026338 (  1.905775)
IP Address             0.004527   0.012176   0.016703 (  3.051192)
fast_fallback: false   0.005546   0.019426   0.024972 (  1.775798)
The following is the result of the run on Ruby 3.3.0.
(on Dual stack environment)
                           user     system      total        real
Ruby 3.3.0             0.007271   0.027410   0.034681 (  0.472510)
(on Single stack environment)
                           user     system      total        real
Ruby 3.3.0             0.005353   0.018898   0.024251 (  1.774535)
Updated by shugo (Shugo Maeda) 9 months ago
shioimm (Misaki Shioi) wrote in #note-10:
Regarding the content mentioned above, I have made the following changes to the commit:
https://github.com/ruby/ruby/pull/9374/commits/461b75830599408feca086f7f6719b8426008802
- Improved performance in the case where there is only one address family to be name resolved
- Improved performance in the case where there is only one name-resolved IP address.
- Added fast_fallback option.
Thank you. The fast_fallback option of Socket.tcp and Socket.tcp_fast_fallback look fine to me.
Updated by shioimm (Misaki Shioi) 8 months ago
As previously posted, I was considering a way to avoid name resolution within a thread when the host is single-stack.
However, it became clear that determining whether the host is single-stack carries a significant cost, and that name resolution incurs even more additional cost when the host turns out not to be single-stack.
To avoid this, I was considering caching the results of determining whether a host is single-stack or not, but found that this could cause unexpected bugs when switching networks.
Therefore, I have decided to abandon the determination of whether the host is single-stack and proceed with initiating a thread for name resolution unless an IP address is directly specified.
Note that the fast_fallback option remains available.
This PR was merged today. Thank you very much.
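For readers who want to try the option, a minimal usage sketch follows. The local throwaway server is an assumption made so the example does not depend on an external host, and the keyword is detected at runtime because it only exists on Ruby versions that include this change.

```ruby
require 'socket'

server = TCPServer.new("127.0.0.1", 0)   # throwaway local server
port = server.addr[1]

# The keyword only exists on Ruby versions that include this change, so
# detect it rather than assume it.
has_option = Socket.method(:tcp).parameters.include?([:key, :fast_fallback])
opts = has_option ? { fast_fallback: false } : {}

# With a block, Socket.tcp closes the socket and returns the block's value.
peer_port = Socket.tcp("127.0.0.1", port, **opts) { |sock| sock.remote_address.ip_port }
puts "connected to port #{peer_port} (fast_fallback keyword available: #{has_option})"
server.close
```

On versions with the change, the process-wide default discussed earlier in this thread is exposed as Socket.tcp_fast_fallback.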
Updated by hsbt (Hiroshi SHIBATA) 8 months ago
- Status changed from Open to Closed
This proposal has been merged at https://github.com/ruby/ruby/commit/9ec342e07df6aa5e2c2e9003517753a2f1b508fd
Updated by hsbt (Hiroshi SHIBATA) 7 months ago
- Related to Feature #17525: Implement Happy Eyeballs Version 2 (RFC8305) in Socket.tcp added
Updated by hsbt (Hiroshi SHIBATA) 7 months ago
- Related to Feature #15628: init_inetsock_internal should fallback to IPv4 if IPv6 is unreachable added