Feature #20108

Introduction of Happy Eyeballs Version 2 (RFC8305) in Socket.tcp

Added by shioimm (Misaki Shioi) 4 months ago. Updated 2 months ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:115985]

Description

This is an implementation of Happy Eyeballs version 2 (RFC 8305) in Socket.tcp.

Background

Currently, Socket.tcp synchronously resolves names and makes connection attempts with Addrinfo::foreach.
This implementation has the following two problems.

  1. In hostname resolution, the program blocks until the DNS server has responded to all DNS queries.
  2. In a connection attempt, while a connection to one IP address is pending and taking time, the program blocks, and the other resolved IP addresses cannot be tried.

Proposal

"Happy Eyeballs" (RFC 8305) is an algorithm to solve this kind of problem. It avoids delays to the user whenever possible and also uses IPv6 preferentially.
I implemented it in Socket.tcp by calling Addrinfo.getaddrinfo in a separate thread per address family to resolve the hostname asynchronously, and by using Socket#connect_nonblock to attempt connections to multiple addrinfo in parallel.

See https://github.com/ruby/ruby/pull/9374
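
As an illustration of the per-family resolution described above, here is a minimal sketch. It is not the code from the pull request. The method name resolve_each_family and the resolver: keyword are hypothetical and exist only so that the sketch can be exercised without a real DNS server.

```ruby
require 'socket'

# Resolve +host+ once per address family, each in its own thread, and yield
# each family's result as soon as it arrives, in arrival order.
# +resolver+ is a hypothetical injection point (not part of Socket.tcp).
def resolve_each_family(host, port, resolver: Addrinfo.method(:getaddrinfo))
  results = Queue.new
  families = [Socket::AF_INET6, Socket::AF_INET]

  threads = families.map do |family|
    Thread.new do
      addrinfos = begin
        resolver.call(host, port, family, :STREAM)
      rescue SocketError
        [] # this family could not be resolved
      end
      results << [family, addrinfos]
    end
  end

  families.size.times do
    family, addrinfos = results.pop
    yield family, addrinfos
  end
ensure
  # Stop resolver threads that are still running, e.g. after the caller's
  # block used +break+ once it had an address it could use.
  threads&.each(&:kill)
end
```

A caller can break out of the block as soon as one family has produced a usable address, so a query that never returns for the other family no longer blocks the program.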

Outcome

This change eliminates a fatal defect in the following cases.

Case 1. One of the A or AAAA DNS queries does not return

require 'socket'

class Addrinfo
  class << self
    # Current Socket.tcp depends on foreach
    def foreach(nodename, service, family=nil, socktype=nil, protocol=nil, flags=nil, timeout: nil, &block)
      getaddrinfo(nodename, service, Socket::AF_INET6, socktype, protocol, flags, timeout: timeout)
        .concat(getaddrinfo(nodename, service, Socket::AF_INET, socktype, protocol, flags, timeout: timeout))
        .each(&block)
    end

    def getaddrinfo(_, _, family, *_)
      case family
      when Socket::AF_INET6 then sleep
      when Socket::AF_INET then [Addrinfo.tcp("127.0.0.1", 4567)]
      end
    end
  end
end

Socket.tcp("localhost", 4567)

Because the IPv6 name resolution never returns, the current Socket.tcp blocks in this case and cannot start connecting to the IPv4 address.
Socket.tcp with HEv2, by contrast, can promptly start a connection attempt to the IPv4 address.

Case 2. The server does not promptly return an ACK for the SYN on one of the IPv4 / IPv6 address families

require 'socket'

fork do
  socket = Socket.new(Socket::AF_INET6, :STREAM)
  socket.setsockopt(:SOCKET, :REUSEADDR, true)
  socket.bind(Socket.pack_sockaddr_in(4567, '::1'))
  sleep
  socket.listen(1)
  connection, _ = socket.accept
  connection.close
  socket.close
end

fork do
  socket = Socket.new(Socket::AF_INET, :STREAM)
  socket.setsockopt(:SOCKET, :REUSEADDR, true)
  socket.bind(Socket.pack_sockaddr_in(4567, '127.0.0.1'))
  socket.listen(1)
  connection, _ = socket.accept
  connection.close
  socket.close
end

Socket.tcp("localhost", 4567)

The current Socket.tcp tries to connect serially, so when name resolution returns an IPv6 address first and a connection to the IPv6 server is initiated, the server does not return an ACK and the program blocks.
Socket.tcp with HEv2, by contrast, starts connection attempts one after another and lets them run in parallel, so a connection can be established promptly on the socket that attempted to connect to the IPv4 server.
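
The parallel connection attempts can be sketched roughly as follows. This is a simplified illustration, not the merged implementation: it starts every attempt at once and omits the staggered start that RFC 8305 describes. The method name connect_first and its timeout: keyword are hypothetical.

```ruby
require 'socket'

# Start a non-blocking connect to every address and return the first socket
# whose connection completes. Hypothetical helper for illustration only.
def connect_first(addrinfos, timeout: 5)
  pending = []
  addrinfos.each do |ai|
    sock = Socket.new(ai.afamily, :STREAM)
    begin
      if sock.connect_nonblock(ai, exception: false) == 0
        pending.each(&:close)
        return sock            # connected immediately
      end
      pending << sock          # :wait_writable, the attempt is in progress
    rescue SystemCallError
      sock.close               # failed immediately, e.g. ECONNREFUSED
    end
  end

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until pending.empty?
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    break if remaining <= 0
    _, writable, = IO.select(nil, pending, nil, remaining)
    break unless writable      # select timed out
    writable.each do |sock|
      pending.delete(sock)
      # SO_ERROR is 0 when the asynchronous connect succeeded.
      if sock.getsockopt(:SOCKET, :ERROR).int.zero?
        pending.each(&:close)
        return sock
      end
      sock.close
    end
  end
  pending.each(&:close)
  raise IOError, "could not connect to any address"
end
```

With this shape, an address whose peer never answers only delays the result until another attempt succeeds or the timeout expires.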

In exchange, the performance of Socket.tcp with HEv2 will be degraded.

100.times { Socket.tcp("www.ruby-lang.org", 80) }
# Socket.tcp (Before) 0.123809
# Socket.tcp (After)  0.224684

This is due to the addition of the creation of IO objects, Thread objects, etc., and calls to IO::select in the implementation.


Related issues 2 (0 open, 2 closed)

Related to Ruby master - Feature #17525: Implement Happy Eyeballs Version 2 (RFC8305) in Socket.tcp - Closed - Glass_saga (Masaki Matsushita)
Related to Ruby master - Feature #15628: init_inetsock_internal should fallback to IPv4 if IPv6 is unreachable - Closed - Glass_saga (Masaki Matsushita)

Updated by shugo (Shugo Maeda) 3 months ago

shioimm (Misaki Shioi) wrote:

In exchange, the performance of Socket.tcp with HEv2 will be degraded.

100.times { Socket.tcp("www.ruby-lang.org", 80) }
# Socket.tcp (Before) 0.123809
# Socket.tcp (After)  0.224684

This is due to the addition of the creation of IO objects, Thread objects, etc., and calls to IO::select in the implementation.

Is there no way to disable Happy Eyeballs?
I'm not sure, but it may have more impact in local networks.

Updated by shioimm (Misaki Shioi) 3 months ago

shugo (Shugo Maeda) wrote in #note-1:

Is there no way to disable Happy Eyeballs?
I'm not sure, but it may have more impact in local networks.

There is no way to disable it; HE is intended to avoid fatal delays. Introducing a way to disable it for performance seems to me to add complexity.
This is the result of 100 runs on the local network.

Before

       user     system      total        real
   0.002695   0.010630   0.013325 (  0.026457)

After

       user     system      total        real
   0.009211   0.024623   0.033834 (  0.034990)

However, when an already resolved IP address is passed as the first argument of Socket.tcp, the overhead could be reduced.
I will try to implement this. Thank you.

Updated by jeremyevans0 (Jeremy Evans) 3 months ago

If you are running a dual-stack network setup, it seems reasonable to have this enabled by default. However, if you are running a pure-IPv4 or pure-IPv6 network, this adds a lot of overhead for no benefit. I think three changes should be made if you want to enable this by default:

  1. If only one address family is used, Socket.tcp should use the previous implementation (the pull request uses the Happy Eyeballs implementation even if only one address family is used).

  2. Currently in the pull request, if you provide a local_host and local_port, it can determine which address families are in use, which would allow it to determine if there is only one. However, the pull request currently assumes both IPv4 and IPv6 if local_host and local_port are not provided. As few calls to Socket.tcp set local_host and local_port, on non-dual-stack hosts, this assumption is going to be wrong most of the time. You should add a way to determine which address families could be used even if local_host and local_port are not set. Potentially, using Socket.ip_address_list and filtering out loopback and link-local addresses could be used to determine if the machine is actually dual-stack.

  3. There should be a way to disable this and use the previous implementation, for users who do not want it.

Until those changes are made, I think this should be opt-in.
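
One way the check suggested in point 2 might look is sketched below. The method names routable_families and dual_stack? are hypothetical. Addrinfo has no built-in predicate for IPv4 link-local addresses, so the sketch tests 169.254.0.0/16 with IPAddr. Whether "not loopback and not link-local" is the right criterion is an assumption taken from the suggestion above.

```ruby
require 'socket'
require 'ipaddr'

IPV4_LINK_LOCAL = IPAddr.new("169.254.0.0/16")

# Return the address families (:ipv4, :ipv6) that have at least one address
# which is neither loopback nor link-local.
def routable_families(addrinfos = Socket.ip_address_list)
  families = []
  addrinfos.each do |ai|
    if ai.ipv4?
      next if ai.ipv4_loopback?
      next if IPV4_LINK_LOCAL.include?(IPAddr.new(ai.ip_address))
      families << :ipv4
    elsif ai.ipv6?
      next if ai.ipv6_loopback? || ai.ipv6_linklocal?
      families << :ipv6
    end
  end
  families.uniq
end

# True when both IPv4 and IPv6 appear usable on this machine.
def dual_stack?(addrinfos = Socket.ip_address_list)
  routable_families(addrinfos).size == 2
end
```

Called without arguments, both methods inspect the local interfaces through Socket.ip_address_list; passing an explicit list makes the logic easy to test.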

Updated by shugo (Shugo Maeda) 3 months ago

shioimm (Misaki Shioi) wrote in #note-2:

shugo (Shugo Maeda) wrote in #note-1:

Is there no way to disable Happy Eyeballs?
I'm not sure, but it may have more impact in local networks.

There is no way to disable it; HE is intended to avoid fatal delays. Introducing a way to disable it for performance seems to me to add complexity.

Thanks for your reply.
I agree with Jeremy; at least, there should be a way to disable it.

Updated by shioimm (Misaki Shioi) 3 months ago

Thank you for the comments.
I agree that I need to reconsider the current implementation, as resolving the hostname in another thread is unneeded when there is only one address family in use.
However, even when there is only one address family, there may still be multiple IP addresses available. In such cases, it still makes sense to attempt connections in parallel.

Therefore, I would like to modify the current implementation as follows:

  • If there is only one destination address family (namely, when a string representing an IP address, rather than a hostname, is passed as an argument to Socket.tcp), resolve the name on the main thread and start the connection immediately.
  • If it is clear that the host machine is single-stack (namely, if Socket.ip_address_list returns Addrinfo of only the same address family), resolve the name on the main thread and start the connection immediately.

In addition to this, if only one of the address families is used and there is only one IP address as a result of name resolution, I would like to use blocking connect to make the connection.

I will measure again how performance changes under certain conditions as a result of doing this, but even then, do you think we need to provide a way to disable it?

(Added on January 19)
In addition to this, TCPSocket.new has a performance advantage over Socket.tcp, and I think the former may be preferred, especially for use cases where speed at runtime is important.
I would like to work on introducing Happy Eyeballs to TCPSocket.new in the future, but since TCPSocket.new is implemented in C, I expect it to have less overhead compared to changes made to Socket.tcp.

Updated by shugo (Shugo Maeda) 3 months ago

shioimm (Misaki Shioi) wrote in #note-5:

Thank you for the comments.
I agree that I need to reconsider the current implementation, as resolving the hostname in another thread is unneeded when there is only one address family in use.
However, even when there is only one address family, there may still be multiple IP addresses available. In such cases, it still makes sense to attempt connections in parallel.

Maybe, but HE may be slower in such cases as your benchmark shows.
So I think it's better to provide a way to disable HE such as Socket.fast_fallback = false and Socket.tcp(..., fast_fallback: false) (the name fast_fallback came from okhttp).
It would also be useful when HE has an implementation problem.

(Added on January 19)
In addition to this, TCPSocket.new has a performance advantage over Socket.tcp, and I think the former may be preferred, especially for use cases where speed at runtime is important.
I would like to work on introducing Happy Eyeballs to TCPSocket.new in the future, but since TCPSocket.new is implemented in C, I expect it to have less overhead compared to changes made to Socket.tcp.

Users of libraries such as Net::HTTP, Net::FTP, and Net::IMAP can't use TCPSocket.new easily.

Updated by shugo (Shugo Maeda) 3 months ago

shioimm (Misaki Shioi) wrote in #note-2:

There is no way to disable it; HE is intended to avoid fatal delays. Introducing a way to disable it for performance seems to me to add complexity.

Speaking of complexity, how about leaving the original implementation of Socket.tcp as Socket.tcp_without_fast_fallback and calling it when HE is disabled?

@tcp_fast_fallback = true

class <<self
  attr_accessor :tcp_fast_fallback
end

def self.tcp(host, port, local_host = nil, local_port = nil, connect_timeout: nil, resolv_timeout: nil, fast_fallback: tcp_fast_fallback, &block) # :yield: socket
  unless fast_fallback
    return tcp_without_fast_fallback(host, port, local_host, local_port, connect_timeout:, resolv_timeout:, &block)
  end
  ...
end

It's not DRY, but it is a low-risk approach.

My concern is not only performance, but implementation stability.
The HE implementation uses threads, so there may be a problem that rarely occurs.

Updated by shioimm (Misaki Shioi) 3 months ago

shugo (Shugo Maeda) wrote in #note-6:

Users of libraries such as Net::HTTP, Net::FTP, and Net::IMAP can't use TCPSocket.new easily.

Net::HTTP uses TCPSocket.open(new), not Socket.tcp.
https://github.com/ruby/net-http/blob/edc99a54b2c2888759068e2627f2f26c5f505352/lib/net/http.rb#L1607
But I was not aware that Net::FTP and Net::IMAP use Socket.tcp.

Maybe, but HE may be slower in such cases as your benchmark shows.

I would like to measure the performance of the implementation first to see how much slower it is when connected in parallel with a single stack compared to the current implementation (sorry, this version is not implemented yet).
But I share your concerns about the stability of using threads.
For now, I would like to try implementing a version that uses the tcp_fast_fallback option.

Updated by shugo (Shugo Maeda) 3 months ago

shioimm (Misaki Shioi) wrote in #note-8:

shugo (Shugo Maeda) wrote in #note-6:

Users of libraries such as Net::HTTP, Net::FTP, and Net::IMAP can't use TCPSocket.new easily.

Net::HTTP uses TCPSocket.open(new), not Socket.tcp.
https://github.com/ruby/net-http/blob/edc99a54b2c2888759068e2627f2f26c5f505352/lib/net/http.rb#L1607
But I was not aware that Net::FTP and Net::IMAP use Socket.tcp.

Oh, I missed it.
But why not use Socket.tcp in Net::HTTP?
HTTP is one of the most important use cases, isn't it?

Maybe, but HE may be slower in such cases as your benchmark shows.

I would like to measure the performance of the implementation first to see how much slower it is when connected in parallel with a single stack compared to the current implementation (sorry, this version is not implemented yet).
But I share your concerns about the stability of using threads.
For now, I would like to try implementing a version that uses the tcp_fast_fallback option.

Thanks for your consideration.
However, it's OK for me to merge PR without the option for now.
Let's experiment with it until the release.

Updated by shioimm (Misaki Shioi) 3 months ago

Regarding the content mentioned above, I have made the following changes to the commit:
https://github.com/ruby/ruby/pull/9374/commits/461b75830599408feca086f7f6719b8426008802

  • Improved performance in the case where there is only one address family to be name resolved
  • Improved performance in the case where there is only one name-resolved IP address.
  • Added fast_fallback option.

Here is the commit.

As a result, the performance changed as follows (please forgive the reduced number of attempts, as I do not intend a DoS attack on ruby-lang.org)

require 'socket'
require 'benchmark'

HOSTNAME = "www.ruby-lang.org"
PORT = 80

ai = Addrinfo.tcp(HOSTNAME, PORT)

Benchmark.bmbm do |x|
  x.report("Domain name") do
    30.times { Socket.tcp(HOSTNAME, PORT).close }
  end

  x.report("IP Address") do
    30.times { Socket.tcp(ai.ip_address, PORT).close }
  end

  x.report("fast_fallback: false") do
    30.times { Socket.tcp(HOSTNAME, PORT, fast_fallback: false).close }
  end
end
                           user     system      total        real
Domain name            0.015567   0.032511   0.048078 (  0.325284)
IP Address             0.004458   0.014219   0.018677 (  0.284361)
fast_fallback: false   0.005869   0.021511   0.027380 (  0.321891)

And this is the measurement result when executed in a single stack environment.

                           user     system      total        real
Domain name            0.007062   0.019276   0.026338 (  1.905775)
IP Address             0.004527   0.012176   0.016703 (  3.051192)
fast_fallback: false   0.005546   0.019426   0.024972 (  1.775798)

The following is the result of the run on Ruby 3.3.0.

(on Dual stack environment)

                 user     system      total        real
Ruby 3.3.0   0.007271   0.027410   0.034681 (  0.472510)

(on Single stack environment)

                 user     system      total        real
Ruby 3.3.0   0.005353   0.018898   0.024251 (  1.774535)

Updated by shugo (Shugo Maeda) 3 months ago

shioimm (Misaki Shioi) wrote in #note-10:

Regarding the content mentioned above, I have made the following changes to the commit:
https://github.com/ruby/ruby/pull/9374/commits/461b75830599408feca086f7f6719b8426008802

  • Improved performance in the case where there is only one address family to be name resolved
  • Improved performance in the case where there is only one name-resolved IP address.
  • Added fast_fallback option.

Thank you. The fast_fallback option of Socket.tcp and Socket.tcp_fast_fallback look fine to me.

Updated by shioimm (Misaki Shioi) 2 months ago

As previously posted, I was considering a way to avoid name resolution within a thread when the host is single-stack.
However, it became clear that determining whether the host is single-stack carries a significant cost of its own, and that name resolution then incurs additional cost on top of it when the host is not single-stack.
To avoid this, I was considering caching the results of determining whether a host is single-stack or not, but found that this could cause unexpected bugs when switching networks.
Therefore, I have decided to abandon the determination of whether the host is single-stack and proceed with initiating a thread for name resolution unless an IP address is directly specified.

Note that the fast_fallback option remains available.
This PR was merged today. Thank you very much.
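
For readers who want to use the option, a small sketch follows. The helper name tcp_connect is hypothetical. It assumes that a Ruby containing this change responds to Socket.tcp_fast_fallback, the accessor discussed earlier in this thread, and it falls back to a plain Socket.tcp call on older Rubies where the keyword does not exist.

```ruby
require 'socket'

# Hypothetical helper: connect with or without Happy Eyeballs where the Ruby
# in use supports the fast_fallback keyword, otherwise ignore the setting.
def tcp_connect(host, port, fast_fallback: true)
  if Socket.respond_to?(:tcp_fast_fallback)
    Socket.tcp(host, port, fast_fallback: fast_fallback)
  else
    Socket.tcp(host, port) # older Ruby: only the serial implementation exists
  end
end
```

Calling tcp_connect(host, port, fast_fallback: false) then requests the previous serial behaviour for a single call without changing any process-wide setting.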

Updated by hsbt (Hiroshi SHIBATA) 2 months ago

  • Status changed from Open to Closed

Updated by hsbt (Hiroshi SHIBATA) 28 days ago

  • Related to Feature #17525: Implement Happy Eyeballs Version 2 (RFC8305) in Socket.tcp added

Updated by hsbt (Hiroshi SHIBATA) 28 days ago

  • Related to Feature #15628: init_inetsock_internal should fallback to IPv4 if IPv6 is unreachable added