Feature #5064


HTTP user-agent class

Added by drbrain (Eric Hodel) almost 14 years ago. Updated over 7 years ago.

Status: Assigned
Assignee: matz (Yukihiro Matsumoto)
Target version:
[ruby-core:38295]

Description

Currently there are some problems with Net::HTTP:

Too many ways to use (user confusion)
No automatic support for HTTPS (must conditionally set use_ssl)
No automatic support for HTTPS peer verification (must be manually set)
Single-connection oriented
No support for redirect-following
No support for HTTP/1.1 persistent connection retry (RFC 2616 8.1.4)
No automatic support for HTTP proxies
No automatic support for authentication (must be set per-request)

Additionally the style of the API of Net::HTTP makes it difficult to take advantage of persistent connections. The user has to store the created connection and manually handle restarting the connection if it has timed out or is closed by the server.
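To illustrate the manual setup described above, here is a minimal sketch of what a caller does by hand with Net::HTTP today. `build_connection` is a hypothetical helper name, not part of Net::HTTP; the connection is only configured, never opened.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: builds (but does not start) a Net::HTTP connection,
# doing by hand what the proposed agent would do automatically.
def build_connection(uri)
  http = Net::HTTP.new(uri.host, uri.port)
  if uri.scheme == 'https'
    http.use_ssl = true                           # enabled conditionally by the caller
    http.verify_mode = OpenSSL::SSL::VERIFY_PEER  # peer verification set explicitly
  end
  http
end

secure = build_connection(URI('https://secure.example'))
plain  = build_connection(URI('http://example/1'))
```

The caller would still have to keep `secure` and `plain` around and restart them after a timeout or server-side close, which is the bookkeeping the proposal wants to move into the agent.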

RFC 2616 8.1.1 has a large section explaining the benefits of persistent connections. Net::HTTP implements persistent connections, but with more work they could be made easier for users to take advantage of.

I've implemented support for many of these additional features of Net::HTTP in various projects and I'd like Ruby to have the features required to make a useful HTTP user-agent built-in.

The agent should have the following responsibilities:

Make or reuse connections based on [host, port, SSL enabled]
Automatically enable SSL for https URIs
Automatically enable SSL peer verification for SSL connections
Limit number of persistent connections per host
Follow redirects
Retry when a persistent connection fails
Automatically configure proxies
Automatically use authentication
Callbacks for various events, such as connect
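The first three responsibilities could be sketched roughly as follows. `ConnectionCache` and `connection_for` are hypothetical names used only for illustration; this is not the proposed Net::HTTP::Agent implementation, and it leaves out per-host limits, retries, proxies and thread safety.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical sketch: reuses connections keyed by [host, port, SSL enabled].
class ConnectionCache
  def initialize
    @connections = {}
  end

  # Returns the cached connection for the URI's key, creating (but not
  # starting) one on first use. SSL and peer verification are enabled
  # automatically for https URIs.
  def connection_for(uri)
    ssl = uri.scheme == 'https'
    key = [uri.host, uri.port, ssl]
    @connections[key] ||= Net::HTTP.new(uri.host, uri.port).tap do |http|
      if ssl
        http.use_ssl = true
        http.verify_mode = OpenSSL::SSL::VERIFY_PEER
      end
    end
  end

  def size
    @connections.size
  end
end

cache = ConnectionCache.new
a = cache.connection_for(URI('http://example/1'))
b = cache.connection_for(URI('http://example/2'))        # same host and port: reused
c = cache.connection_for(URI('https://secure.example'))  # different key: new connection
```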

The agent may add the following responsibilities:

Default headers for all requests
HTTP cookies
Tracking history
Logging

I don't think any of these features are critical as they are implementable by users via callbacks.

The agent would have the following configurable items:

Number of connections per host
Depth of redirects followed
Persistent connection retries (none, HTTP/1.1 (default), always)
Proxy host, port, user, password
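As an illustration of the "depth of redirects followed" setting, here is a minimal sketch of redirect following with a hop limit. `follow_redirects`, `TooManyRedirects` and the default of 5 are assumptions made for this example; the block stands in for whatever actually performs the request, so no network access is needed.

```ruby
require 'net/http'
require 'uri'

# Hypothetical error raised when the configured redirect depth is exceeded.
class TooManyRedirects < StandardError; end

# Follows redirects for up to `limit` hops. The block performs the request
# for a URI and returns a Net::HTTPResponse.
def follow_redirects(uri, limit: 5)
  hops = 0
  loop do
    response = yield uri
    return response unless response.is_a?(Net::HTTPRedirection)
    location = response['location']
    return response if location.nil?   # e.g. 304 Not Modified has no Location
    raise TooManyRedirects, "more than #{limit} redirects" if hops >= limit
    uri = URI.join(uri.to_s, location) # Location may be relative
    hops += 1
  end
end

# Stubbed responses standing in for a real server.
responses = {
  'http://example/1' => Net::HTTPFound.new('1.1', '302', 'Found').tap { |r| r['location'] = '/2' },
  'http://example/2' => Net::HTTPOK.new('1.1', '200', 'OK'),
}

visited = []
final = follow_redirects(URI('http://example/1'), limit: 3) do |u|
  visited << u.to_s
  responses.fetch(u.to_s)
end
```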

I think the class should be called Net::HTTP::Agent.

Basic use would look something like this:

uris = [
  URI('http://example/1'),
  URI('http://example/2'),
  URI('https://secure.example'),
]

agent = Net::HTTP::Agent.new

uris.map do |uri|
  agent.get uri # Returns Net::HTTPResponse
end

For special requests a Net::HTTPRequest could be constructed:

req = Net::HTTP::Get.new uri.request_uri

# do something special with req

agent.request req

The agent should support GET, POST, etc. directly through API methods. I think the API should look something like this:

def get uri_or_string, query = nil, headers = nil
  # Same for other requests with no body
  # query may be a Hash or String
  # How query param vs query string in URI is used is undecided
end

def post uri_or_string, data, headers = nil
  # same for other requests with a body
  # data may be a String, IO or Hash
  # How data format is chosen is undecided
end
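Since the handling of the query parameter is undecided, the following is only one possible policy, sketched for discussion: a Hash is form-encoded and appended to any query string already present in the URI. `build_request_uri` is a hypothetical helper, not part of the proposed API.

```ruby
require 'uri'

# Hypothetical helper: merges an optional query (Hash or String) into a
# URI or URI string. Appending to an existing query string is an assumed
# policy; the proposal leaves this open.
def build_request_uri(uri_or_string, query = nil)
  uri = uri_or_string.is_a?(URI::Generic) ? uri_or_string.dup : URI(uri_or_string)
  return uri if query.nil?

  extra = query.is_a?(Hash) ? URI.encode_www_form(query) : query.to_s
  parts = [uri.query, extra].compact.reject(&:empty?)
  uri.query = parts.join('&') unless parts.empty?
  uri
end

from_hash   = build_request_uri('http://example/1', { 'q' => 'ruby', 'page' => 2 })
from_string = build_request_uri(URI('http://example/1?a=1'), 'b=2')
```

Other policies (replacing the existing query, or raising when both are given) would be equally easy to implement; the choice mostly affects how surprising the method is to callers.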

SSL options, proxy options, timeouts and similar options should exist on Net::HTTP::Agent and be set on new connections as they are made.

I've implemented most of these features in mechanize as Mechanize::HTTP::Agent. The Agent class in mechanize is bigger than necessary and would need to be cut down for inclusion in Ruby as Net::HTTP::Agent:

https://github.com/tenderlove/mechanize/blob/master/lib/mechanize/http/agent.rb

Mechanize depends on net-http-persistent to provide HTTP/1.1 retry support and connection management:

https://github.com/drbrain/net-http-persistent/blob/master/lib/net/http/persistent.rb

Portions of net-http-persistent should become patches to Net::HTTP, for example #idempotent?, #can_retry?, #reset and portions of #request. Other parts (connection management) should be moved to Net::HTTP::Agent.
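To make the retry idea concrete, here is a minimal sketch of retrying only idempotent requests after a connection failure, in the spirit of RFC 2616 8.1.4. It is not the net-http-persistent code: `request_with_retry`, the `retries` parameter and the list of rescued errors are assumptions for this example, and the block stands in for sending the request over a connection.

```ruby
require 'net/http'

# Methods RFC 2616 9.1.2 treats as idempotent.
IDEMPOTENT_METHODS = %w[GET HEAD PUT DELETE OPTIONS TRACE].freeze

def idempotent?(req)
  IDEMPOTENT_METHODS.include?(req.method)
end

# Hypothetical helper: yields the request to a block that sends it, and
# retries up to `retries` times if the connection fails and the request
# is idempotent. Non-idempotent requests are never retried.
def request_with_retry(req, retries: 1)
  attempts = 0
  begin
    attempts += 1
    yield req
  rescue IOError, Errno::ECONNRESET, Errno::EPIPE
    retry if idempotent?(req) && attempts <= retries
    raise
  end
end

# Simulate a persistent connection that fails once, then succeeds.
calls = 0
result = request_with_retry(Net::HTTP::Get.new('/1')) do |_req|
  calls += 1
  raise Errno::ECONNRESET if calls == 1
  :ok
end
```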

net-http-persistent provides a separate connection list per thread. I would like Net::HTTP::Agent to be multi-thread friendly but implementing this in another way would be fine.
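One way a separate connection list per thread can be kept is sketched below. This is an illustration rather than the net-http-persistent implementation; `PerThreadConnections` is a hypothetical name, and connections are created but never started.

```ruby
require 'net/http'
require 'uri'

# Hypothetical sketch: each thread gets its own connection list, so no
# locking is needed around connection use.
class PerThreadConnections
  def initialize
    # Unique key so several instances do not share storage.
    @key = :"per_thread_connections_#{object_id}"
  end

  def connection_for(uri)
    # Thread.current[] storage is local to the current thread (strictly,
    # to the current fiber of that thread).
    connections = (Thread.current[@key] ||= {})
    ssl = uri.scheme == 'https'
    connections[[uri.host, uri.port, ssl]] ||=
      Net::HTTP.new(uri.host, uri.port).tap { |http| http.use_ssl = ssl }
  end
end

pool = PerThreadConnections.new
uri  = URI('http://example/1')

main_first  = pool.connection_for(uri)
main_second = pool.connection_for(uri)
other       = Thread.new { pool.connection_for(uri) }.value
```

The trade-off is that the number of open connections grows with the number of threads; a shared, locked pool would be the alternative way to be multi-thread friendly.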

As an addendum, open-uri and mechanize should be written to take advantage of Net::HTTP::Agent in order to guide a useful implementation.

Related issues 2 (1 open — 1 closed)

Related to Ruby - Feature #5461: Add pipelining to Net::HTTP | Assigned | naruse (Yui NARUSE)
Related to Ruby - Feature #6482: Add URI requested to Net::HTTP request and response objects | Closed | naruse (Yui NARUSE) | 05/23/2012


Updated by naruse (Yui NARUSE) almost 14 years ago

Updated by naruse (Yui NARUSE) almost 14 years ago

Updated by drbrain (Eric Hodel) almost 14 years ago

Updated by jrochkind (jonathan rochkind) almost 14 years ago

Updated by drbrain (Eric Hodel) almost 14 years ago

Updated by akr (Akira Tanaka) almost 14 years ago

Ruby has invoked getaddrinfo() without the GVL since 1.9.2.
This means that, even without resolv/replace, DNS lookup doesn't block other
threads if the platform has getaddrinfo().

Updated by steveklabnik (Steve Klabnik) almost 14 years ago

Updated by normalperson (Eric Wong) almost 14 years ago

Updated by mame (Yusuke Endoh) over 13 years ago

Updated by duerst (Martin Dürst) over 13 years ago

Updated by drbrain (Eric Hodel) over 13 years ago

Updated by mame (Yusuke Endoh) over 12 years ago

Updated by naruse (Yui NARUSE) over 7 years ago
