Feature #4270
Closed
Resolv does not handle UTF8 domain names.
Description
=begin
Resolv.getaddress(es) cannot handle UTF8 domain names:
Steps to reproduce error:
Resolv.getaddress('∞.com')
Expected result:
174.132.17.93
Actual result:
Encoding::CompatibilityError: incompatible character encodings: UTF-8 and ASCII-8BIT
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:757:in `[]='
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:757:in `sender'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:504:in `block in each_resource'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:1000:in `block (3 levels) in resolv'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:998:in `each'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:998:in `block (2 levels) in resolv'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:997:in `each'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:997:in `block in resolv'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:995:in `each'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:995:in `resolv'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:498:in `each_resource'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:391:in `each_address'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:115:in `block in each_address'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:114:in `each'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:114:in `each_address'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:92:in `getaddress'
from /home/hal/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/resolv.rb:43:in `getaddress'
=end
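One plausible reading of the backtrace is that the UTF-8 name ends up inside the outgoing message, and a later `[]=` tries to splice binary bytes into it. The sketch below is not code from resolv.rb; `splice_binary_id` is a hypothetical helper that only reproduces that kind of failure, under the assumption that a packed 16-bit ID is written over the start of the message.

```ruby
# Hypothetical stand-in, not taken from resolv.rb: overwrite the first two
# characters of a message with a packed 16-bit ID (an ASCII-8BIT string).
def splice_binary_id(message, id)
  message = message.dup
  message[0, 2] = [id].pack("n") # "\xAB\xCD" for 0xABCD, binary encoding
  message
end

# An ASCII-only name is compatible with binary data, so this succeeds.
splice_binary_id("xn--59g.com", 0xABCD)

# A name with non-ASCII UTF-8 characters cannot be mixed with
# non-ASCII binary bytes, so Ruby raises.
begin
  splice_binary_id("∞.com", 0xABCD)
rescue Encoding::CompatibilityError => e
  puts "#{e.class}: #{e.message}"
end
```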
Updated by sorah (Sorah Fukumori) about 14 years ago
- Status changed from Open to Rejected
=begin
Non-ASCII (UTF-8) domain names are looked up in DNS in their Punycode form, so you should encode the UTF-8 domain name to a Punycode domain name first.
For example:
Resolv.getaddress('xn--59g.com')
=end
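To illustrate the conversion suggested above, here is a minimal sketch of the Punycode encoder from RFC 3492 in plain Ruby. The module name `PunycodeSketch` and its methods are hypothetical, not part of Resolv or any gem mentioned in this issue. It assumes the input is already lowercase and normalized: the Nameprep step of IDNA (RFC 3491) is skipped entirely, so it is not a full ToASCII implementation.

```ruby
# Hypothetical helper module; a sketch of RFC 3492 encoding only.
module PunycodeSketch
  BASE = 36
  TMIN = 1
  TMAX = 26
  SKEW = 38
  DAMP = 700
  INITIAL_BIAS = 72
  INITIAL_N = 128
  DIGITS = [*"a".."z", *"0".."9"].freeze

  # Bias adaptation function from RFC 3492, section 6.1.
  def self.adapt(delta, num_points, first_time)
    delta = first_time ? delta / DAMP : delta / 2
    delta += delta / num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) / 2
      delta /= BASE - TMIN
      k += BASE
    end
    k + (BASE - TMIN + 1) * delta / (delta + SKEW)
  end

  # Encode one label (no dots) to its Punycode form, without the xn-- prefix.
  def self.encode_label(label)
    code_points = label.codepoints
    basic = code_points.select { |c| c < INITIAL_N }
    output = basic.pack("U*")
    basic_count = basic.size
    handled = basic_count
    output << "-" if basic_count > 0

    n = INITIAL_N
    delta = 0
    bias = INITIAL_BIAS
    while handled < code_points.size
      # Jump to the smallest code point not yet handled.
      m = code_points.select { |c| c >= n }.min
      delta += (m - n) * (handled + 1)
      n = m
      code_points.each do |c|
        delta += 1 if c < n
        next unless c == n
        # Emit delta as a generalized variable-length integer.
        q = delta
        k = BASE
        loop do
          t = if k <= bias then TMIN
              elsif k >= bias + TMAX then TMAX
              else k - bias
              end
          break if q < t
          output << DIGITS[t + (q - t) % (BASE - t)]
          q = (q - t) / (BASE - t)
          k += BASE
        end
        output << DIGITS[q]
        bias = adapt(delta, handled + 1, handled == basic_count)
        delta = 0
        handled += 1
      end
      delta += 1
      n += 1
    end
    output
  end

  # Convert each non-ASCII label of a domain name to its xn-- form.
  def self.to_ascii(domain)
    domain.split(".").map { |label|
      label.ascii_only? ? label : "xn--" + encode_label(label)
    }.join(".")
  end
end

puts PunycodeSketch.to_ascii("∞.com") # prints xn--59g.com
```

The result could then be passed on, for example `Resolv.getaddress(PunycodeSketch.to_ascii('∞.com'))`; that call needs network access and is not run here.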
Updated by postmodern (Hal Brodigan) about 14 years ago
=begin
Charles Nutter (@headius) suggested a way for the code to fail less loudly. https://gist.github.com/775696
If Resolv is going to explicitly not support UTF8 domain names, it should raise a descriptive ArgumentError.
=end
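As a rough illustration of the descriptive error requested above, a caller-side guard might look like the sketch below. The method name `ensure_ascii_hostname!` and the message wording are assumptions made for this example; this is not the content of the linked gist and not something Resolv provides.

```ruby
# Hypothetical guard: reject non-ASCII names with a clear message
# instead of letting an encoding error surface from deep inside a lookup.
def ensure_ascii_hostname!(name)
  return name if name.ascii_only?

  raise ArgumentError,
        "non-ASCII domain name #{name.inspect}; " \
        "encode it as Punycode (xn--...) before resolving"
end

ensure_ascii_hostname!("xn--59g.com") # returns the name unchanged

begin
  ensure_ascii_hostname!("∞.com")
rescue ArgumentError => e
  puts e.message
end
```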
Updated by sorah (Sorah Fukumori) about 14 years ago
=begin
Please submit this again as a feature request.
=end
Updated by nahi (Hiroshi Nakamura) about 14 years ago
- Status changed from Rejected to Open
=begin
Reopening since I moved this to 'Feature'. Isn't it enough?
=end
Updated by sorah (Sorah Fukumori) about 14 years ago
=begin
I forgot that an issue can be moved to Feature in Redmine.
=end
Updated by usa (Usaku NAKAMURA) about 14 years ago
- Category set to lib
- Status changed from Open to Assigned
- Assignee set to akr (Akira Tanaka)
Updated by mame (Yusuke Endoh) about 12 years ago
- Description updated (diff)
- Target version set to 2.6
Updated by steakknife (Barry Allard) almost 12 years ago
We've been using a monkey patch based on GNU libidn's functions for RFCs 3490, 3491 and 3492.
Here's an extract of the critical functions (toASCII and toUNICODE); please feel free to hack, fork, or comment on it: https://gist.github.com/5328637 (unit tests included).
=> Resolv::Unicode.to_ascii('一流大學.中国')
"xn--4gqt5y3xbky5a.xn--fiqs8s"
=>
PING xn--4gqt5y3xbky5a.xn--fiqs8s (158.125.1.208): 56 data bytes
64 bytes from 158.125.1.208: icmp_seq=0 ttl=46 time=168.616 ms
64 bytes from 158.125.1.208: icmp_seq=1 ttl=46 time=163.608 ms
Updated by akr (Akira Tanaka) almost 12 years ago
It is not appropriate for a bundled library such as resolv.rb to depend on an external library.
Updated by steakknife (Barry Allard) almost 12 years ago
That was a rough suggestion that works right now; it's definitely not perfect. It makes sense for someone to create an autotools patch that detects libidn, sets up the lib and include paths, and refactors the Ruby glue code to eliminate the idn-ruby dependency (not the older idn gem). The code for libidn is very complicated (feel free to read the RFCs if you like), so it is better to link against it than to reimplement it in Ruby, because it's a known quantity.
I think it would first need a discussion to decide how and when to perform Unicode conversions with minimal breakage, while staying DRY and predictable.
Updated by steakknife (Barry Allard) almost 12 years ago
For now, I've rolled up some code into a gem: resolv-idn
Updated by akr (Akira Tanaka) almost 12 years ago
- Status changed from Assigned to Closed
It seems this feature is now provided by a gem,
so I am closing this issue.