Resolv::DNS::Message.decode hangs after detecting truncation in UDP messages
ruby-core:32407 added a TCP fallback to fetch_resource: when Resolv::DNS::Message.decode returns a reply with RCode::NoError and tc set to 1, the query is retried with a TCP requester.
Unfortunately, Resolv::DNS::Message.decode still tries to unpack every answer implied by the message's answer count, regardless of truncation. On a truncated message this raises an exception, which is handled as though the response to the request had never been seen, so the request is silently retried (see begin, ensure).
To avoid this issue I added a return to Message.decode as soon as truncation is detected.
To patch this in a portable fashion I use the attached monkey patch, which lets the truncation propagate back to fetch_resource so that it can proceed with the TCP-based query.
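For readers without access to the attachment, here is a minimal sketch of what such a guard could look like. It is not the attached patch. The module name TruncationGuard is hypothetical, and the approach (prepending onto the singleton class and reading the header with String#unpack) is an assumption chosen for illustration. It returns a header-only message when the TC bit is set and otherwise defers to the stock decode.

```ruby
require 'resolv'

# Hypothetical guard (name and approach are illustrative assumptions):
# stop decoding as soon as the header reports truncation.
module TruncationGuard
  def decode(m)
    # First two 16-bit big-endian words of the header: ID and flags.
    id, flags = m.unpack('nn')
    if flags && ((flags >> 9) & 1) == 1
      msg = Resolv::DNS::Message.new(id)
      msg.qr     = (flags >> 15) & 1
      msg.opcode = (flags >> 11) & 15
      msg.aa     = (flags >> 10) & 1
      msg.tc     = 1
      msg.rd     = (flags >> 8) & 1
      msg.ra     = (flags >> 7) & 1
      msg.rcode  = flags & 15
      # Skip the question/answer sections entirely; the caller only needs
      # id, rcode and tc to decide on the TCP retry.
      return msg
    end
    super # not truncated: fall through to the original decoder
  end
end

Resolv::DNS::Message.singleton_class.prepend(TruncationGuard)

# Synthetic header-only message: QR=1, TC=1, ANCOUNT=50, no record data.
truncated = [0x1234, (1 << 15) | (1 << 9), 1, 50, 0, 0].pack('n6')
reply = Resolv::DNS::Message.decode(truncated)
puts "tc=#{reply.tc} answers=#{reply.answer.size}"
```

On a Ruby whose decode already returns at the truncation check, a guard like this would simply be redundant.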
Updated by iamasmith (Andrew Smith) over 2 years ago
After further consideration I looked at RFC 1035 with regard to how record counts are handled in truncated messages. It describes ANCOUNT as the number of answers in the answer section, not necessarily the number of answers that the DNS server knows about.
I'm testing against SkyDNS, which in my test reports 50 in the ANCOUNT field (the number of records I added) but does correctly flag the message as truncated.
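To make the two header fields concrete, the sketch below reads ANCOUNT and the TC bit from the 12-byte header defined in RFC 1035 section 4.1.1. The bytes are a synthetic header built for illustration, not a captured SkyDNS response, and the DnsHeader struct and parse_dns_header helper are hypothetical names.

```ruby
# RFC 1035 section 4.1.1: the header is six 16-bit big-endian words
# (ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT).
DnsHeader = Struct.new(:id, :flags, :qdcount, :ancount, :nscount, :arcount) do
  # TC is bit 9 of the flags word, counting from the least significant bit.
  def truncated?
    ((flags >> 9) & 1) == 1
  end
end

# Hypothetical helper: parse only the fixed-size header of a raw message.
def parse_dns_header(raw)
  raise ArgumentError, 'DNS header needs 12 bytes' if raw.bytesize < 12
  DnsHeader.new(*raw.unpack('n6'))
end

# Synthetic example: a response (QR=1) with TC=1, one question, ANCOUNT=50.
raw = [0x1234, (1 << 15) | (1 << 9), 1, 50, 0, 0].pack('n6')
header = parse_dns_header(raw)
puts "ancount=#{header.ancount} truncated=#{header.truncated?}"
```

A decoder that loops ANCOUNT times over a message like this would run past the end of the data unless it checks truncated? first.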
If SkyDNS is at fault, the change I proposed for Ruby is perhaps less clearly correct, and it would be appropriate for SkyDNS to be fixed. However, since Ruby implements the suggested fallback to TCP, the records provided in the UDP response are not available to the caller anyway. Handling excessive counts in ANCOUNT on truncated messages therefore still seems useful for the growing number of environments using SkyDNS.
I'll compare some other servers and, if appropriate, raise a separate bug with the SkyDNS maintainers.