Diffstat (limited to 'doc/draft/draft-ietf-dnsop-respsize-06.txt')
-rw-r--r-- | doc/draft/draft-ietf-dnsop-respsize-06.txt | 640 |
1 files changed, 640 insertions, 0 deletions
diff --git a/doc/draft/draft-ietf-dnsop-respsize-06.txt b/doc/draft/draft-ietf-dnsop-respsize-06.txt new file mode 100644 index 0000000..b041925 --- /dev/null +++ b/doc/draft/draft-ietf-dnsop-respsize-06.txt @@ -0,0 +1,640 @@ + + + + + + + DNSOP Working Group Paul Vixie, ISC + INTERNET-DRAFT Akira Kato, WIDE + <draft-ietf-dnsop-respsize-06.txt> August 2006 + + DNS Referral Response Size Issues + + Status of this Memo + By submitting this Internet-Draft, each author represents that any + applicable patent or other IPR claims of which he or she is aware + have been or will be disclosed, and any of which he or she becomes + aware will be disclosed, in accordance with Section 6 of BCP 79. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF), its areas, and its working groups. Note that + other groups may also distribute working documents as Internet- + Drafts. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + The list of current Internet-Drafts can be accessed at + http://www.ietf.org/ietf/1id-abstracts.txt + + The list of Internet-Draft Shadow Directories can be accessed at + http://www.ietf.org/shadow.html. + + Copyright Notice + + Copyright (C) The Internet Society (2006). All Rights Reserved. + + + + + Abstract + + With a mandated default minimum maximum message size of 512 octets, + the DNS protocol presents some special problems for zones wishing to + expose a moderate or high number of authority servers (NS RRs). This + document explains the operational issues caused by, or related to + this response size limit, and suggests ways to optimize the use of + this limited space. Guidance is offered to DNS server implementors + and to DNS zone operators. 
+ + + + + Expires January 2007 [Page 1] + + INTERNET-DRAFT August 2006 RESPSIZE + + + 1 - Introduction and Overview + + 1.1. The DNS standard (see [RFC1035 4.2.1]) limits message size to 512 + octets. Even though this limitation was due to the required minimum IP + reassembly limit for IPv4, it became a hard DNS protocol limit and is + not implicitly relaxed by changes in transport, for example to IPv6. + + 1.2. The EDNS0 protocol extension (see [RFC2671 2.3, 4.5]) permits + larger responses by mutual agreement of the requester and responder. + The 512 octet message size limit will remain in practical effect until + there is widespread deployment of EDNS0 in DNS resolvers on the + Internet. + + 1.3. Since DNS responses include a copy of the request, the space + available for response data is somewhat less than the full 512 octets. + Negative responses are quite small, but for positive and delegation + responses, every octet must be carefully and sparingly allocated. This + document specifically addresses delegation response sizes. + + 2 - Delegation Details + + 2.1. RELEVANT PROTOCOL ELEMENTS + + 2.1.1. A delegation response will include the following elements: + + Header Section: fixed length (12 octets) + Question Section: original query (name, class, type) + Answer Section: empty, or a CNAME/DNAME chain + Authority Section: NS RRset (nameserver names) + Additional Section: A and AAAA RRsets (nameserver addresses) + + 2.1.2. If the total response size exceeds 512 octets, and if the data + that does not fit was "required", then the TC bit will be set + (indicating truncation). This will usually cause the requester to retry + using TCP, depending on what information was desired and what + information was omitted. For example, truncation in the authority + section is of no interest to a stub resolver who only plans to consume + the answer section. If a retry using TCP is needed, the total cost of + the transaction is much higher. 
See [RFC1123 6.1.3.2] for details on
+ the requirement that UDP be attempted before falling back to TCP.
+
+ 2.1.3. RRsets are never sent partially unless the TC bit is set to
+ indicate truncation. When the TC bit is set, the final apparent RRset in
+ the final non-empty section must be considered "possibly damaged" (see
+ [RFC1035 6.2], [RFC2181 9]).
+
+ 2.1.4. With or without truncation, the glue present in the additional
+ data section should be considered "possibly incomplete", and requesters
+ should be prepared to re-query for any damaged or missing RRsets. Note
+ that truncation of the additional data section might not be signalled
+ via the TC bit since additional data is often optional (see discussion
+ in [RFC4472 B]).
+
+ 2.1.5. DNS label compression allows a domain name to be instantiated
+ only once per DNS message, and then referenced with a two-octet
+ "pointer" from other locations in that same DNS message (see [RFC1035
+ 4.1.4]). If all nameserver names in a message share a common parent
+ (for example, all ending in ".ROOT-SERVERS.NET"), then more space will
+ be available for incompressible data (such as nameserver addresses).
+
+ 2.1.6. The query name can be as long as 255 octets of network data. In
+ this worst-case scenario, the question section will be 259 octets in
+ size, which would leave only 241 octets for the authority and additional
+ sections (after deducting 12 octets for the fixed-length header).
+
+ 2.2. ADVICE TO ZONE OWNERS
+
+ 2.2.1. Average and maximum question section sizes can be predicted by
+ the zone owner, since they will know what names actually exist, and can
+ measure which ones are queried for most often. Note that if the zone
+ contains any wildcards, it is possible for maximum length queries to
+ require positive responses, but that it is reasonable to expect
+ truncation and TCP retry in that case.
For cost and performance + reasons, the majority of requests should be satisfied without truncation + or TCP retry. + + 2.2.2. Some queries to non-existing names can be large, but this is not + a problem because negative responses need not contain any answer, + authority or additional records. See [RFC2308 2.1] for more information + about the format of negative responses. + + 2.2.3. The minimum useful number of name servers is two, for redundancy + (see [RFC1034 4.1]). A zone's name servers should be reachable by all + IP transport protocols (e.g., IPv4 and IPv6) in common use. + + 2.2.4. The best case is no truncation at all. This is because many + requesters will retry using TCP immediately, or will automatically re- + query for RRsets that are possibly truncated, without considering + whether the omitted data was actually necessary. + + + + + + Expires January 2007 [Page 3] + + INTERNET-DRAFT August 2006 RESPSIZE + + + 2.3. ADVICE TO SERVER IMPLEMENTORS + + 2.3.1. In case of multi-homed name servers, it is advantageous to + include an address record from each of several name servers before + including several address records for any one name server. If address + records for more than one transport (for example, A and AAAA) are + available, then it is advantageous to include records of both types + early on, before the message is full. + + 2.3.2. Each added NS RR for a zone will add 12 fixed octets (name, type, + class, ttl, and rdlen) plus 2 to 255 variable octets (for the NSDNAME). + Each A RR will require 16 octets, and each AAAA RR will require 28 + octets. + + 2.3.3. While DNS distinguishes between necessary and optional resource + records, this distinction is according to protocol elements necessary to + signify facts, and takes no official notice of protocol content + necessary to ensure correct operation. 
For example, a nameserver name + that is in or below the zone cut being described by a delegation is + "necessary content," since there is no way to reach that zone unless the + parent zone's delegation includes "glue records" describing that name + server's addresses. + + 2.3.4. It is also necessary to distinguish between "explicit truncation" + where a message could not contain enough records to convey its intended + meaning, and so the TC bit has been set, and "silent truncation", where + the message was not large enough to contain some records which were "not + required", and so the TC bit was not set. + + 2.3.5. A delegation response should prioritize glue records as follows. + + first + All glue RRsets for one name server whose name is in or below the + zone being delegated, or which has multiple address RRsets (currently + A and AAAA), or preferably both; + + second + Alternate between adding all glue RRsets for any name servers whose + names are in or below the zone being delegated, and all glue RRsets + for any name servers who have multiple address RRsets (currently A + and AAAA); + + thence + All other glue RRsets, in any order. + + + + + Expires January 2007 [Page 4] + + INTERNET-DRAFT August 2006 RESPSIZE + + + Whenever there are multiple candidates for a position in this priority + scheme, one should be chosen on a round-robin or fully random basis. + + The goal of this priority scheme is to offer "necessary" glue first, + avoiding silent truncation for this glue if possible. + + 2.3.6. If any "necessary content" is silently truncated, then it is + advisable that the TC bit be set in order to force a TCP retry, rather + than have the zone be unreachable. 
Note that a parent server's proper + response to a query for in-child glue or below-child glue is a referral + rather than an answer, and that this referral MUST be able to contain + the in-child or below-child glue, and that in outlying cases, only EDNS + or TCP will be large enough to contain that data. + + 3 - Analysis + + 3.1. An instrumented protocol trace of a best case delegation response + follows. Note that 13 servers are named, and 13 addresses are given. + This query was artificially designed to exactly reach the 512 octet + limit. + + ;; flags: qr rd; QUERY: 1, ANS: 0, AUTH: 13, ADDIT: 13 + ;; QUERY SECTION: + ;; [23456789.123456789.123456789.\ + 123456789.123456789.123456789.com A IN] ;; @80 + + ;; AUTHORITY SECTION: + com. 86400 NS E.GTLD-SERVERS.NET. ;; @112 + com. 86400 NS F.GTLD-SERVERS.NET. ;; @128 + com. 86400 NS G.GTLD-SERVERS.NET. ;; @144 + com. 86400 NS H.GTLD-SERVERS.NET. ;; @160 + com. 86400 NS I.GTLD-SERVERS.NET. ;; @176 + com. 86400 NS J.GTLD-SERVERS.NET. ;; @192 + com. 86400 NS K.GTLD-SERVERS.NET. ;; @208 + com. 86400 NS L.GTLD-SERVERS.NET. ;; @224 + com. 86400 NS M.GTLD-SERVERS.NET. ;; @240 + com. 86400 NS A.GTLD-SERVERS.NET. ;; @256 + com. 86400 NS B.GTLD-SERVERS.NET. ;; @272 + com. 86400 NS C.GTLD-SERVERS.NET. ;; @288 + com. 86400 NS D.GTLD-SERVERS.NET. ;; @304 + + + + + + + + + Expires January 2007 [Page 5] + + INTERNET-DRAFT August 2006 RESPSIZE + + + ;; ADDITIONAL SECTION: + A.GTLD-SERVERS.NET. 86400 A 192.5.6.30 ;; @320 + B.GTLD-SERVERS.NET. 86400 A 192.33.14.30 ;; @336 + C.GTLD-SERVERS.NET. 86400 A 192.26.92.30 ;; @352 + D.GTLD-SERVERS.NET. 86400 A 192.31.80.30 ;; @368 + E.GTLD-SERVERS.NET. 86400 A 192.12.94.30 ;; @384 + F.GTLD-SERVERS.NET. 86400 A 192.35.51.30 ;; @400 + G.GTLD-SERVERS.NET. 86400 A 192.42.93.30 ;; @416 + H.GTLD-SERVERS.NET. 86400 A 192.54.112.30 ;; @432 + I.GTLD-SERVERS.NET. 86400 A 192.43.172.30 ;; @448 + J.GTLD-SERVERS.NET. 86400 A 192.48.79.30 ;; @464 + K.GTLD-SERVERS.NET. 
86400 A 192.52.178.30 ;; @480
+ L.GTLD-SERVERS.NET. 86400 A 192.41.162.30 ;; @496
+ M.GTLD-SERVERS.NET. 86400 A 192.55.83.30 ;; @512
+
+ ;; MSG SIZE sent: 80 rcvd: 512
+
+ 3.2. For longer query names, the number of address records supplied will
+ be lower. Furthermore, it is only by using a common parent name (which
+ is GTLD-SERVERS.NET in this example) that all 13 addresses are able to
+ fit, due to the use of DNS compression pointers in the last 12
+ occurrences of the parent domain name. The following output from a
+ response simulator demonstrates these properties.
+
+ % perl respsize.pl a.dns.br b.dns.br c.dns.br d.dns.br
+ a.dns.br requires 10 bytes
+ b.dns.br requires 4 bytes
+ c.dns.br requires 4 bytes
+ d.dns.br requires 4 bytes
+ # of NS: 4
+ For maximum size query (255 byte):
+  only A is considered: # of A is 4 (green)
+  A and AAAA are considered: # of A+AAAA is 3 (yellow)
+  preferred-glue A is assumed: # of A is 4, # of AAAA is 3 (yellow)
+ For average size query (64 byte):
+  only A is considered: # of A is 4 (green)
+  A and AAAA are considered: # of A+AAAA is 4 (green)
+  preferred-glue A is assumed: # of A is 4, # of AAAA is 4 (green)
+
+ % perl respsize.pl ns-ext.isc.org ns.psg.com ns.ripe.net ns.eu.int
+ ns-ext.isc.org requires 16 bytes
+ ns.psg.com requires 12 bytes
+ ns.ripe.net requires 13 bytes
+ ns.eu.int requires 11 bytes
+ # of NS: 4
+ For maximum size query (255 byte):
+  only A is considered: # of A is 4 (green)
+  A and AAAA are considered: # of A+AAAA is 3 (yellow)
+  preferred-glue A is assumed: # of A is 4, # of AAAA is 2 (yellow)
+ For average size query (64 byte):
+  only A is considered: # of A is 4 (green)
+  A and AAAA are considered: # of A+AAAA is 4 (green)
+  preferred-glue A is assumed: # of A is 4, # of AAAA is 4 (green)
+
+ (Note: The response simulator program is shown in Section 5.)
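As a cross-check on the per-name sizes printed above, the suffix-compression rule can be sketched in a few lines of Python. This is an illustrative port written for this note, not the draft's Perl program; the names `ns_name_len`, `ns_sizes`, and the `seen` set are assumptions introduced here. A name with no previously seen suffix is charged its full wire length, and a name whose suffix has already appeared is charged only its unshared leading labels plus a two-octet pointer.

```python
PTR = 2  # size of a DNS compression pointer, in octets


def ns_name_len(name, seen):
    """Octets needed for `name` in NS RDATA, given suffixes already in the message.

    `seen` is a set of lower-case suffixes registered by earlier names
    (hypothetical helper state, playing the role of the simulator's name table).
    """
    name = name.lower().rstrip(".")
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in seen:
            # Unshared leading labels (with their length octets) plus a pointer.
            return len(name) - len(suffix) + PTR
        seen.add(suffix)
    # Nothing to point at: one length octet per label plus the root octet.
    return len(name) + 2


def ns_sizes(names):
    """Sizes for a list of NS names, processed in order within one message."""
    seen = set()
    return [ns_name_len(n, seen) for n in names]


print(ns_sizes(["a.dns.br", "b.dns.br", "c.dns.br", "d.dns.br"]))
print(ns_sizes(["ns-ext.isc.org", "ns.psg.com", "ns.ripe.net", "ns.eu.int"]))
```

Under these assumptions the two lists come out as 10, 4, 4, 4 and 16, 12, 13, 11, matching the "requires N bytes" lines above: four names under a shared parent cost 22 octets of NS names, against 52 octets for four unrelated names.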
+
+ Here we use the term "green" if all address records could fit, or
+ "yellow" if two or more could fit, or "orange" if only one could fit, or
+ "red" if no address record could fit. It's clear that without a common
+ parent for nameserver names, much space would be lost. For these
+ examples we use an average/common name size of 15 octets, befitting our
+ assumption of GTLD-SERVERS.NET as our common parent name.
+
+ We're assuming a medium query name size of 64 octets since that is the
+ typical size seen in trace data at the time of this writing. If
+ Internationalized Domain Names (IDN) or any other technology that
+ results in larger query names is deployed significantly in advance of
+ EDNS, then new measurements and new estimates will have to be made.
+
+ 4 - Conclusions
+
+ 4.1. The current practice of giving all nameserver names a common parent
+ (such as GTLD-SERVERS.NET or ROOT-SERVERS.NET) saves space in DNS
+ responses and allows for more nameservers to be enumerated than would
+ otherwise be possible, since the common parent domain name only appears
+ once in a DNS message and is referred to via "compression pointers"
+ thereafter.
+
+ 4.2. If all nameserver names for a zone share a common parent, then it
+ is operationally advisable to make all servers for the zone thus served
+ also be authoritative for the zone of that common parent. For example,
+ the root name servers (?.ROOT-SERVERS.NET) can answer authoritatively
+ for the ROOT-SERVERS.NET zone. This is to ensure that the zone's servers
+ always have the zone's nameservers' glue available when delegating, and
+ will be able to respond with answers rather than referrals if a
+ requester who wants that glue comes back asking for it. In this case
+ the name server will likely be a "stealth server" -- authoritative but
+ unadvertised in the glue zone's NS RRset. See [RFC1996 2] for more
+ information about stealth servers.
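The space saving described in 4.1 can be checked against the trace in 3.1 with plain octet arithmetic. The Python sketch below is an illustration added here, not part of the draft's tooling. The constant and function names are assumptions, the per-RR sizes follow 2.3.2, and the query name is taken to begin with the label "23456789" (reading the leading "[" in the trace as a bracket rather than part of the name). The second call reuses the same thirteen names but charges each at its full uncompressed length, as a stand-in for thirteen unrelated names of similar length.

```python
HEADER = 12      # fixed DNS header
RR_FIXED = 12    # compressed owner name (2) + type, class, ttl, rdlen (10)
PTR = 2          # compression pointer
A_RR = 16        # RR_FIXED + 4 octets of IPv4 address


def wire_len(name):
    """Uncompressed wire length: a length octet per label plus the root octet."""
    labels = name.rstrip(".").split(".")
    return sum(1 + len(label) for label in labels) + 1


def referral_size(qname, ns_names, shared_parent):
    """Size of a referral carrying one A record per name server.

    If `shared_parent` is true, every NS name after the first is assumed to be
    a single unshared label followed by a pointer to the common parent.
    """
    size = HEADER + wire_len(qname) + 4          # question: name + type + class
    for i, ns in enumerate(ns_names):
        if shared_parent and i > 0:
            first_label = ns.split(".")[0]
            size += RR_FIXED + 1 + len(first_label) + PTR
        else:
            size += RR_FIXED + wire_len(ns)
    return size + A_RR * len(ns_names)


qname = "23456789." + "123456789." * 5 + "com"
servers = [f"{c}.GTLD-SERVERS.NET" for c in "EFGHIJKLMABCD"]
print(referral_size(qname, servers, shared_parent=True))
print(referral_size(qname, servers, shared_parent=False))
```

With compression the total is 512 octets, agreeing with the "rcvd: 512" line of the trace. Without it the same referral would need 704 octets, so it could not carry all thirteen addresses in a 512-octet message.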
+
+ 4.3. Thirteen (13) is the effective maximum number of nameserver names
+ usable in traditional (non-extended) DNS, assuming a common parent domain
+ name, and given that implicit referral response truncation is
+ undesirable in the average case.
+
+ 4.4. Multi-homing of name servers within a protocol family is
+ inadvisable since the necessary glue RRsets (A or AAAA) are atomically
+ indivisible, and will be larger than a single resource record. Larger
+ RRsets are more likely to lead to or encounter truncation.
+
+ 4.5. Multi-homing of name servers across protocol families is less
+ likely to lead to or encounter truncation, partly because multiprotocol
+ clients are more likely to speak EDNS which can use a larger response
+ size limit, and partly because the resource records (A and AAAA) are in
+ different RRsets and are therefore divisible from each other.
+
+ 4.6. Name server names which are at or below the zone they serve are
+ more sensitive to referral response truncation, and glue records for
+ them should be considered "less optional" than other glue records, in
+ the assembly of referral responses.
+
+ 4.7. If a zone is served by thirteen (13) name servers having a common
+ parent name (such as ?.ROOT-SERVERS.NET) and each such name server has a
+ single address record in some protocol family (e.g., an A RR), then all
+ thirteen name servers or any subset thereof could multi-home in a second
+ protocol family by adding a second address record (e.g., an AAAA RR)
+ without reducing the reachability of the zone thus served.
+
+ 5 - Source Code
+
+ #!/usr/bin/perl
+ #
+ # SYNOPSIS
+ #    respsize.pl [ -z zone ] fqdn_ns1 fqdn_ns2 ...
+ #    if all queries are assumed to have the same zone suffix,
+ #    such as "jp" in JP TLD servers, specify it with the -z option
+ #
+ use strict;
+ use Getopt::Std;
+
+ # Size constants, in octets.
+ my ($sz_msg) = (512);
+ my ($sz_header, $sz_ptr, $sz_rr_a, $sz_rr_aaaa) = (12, 2, 16, 28);
+ my ($sz_type, $sz_class, $sz_ttl, $sz_rdlen) = (2, 2, 4, 2);
+ my (%namedb, $name, $nssect, %opts, $optz);
+ my $n_ns = 0;
+
+ # -z takes an argument; Getopt::Std stores it in %opts via the reference.
+ getopt('z', \%opts);
+ if (defined($opts{'z'})) {
+     server_name_len($opts{'z'}); # just register it
+ }
+
+ # Accumulate the size of the authority (NS) section.
+ foreach $name (@ARGV) {
+     my $len;
+     $n_ns++;
+     $len = server_name_len($name);
+     print "$name requires $len bytes\n";
+     $nssect += $sz_ptr + $sz_type + $sz_class + $sz_ttl
+         + $sz_rdlen + $len;
+ }
+ print "# of NS: $n_ns\n";
+ arsect(255, $nssect, $n_ns, "maximum");
+ arsect(64, $nssect, $n_ns, "average");
+
+ # Octets needed for a name, using a compression pointer to the longest
+ # suffix already registered in %namedb.
+ sub server_name_len {
+     my ($name) = @_;
+     my (@labels, $len, $n, $suffix);
+
+     $name =~ tr/A-Z/a-z/;
+     @labels = split(/\./, $name);
+     $len = length(join('.', @labels)) + 2;
+     for ($n = 0; $#labels >= 0; $n++, shift @labels) {
+         $suffix = join('.', @labels);
+         return length($name) - length($suffix) + $sz_ptr
+             if (defined($namedb{$suffix}));
+         $namedb{$suffix} = 1;
+     }
+     return $len;
+ }
+
+ # Report how many address records fit after the question and NS sections.
+ sub arsect {
+     my ($sz_query, $nssect, $n_ns, $cond) = @_;
+     my ($space, $n_a, $n_a_aaaa, $n_p_aaaa, $ansect);
+     $ansect = $sz_query + 1 + $sz_type + $sz_class;
+     $space = $sz_msg - $sz_header - $ansect - $nssect;
+     $n_a = atmost(int($space / $sz_rr_a), $n_ns);
+     $n_a_aaaa = atmost(int($space
+         / ($sz_rr_a + $sz_rr_aaaa)), $n_ns);
+     $n_p_aaaa = atmost(int(($space - $sz_rr_a * $n_ns)
+         / $sz_rr_aaaa), $n_ns);
+     printf "For %s size query (%d byte):\n", $cond, $sz_query;
+     printf " only A is considered: ";
+     printf "# of A is %d (%s)\n", $n_a, &judge($n_a, $n_ns);
+     printf " A and AAAA are considered: ";
+     printf "# of A+AAAA is %d (%s)\n",
+         $n_a_aaaa, &judge($n_a_aaaa, $n_ns);
+     printf " preferred-glue A is assumed: ";
+     printf "# of A is %d, # of AAAA is %d (%s)\n",
+         $n_a, $n_p_aaaa, &judge($n_p_aaaa, $n_ns);
+ }
+
+ # Colour code: how many of the $n_ns servers got an address record.
+ sub judge {
+     my ($n, $n_ns) = @_;
+     return "green" if ($n >= $n_ns);
+     return "yellow" if ($n >= 2);
+     return "orange" if ($n == 1);
+     return "red";
+ }
+
+ # Clamp $a to the range 0..$b.
+ sub atmost {
+     my ($a, $b) = @_;
+     return 0 if ($a < 0);
+     return $b if ($a > $b);
+     return $a;
+ }
+
+ 6 - Security Considerations
+
+ The recommendations contained in this document have no known security
+ implications.
+
+ 7 - IANA Considerations
+
+ This document does not call for changes or additions to any IANA
+ registry.
+
+ 8 - Acknowledgement
+
+ The authors thank Peter Koch, Rob Austein, Joe Abley, and Mark Andrews
+ for their valuable comments and suggestions.
+
+ This work was supported by the US National Science Foundation (research
+ grant SCI-0427144) and DNS-OARC.
+
+ 9 - References
+
+ [RFC1034] Mockapetris, P.V., "Domain names - Concepts and Facilities",
+ RFC1034, November 1987.
+
+ [RFC1035] Mockapetris, P.V., "Domain names - Implementation and
+ Specification", RFC1035, November 1987.
+
+ [RFC1123] Braden, R., Ed., "Requirements for Internet Hosts -
+ Application and Support", RFC1123, October 1989.
+
+ [RFC1996] Vixie, P., "A Mechanism for Prompt Notification of Zone
+ Changes (DNS NOTIFY)", RFC1996, August 1996.
+
+ [RFC2181] Elz, R., Bush, R., "Clarifications to the DNS Specification",
+ RFC2181, July 1997.
+
+ [RFC2308] Andrews, M., "Negative Caching of DNS Queries (DNS NCACHE)",
+ RFC2308, March 1998.
+
+ [RFC2671] Vixie, P., "Extension Mechanisms for DNS (EDNS0)", RFC2671,
+ August 1999.
+
+ [RFC4472] Durand, A., Ihren, J., Savola, P., "Operational Considerations
+ and Issues with IPv6 DNS", RFC4472, April 2006.
+
+ 10 - Authors' Addresses
+
+ Paul Vixie
+ Internet Systems Consortium, Inc.
+ 950 Charter Street + Redwood City, CA 94063 + +1 650 423 1301 + vixie@isc.org + + Akira Kato + University of Tokyo, Information Technology Center + 2-11-16 Yayoi Bunkyo + Tokyo 113-8658, JAPAN + +81 3 5841 2750 + kato@wide.ad.jp + + + + + Expires January 2007 [Page 11] + + INTERNET-DRAFT August 2006 RESPSIZE + + + Full Copyright Statement + + Copyright (C) The Internet Society (2006). + + This document is subject to the rights, licenses and restrictions + contained in BCP 78, and except as set forth therein, the authors retain + all their rights. + + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR + IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET + ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, + INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE + INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + + Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in this + document or the extent to which any license under such rights might or + might not be available; nor does it represent that it has made any + independent effort to identify any such rights. Information on the + procedures with respect to rights in RFC documents can be found in BCP + 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an attempt + made to obtain a general license or permission for the use of such + proprietary rights by implementers or users of this specification can be + obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. 
+ + The IETF invites any interested party to bring to its attention any + copyrights, patents or patent applications, or other proprietary rights + that may cover technology that may be required to implement this + standard. Please address the information to the IETF at + ietf-ipr@ietf.org. + + Acknowledgement + + Funding for the RFC Editor function is provided by the IETF + Administrative Support Activity (IASA). + + + + + Expires January 2007 [Page 12] + + |