How lame are our reverse delegations?


Introduction

AFRINIC manages reverse delegations (RDNS) for the IPv4 and IPv6 address space allocated by IANA to AFRINIC.

When resources are issued to network operators, AFRINIC delegates authority for the corresponding reverse zones to those operators, who in turn have to publish the name servers of their reverse zones in the AFRINIC-managed zones. RDNS allows applications on the Internet to map an IP address to a host name and is an important mechanism for many applications, for example mail servers that use it to help prevent spam. It is therefore very important to have accurate data about reverse domains in the AFRINIC WHOIS database, as misconfigurations can have multiple adverse impacts on the robustness of the DNS.
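As a small illustration of the mapping described above, the sketch below derives the reverse DNS name that a PTR lookup would use for an address. It relies only on Python's standard `ipaddress` module; the helper name `reverse_name` and the sample addresses are chosen here for illustration and do not come from AFRINIC's tooling.

```python
import ipaddress


def reverse_name(address: str) -> str:
    """Return the reverse DNS name (in-addr.arpa or ip6.arpa) for an IP address."""
    return ipaddress.ip_address(address).reverse_pointer


# IPv4: the octets are reversed and suffixed with in-addr.arpa.
print(reverse_name("196.216.2.6"))
# IPv6: the 32 hexadecimal nibbles are reversed and suffixed with ip6.arpa.
print(reverse_name("2001:4200::1"))
```

The reverse zones delegated to operators are the parents of such names, for example `2.216.196.in-addr.arpa` for the /24 that contains the first address.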

AFRINIC has around 30000 domain objects with 72000+ NS records. More than 45% of the NS records in IPv4 zones and 32% of the NS records in IPv6 zones are lame. Around 25% of all domain objects are lame.

 

When is your delegation considered "lame"?

RFC 1912 defines a delegation as ‘lame’ when a nameserver is delegated the responsibility for providing nameservice for a zone (via NS records) but is not actually doing so, i.e. the nameserver is not set up as either a primary or a secondary server for the zone. For more granularity, however, we classify ‘lameness’ into four different cases.

Cases of lame delegation

A delegation (RDNS) is considered "lame" if one of the following rules is met:

 

Rule 1
Reason: A nameserver is not reachable
Description: the nameserver is non-responsive.
Example:

dig +norec aaa.bbb.ccc @toto.afrinic.net A
dig: couldn't get address for 'toto.afrinic.net': not found

or

dig +norec @1.1.1.1 afrinic.net a

; <<>> DiG 9.8.3-P1 <<>> +norec @1.1.1.1 afrinic.net a
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

Error message: nameserver <nameserver> is not responsive

Rule 2
Reason: Nameserver not responding on port 53
Description: the nameserver in the domain object is reachable but does not respond on port 53.
Example:

dig +norec @196.216.2.6 afrinic.net a

; <<>> DiG 9.8.3-P1 <<>> +norec @196.216.2.6 afrinic.net a
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

Error message: Nameserver <nameserver> does not answer on port 53

Rule 3
Reason: Nameserver does not serve the domain
Description: the nameserver in the domain object is reachable and responds on port 53, but does not answer for this domain.
Example:

dig +norec @ns1.apnic.net afrinic.net. NS

; <<>> DiG 9.8.3-P1 <<>> +norec @ns1.apnic.net afrinic.net. NS
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 15421
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;afrinic.net. IN NS

;; Query time: 536 msec
;; SERVER: 2001:dc0:2001:0:4608::25#53(2001:dc0:2001:0:4608::25)
;; WHEN: Fri Oct 3 11:38:17 2014
;; MSG SIZE rcvd: 29

Error message: Nameserver <nameserver> does not serve this domain <zone>

Rule 4
Reason: Nameserver is not authoritative
Description: the nameserver in the domain object is reachable, responds on port 53 and returns a valid DNS response, but the authority bit ('aa' flag) is not set.
Example:

dig +norec @8.8.8.8 afrinic.net. NS

; <<>> DiG 9.8.3-P1 <<>> +norec @8.8.8.8 afrinic.net. NS
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60812
;; flags: qr ra; QUERY: 1, ANSWER: 7, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;afrinic.net. IN NS

;; ANSWER SECTION:
afrinic.net. 3563 IN NS ns2.afrinic.net.
afrinic.net. 3563 IN NS sec1.apnic.net.
afrinic.net. 3563 IN NS tinnie.arin.net.
afrinic.net. 3563 IN NS ns1.afrinic.net.
afrinic.net. 3563 IN NS sec1.authdns.ripe.net.
afrinic.net. 3563 IN NS sec3.apnic.net.
afrinic.net. 3563 IN NS ns2.lacnic.net.

;; Query time: 219 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Oct 3 11:35:41 2014
;; MSG SIZE rcvd: 192

Error message: Nameserver <nameserver> is not authoritative for zone <zone>

 

Table 1. Lame case scenarios 

We took the whole set of reverse domains and ran the experiment against each domain and nameserver resource record (NS) tuple. A domain can have multiple NS records; each record is treated as a separate DNS entry whose validity we verified.
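The per-tuple experiment described above could be driven by a loop like the following. This is a minimal sketch, not AFRINIC's actual tooling: the function names, the sample zone and the `example.net` name servers are hypothetical, and the query form (`dig +norec @<nameserver> <zone> NS`) is simply the one used in the examples of Table 1.

```python
import subprocess


def ns_tuples(domains):
    """Yield one (zone, nameserver) tuple per NS record of every domain."""
    for zone, nameservers in domains.items():
        for nameserver in nameservers:
            yield zone, nameserver


def dig_command(zone, nameserver):
    """Build a non-recursive NS query in the same form as the Table 1 examples."""
    return ["dig", "+norec", f"@{nameserver}", zone, "NS"]


def run_dig(zone, nameserver, timeout=30):
    """Run dig and return stdout plus stderr (requires dig and network access)."""
    try:
        done = subprocess.run(dig_command(zone, nameserver),
                              capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # Assumption: a hung query is reported like a connection timeout.
        return "connection timed out (subprocess timeout)"
    return done.stdout + done.stderr


# Hypothetical domain object with two NS records.
sample = {"2.216.196.in-addr.arpa": ["ns1.example.net", "ns2.example.net"]}
for zone, nameserver in ns_tuples(sample):
    print(" ".join(dig_command(zone, nameserver)))
```

Keeping the command builder separate from the runner makes it possible to check the generated queries without sending any traffic.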

Type     Number of domain objects     Number of NS records
IPv4     29894                        72341
IPv6     196                          550
Total    29986                        72891

Table 2. AFRINIC domain objects

We ran the experiment from two different locations (Mauritius and Johannesburg). A delegation is considered ‘lame’ only if it fails from both sites. To simplify the representation of the results, we classified each result into one non-lame category (CASE_0) and three lame categories:

Category Description
CASE_0
  • NS is responsive
  • NS serves the domain
  • NS is authoritative, i.e. the AA flag is present
CASE_1
  • Connection timed out
  • Name or service not known
  • Connection refused
  • Network unreachable
  • Host unreachable
  • End of file
  • Communications error
  • Couldn't get address
CASE_2
  • Response status is REFUSED or SERVFAIL
  • No answer received from server, i.e. ANSWER: 0
CASE_3
  • NS is not authoritative

Table 3. Status of the domain
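The categories in Table 3 can be expressed as a small classifier over the text printed by dig. The sketch below is one possible reading of the table, not the script used for the study: the function names, the regular expressions and the order in which the checks are applied are assumptions made here.

```python
import re

# Error strings listed under CASE_1 in Table 3 (matched case-insensitively).
UNREACHABLE_MARKERS = (
    "connection timed out", "name or service not known", "connection refused",
    "network unreachable", "host unreachable", "end of file",
    "communications error", "couldn't get address",
)
STATUS_RE = re.compile(r"status:\s*([A-Z]+)")
FLAGS_RE = re.compile(r";; flags:([^;]*);")
ANSWER_RE = re.compile(r"ANSWER:\s*(\d+)")


def classify(dig_output: str) -> str:
    """Map raw dig output to CASE_0 .. CASE_3 as described in Table 3."""
    lowered = dig_output.lower()
    if any(marker in lowered for marker in UNREACHABLE_MARKERS):
        return "CASE_1"
    status = STATUS_RE.search(dig_output)
    if status is None:
        # Assumption: output without a DNS header is treated as unreachable.
        return "CASE_1"
    answer = ANSWER_RE.search(dig_output)
    if (status.group(1) in ("REFUSED", "SERVFAIL")
            or answer is None or int(answer.group(1)) == 0):
        return "CASE_2"
    flags = FLAGS_RE.search(dig_output)
    if flags is None or "aa" not in flags.group(1).split():
        return "CASE_3"
    return "CASE_0"


def is_lame(case_site_a: str, case_site_b: str) -> bool:
    """A record is lame only if the check fails from both measurement sites."""
    return case_site_a != "CASE_0" and case_site_b != "CASE_0"


refused = (";; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 15421\n"
           ";; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n")
print(classify(refused))
print(is_lame("CASE_2", "CASE_0"))
```

The unreachable check runs first because a timed-out query produces no header to inspect; the authoritative check runs last because the AA flag is only meaningful once a real answer has been received.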

Using the dig command, we classified the result for each delegation found in the public reverse zones of AFRINIC according to the criteria in Table 3. Each case also has sub-cases: for example, CASE_2 can be tagged as REFUSED, SERVFAIL, NXDOMAIN or NO_ANSWER, depending on the type of error returned in the result.


Figure 1. Error distribution by protocol version

About 45.5% of IPv4 RDNS entries were found to be “problematic”, meaning that at least one of the NS records in the domain object is “LAME”. Table 4 shows the number of lame and non-lame NS records; we then analyse the objects based on the different case scenarios.

Type    OK       %       NOK      %       Total
IPv4    39439    54.5    32970    45.5    72409
IPv6    369      68      174      32      543

Table 4. Percentage of lame vs non-lame NS records
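The percentages in Table 4 follow directly from the OK and NOK counts. A minimal sketch of that arithmetic, with a helper name (`shares`) chosen here for illustration:

```python
def shares(ok: int, nok: int):
    """Return the OK and NOK percentages of all checked NS records, to one decimal."""
    total = ok + nok
    return round(100 * ok / total, 1), round(100 * nok / total, 1)


print("IPv4:", shares(39439, 32970))
print("IPv6:", shares(369, 174))
```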

Error Breakdown per resource type

Table 5 shows the distribution of errors, i.e. how the 45.5% of lame delegations can be classified. We observed that 75.5% are CASE_2 (rule 3 in Table 1: responsive servers that do not serve the zone). Most probably, the nameservers that were registered have since been decommissioned. A further 23.5% of errors are CASE_1 (rules 1 and 2 in Table 1), meaning that the servers are not even reachable. The remaining 1% are CASE_3 (non-authoritative servers).

Type               Case 1            Case 2             Case 3          Total
                   JNB     MRU       JNB      MRU       JNB    MRU
IPv4               7933    7674      24965    24918     298    331      32970
IPv6               18      20        155      155       0      0        174
Total/Mean         7822              25096              314             33144
Avg. percentage    23.5%             75.5%              1%

Table 5. Breakdown per resource type
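The Total/Mean and average percentage rows of Table 5 can be approximately reproduced by summing IPv4 and IPv6 per site, averaging the two sites, and expressing each mean as a share of the sum of the means. This is an assumed reconstruction of the arithmetic, not a documented method; the published means appear to drop the half unit (7822 rather than 7822.5).

```python
# Per-site error counts from Table 5 (IPv4 + IPv6).
jnb = {"Case 1": 7933 + 18, "Case 2": 24965 + 155, "Case 3": 298 + 0}
mru = {"Case 1": 7674 + 20, "Case 2": 24918 + 155, "Case 3": 331 + 0}


def mean_by_case(site_a, site_b):
    """Average the error counts observed from the two measurement sites."""
    return {case: (site_a[case] + site_b[case]) / 2 for case in site_a}


def percentages(means):
    """Express each mean as a percentage of the sum of all means."""
    total = sum(means.values())
    return {case: 100 * value / total for case, value in means.items()}


means = mean_by_case(jnb, mru)
for case, pct in percentages(means).items():
    print(case, means[case], f"{pct:.1f}%")
```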

Error Breakdown per address block

Table 6 and Figure 2 show that CASE_3 (non-authoritative servers) is not really an issue for any of the address blocks managed by AFRINIC. However, there is a high percentage of CASE_2 (more than 40%) on 197/8 and 154/8 as well as on the 2001:4200::/23 IPv6 block.

Address Block      Not lame (CASE_0)    Case 1            Case 2            Case 3
                   #         %          #       %         #       %         #      %
196/8              6599      47.7       2204    16.0      4953    35.8      91     0.7
197/8              5855      35.4       699     4.2       9908    60.0      58     0.4
154/8              1599      41.0       458     11.7      1821    46.7      25     0.6
41/8               15692     63.1       2850    11.5      6238    25.1      104    0.4
102/8              7         100        0       0         0       0         0      0
105/8              431       43.7       -       31.9      239     24.0      2      0.2
Various[*]         9122      74.4       -       11.0      1753    14.3      33     0.3
2001:4200::/23     106       52.3       -       4.5       87      43.1      0      0.0
2c00::/12          263       77.1       -       2.9       68      20.0      0      0.0

Table 6. Breakdown per address block


Figure 2. Error breakdown per address block

Future work

Lame delegation is only a subset of DNS misconfiguration. To ensure full availability, name servers should be truly redundant: primary and secondary name servers should be geographically spread, should not run on the same host and, as far as possible, should not sit on the same network (i.e. they should be in different ASes). If a routing outage makes one network unavailable, the other network remains reachable.

Furthermore, it would be interesting to see where African network operators host their DNS servers. Mapping the servers by location would indicate whether African operators use local or offshore services, the latter usually being reachable over expensive international links.

Cyclic zone dependency is another issue that is less well known but important to tackle, as it creates dependency loops between DNS servers. These loops add unnecessary load on the servers, ultimately affecting overall availability.
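To make the idea of a cyclic zone dependency concrete, the sketch below looks for a loop in a map from each zone to the zones that host its name servers. It is only an illustration of how such a check might work: the data layout, the function name `find_cycle` and the `.example` zones are hypothetical.

```python
def find_cycle(depends_on):
    """Return one dependency cycle as a list of zones, or None if there is none.

    depends_on maps a zone to the zones in which its name server names live.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    colour = {}
    path = []

    def visit(zone):
        colour[zone] = GREY
        path.append(zone)
        for parent in depends_on.get(zone, ()):
            state = colour.get(parent, WHITE)
            if state == GREY:
                # parent is already on the current path: a loop is closed.
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                cycle = visit(parent)
                if cycle:
                    return cycle
        path.pop()
        colour[zone] = BLACK
        return None

    for zone in list(depends_on):
        if colour.get(zone, WHITE) == WHITE:
            cycle = visit(zone)
            if cycle:
                return cycle
    return None


# Hypothetical zones: a.example and b.example host each other's name servers.
zones = {"a.example": ["b.example"], "b.example": ["a.example"], "c.example": []}
print(find_cycle(zones))
```

A depth-first search is used here because a zone that reappears on the current search path is exactly a dependency loop; resolving any zone on that loop requires first resolving another zone on the same loop.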
