There are two types of best practices for operators of authoritative servers for critical zones: DNS security, and DNS availability and resilience. In addition to these two categories specific to the core DNS, all operators must pay careful attention to practices related to hardening their core system security.

Practice 1

Authoritative zones MUST be DNSSEC signed and best practices for key management MUST be followed.

Signing a zone using DNSSEC authenticates the origin of the data and protects the data’s integrity. DNSSEC is built on top of public key cryptography. Zone owners use a private key to create digital signatures of the DNS data and publish those signatures in the zone alongside the data. Also published in the zone is the public key that DNSSEC-enabled resolvers use to verify (or validate) the signatures. If a signature matches the published data, then the data hasn’t been corrupted or tampered with on its path from the authoritative servers for the zone, through one or more resolvers, to the final client. The keys used to sign the data are in turn vouched for by the parent zone, which publishes a signed reference to them (the DS record), and so on all the way to the root of the DNS.

Without DNSSEC, DNS data is somewhat static, and days, weeks or even months can go by without updates in a zone. Once DNSSEC is deployed, signatures need to be refreshed regularly because they expire. In addition, the cryptographic key material (public and private keys) used to sign and validate the data must be rotated (“key rollover”) on a regular schedule to reduce the risk of exposure of key material, and to practice the process of an emergency key replacement (which is similar to a normal key rollover).

The above processes require proper monitoring of the DNSSEC signing process, the resulting zone, the expiration of the DNSSEC signatures, and proper handling of key materials. To achieve this, it is strongly suggested that the DNS operator responsible for managing the DNSSEC process put together a DPS, or DNSSEC Practice Statement, as defined in RFC 6841.
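As an illustration of the signature-expiration monitoring mentioned above, the following is a minimal sketch in Python. It assumes the expiration value has already been read from an RRSIG record in its usual YYYYMMDDHHmmSS (UTC) presentation form. The function names and the 7-day threshold are hypothetical and would be set by the operator's own DPS.

```python
from datetime import datetime, timedelta, timezone


def rrsig_time_left(expiration: str, now: datetime) -> timedelta:
    """Return the time remaining before an RRSIG expires.

    `expiration` is the YYYYMMDDHHmmSS (UTC) string shown for RRSIG
    records in zone files and in query tool output.
    """
    expires = datetime.strptime(expiration, "%Y%m%d%H%M%S")
    return expires.replace(tzinfo=timezone.utc) - now


def needs_resigning(expiration: str, now: datetime, min_days: int = 7) -> bool:
    """True when the signature expires in less than `min_days` days.

    min_days is a hypothetical alert threshold, not a recommended value.
    """
    return rrsig_time_left(expiration, now) < timedelta(days=min_days)


# Fixed "now" so the example is reproducible.
now = datetime(2024, 5, 1, tzinfo=timezone.utc)
print(needs_resigning("20240503000000", now))  # True: two days left
print(needs_resigning("20240601000000", now))  # False: 31 days left
```

A check of this kind would normally run on a schedule against every published nameserver, so that a stalled signing process is noticed well before validating resolvers start rejecting the zone.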

ccTLD operators should check the ICANN DNSSEC guidebook for ccTLDs for more information.

Practice 2

Access to zone transfers between authoritative servers MUST be limited. Configure ACLs and TSIG in the authoritative DNS software package to restrict zone transfers to secondary servers only.

Third parties who are not involved in providing secondary service for a zone have no reason to request the entire content of a zone. While DNS data is public, there are some security risks associated with unauthorized parties having access to an entire zone’s content. Also, large zones can cause significant amounts of traffic if zone transfers are repeated multiple times.
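The decision logic behind such a restriction can be sketched as follows. This is not how any particular DNS package implements it. The addresses come from documentation ranges, and `tsig_verified` is a hypothetical flag standing in for the TSIG check that the DNS software itself performs.

```python
import ipaddress

# Hypothetical addresses of the zone's secondary servers.
ALLOWED_SECONDARIES = [
    ipaddress.ip_network("192.0.2.53/32"),
    ipaddress.ip_network("2001:db8::53/128"),
]


def transfer_allowed(client_ip: str, tsig_verified: bool) -> bool:
    """Allow a zone transfer only when both conditions hold:
    the client is a listed secondary AND its request carried a valid
    TSIG signature.
    """
    addr = ipaddress.ip_address(client_ip)
    in_acl = any(addr in net for net in ALLOWED_SECONDARIES)
    return in_acl and tsig_verified


print(transfer_allowed("192.0.2.53", True))    # listed secondary, signed
print(transfer_allowed("192.0.2.53", False))   # listed, but no valid TSIG
print(transfer_allowed("198.51.100.7", True))  # not a listed secondary
```

Requiring both checks means that a spoofed source address alone, or a leaked key used from an unexpected host, is not enough to obtain the zone.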

Practice 3

Zone file integrity MUST be controlled to avoid unexpected modifications (malicious or accidental).

A mechanism should be in place to detect data corruption or unauthorized modification of DNS data (for example, following a suspected security breach) and to determine when such changes occurred.
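One simple form of such a mechanism is to record a cryptographic digest of the zone file after each approved change and compare against it later. The sketch below assumes the zone is kept as a file on disk. The function names and the temporary file are illustrative only.

```python
import hashlib
import tempfile
from pathlib import Path


def zone_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of the zone file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def zone_changed(path: Path, baseline: str) -> bool:
    """True if the file no longer matches the digest recorded at the
    last approved change."""
    return zone_digest(path) != baseline


# Demonstration on a throwaway file with hypothetical zone content.
with tempfile.TemporaryDirectory() as tmp:
    zone = Path(tmp) / "example.zone"
    zone.write_text("www 3600 IN A 192.0.2.1\n")
    baseline = zone_digest(zone)             # stored after an approved change
    unchanged = zone_changed(zone, baseline)

    zone.write_text("www 3600 IN A 192.0.2.66\n")  # unexpected edit
    tampered = zone_changed(zone, baseline)

print(unchanged, tampered)  # False True
```

Running the comparison on a schedule and logging each result narrows down when a change occurred to the interval between two checks. The baseline digests should be stored somewhere an attacker who can modify the zone file cannot also reach.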

Practice 4

Authoritative and recursive DNS service MUST NOT coexist on the same DNS server. In the context of authoritative servers, this means you MUST disable recursive DNS resolution on servers configured to serve authoritative DNS data (if the software allows running both authoritative and recursive at the same time).

DNS software packages such as ISC BIND can be configured to provide both authoritative service and recursive resolution from the same installation. Historically, this was done to avoid the cost and overhead of two separate servers, particularly before virtualization was common. This is still the default behavior in most Windows DNS deployments as well. Clients in Active Directory enabled environments are typically configured to send recursive DNS lookups to DNS servers that by default serve a copy of the organization’s DNS zone. Queries for other domain names are resolved as usual (either directly, or by forwarding queries to an external server).

While this configuration does allow for a simpler setup (and faster lookup times) when looking up names within an organization’s own domain/zone, it has other problems, including the risk of those DNS servers answering queries for “stale” domains that have since been delegated to other nameservers. This risk arises when the DNS servers are publicly reachable from the Internet.
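One way to verify from the outside that recursion is disabled is to look at the RA ("recursion available") bit in the header of a response from the server. The sketch below only parses the header flags. The two sample messages are hand-built headers for illustration, not captured traffic, and the function names are hypothetical.

```python
import struct

# Bit positions in the 16-bit flags field of the DNS header (RFC 1035).
AA_FLAG = 0x0400  # authoritative answer
RA_FLAG = 0x0080  # recursion available


def header_flags(message: bytes) -> int:
    """Extract the flags field from a raw DNS message."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    _msg_id, flags = struct.unpack("!HH", message[:4])
    return flags


def offers_recursion(response: bytes) -> bool:
    """True if the responding server advertises recursive service."""
    return bool(header_flags(response) & RA_FLAG)


# Hand-built 12-byte headers: ID, flags, then the four section counts.
auth_only = struct.pack("!HHHHHH", 0x1234, 0x8400, 1, 1, 0, 0)  # QR + AA
mixed = struct.pack("!HHHHHH", 0x1234, 0x8580, 1, 1, 0, 0)      # QR + AA + RD + RA

print(offers_recursion(auth_only))  # False
print(offers_recursion(mixed))      # True
```

An authoritative-only server is expected to leave the RA bit clear in its responses, so a set RA bit on a published nameserver is a signal worth investigating.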

Practice 5

At least two distinct nameservers MUST be used for any given zone. Note that most TLDs (gTLD, ccTLD, …) require this when registering a domain name.

If the equipment, network or location where your DNS servers are located suffers an outage (software, hardware, network, etc.), this shouldn’t affect the ability of the rest of the Internet to look up data in the affected domains, even if part or all of the servers and services referenced in the affected domains may be unreachable. Some software may behave in unpredictable ways or cause unnecessary timeouts as it attempts to look up information from unreachable DNS servers.

It may be tempting to increase reliability using a load balancer in front of multiple servers, but that usually isn’t practical because it doesn’t easily allow for geographical diversity, introduces complexity, and risks overloading stateful systems in case of DoS or DDoS traffic patterns. Also, in case of a failure of the load balancing system, all DNS services placed behind it become unavailable simultaneously.

Practice 6

There MUST be diversity in the authoritative operations to promote resilience. This MUST cover one or more of the practices below:

  1. Software Diversity: For a given zone, make sure all published nameservers aren’t running the same authoritative DNS software package and version.
  2. Network Diversity: For a given zone, make sure all authoritative servers are not placed within the same Autonomous System (AS) or within the same subnet.
  3. Geographical Diversity: For a given zone, make sure all the authoritative servers are in different physical locations (not in the same rack, room, city, region, or country).

In the event of a vulnerability affecting a DNS software package or an operational issue occurring (e.g., network outage, power outage), there is a risk that all of one’s DNS servers could become unavailable at the same time if the servers’ operational practices and locations are not sufficiently diversified. This is especially true for critical zone operators such as TLDs where the DNS is the service, and everything must be done to maintain service in the case of an outage to avoid major downtime for a large part of the Internet in a country or region.
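The three kinds of diversity can be checked mechanically from an inventory of a zone's nameservers. The sketch below assumes such an inventory exists. The `Nameserver` record, its field names, and the sample data (documentation-range AS numbers, made-up package names) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nameserver:
    name: str
    software: str   # package and version, e.g. "package-a 1.0"
    asn: int        # Autonomous System the server is announced from
    location: str   # site identifier chosen by the operator


def diversity_report(servers):
    """Report, per category, whether the servers are NOT all identical."""
    return {
        "at_least_two": len({s.name for s in servers}) >= 2,
        "software": len({s.software for s in servers}) > 1,
        "network": len({s.asn for s in servers}) > 1,
        "geography": len({s.location for s in servers}) > 1,
    }


servers = [
    Nameserver("ns1.example.", "package-a 1.0", 64496, "site-1"),
    Nameserver("ns2.example.", "package-b 2.3", 64497, "site-1"),
]
report = diversity_report(servers)
print(report)
# {'at_least_two': True, 'software': True, 'network': True, 'geography': False}
```

In this sample the two servers differ in software and network but share a site, so a single power or connectivity failure at that site could still take the whole zone offline.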

Practice 7

Monitoring of the services, servers, and network equipment that make up your DNS infrastructure MUST be implemented.

Monitoring of your DNS service is critical to ensure that it is available to users and customers. This can be achieved through local monitoring (hosted on premises) or remote monitoring (from another location), managed either by yourself or by a third party (outsourced/cloud based).
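A very small availability probe might look like the sketch below: it sends a SOA query for the zone over UDP and reports whether an authoritative, error-free answer came back in time. This is an illustration under simple assumptions (UDP only, one attempt, a hypothetical 2-second timeout), not a substitute for a monitoring system. The function names are made up for this example.

```python
import secrets
import socket
import struct


def build_soa_query(zone: str, query_id: int) -> bytes:
    """Build a minimal DNS query for the zone's SOA record (RD bit clear)."""
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    labels = [label for label in zone.rstrip(".").split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 6, 1)  # QTYPE=SOA, QCLASS=IN


def probe(server_ip: str, zone: str, timeout: float = 2.0) -> bool:
    """True if the server returns an authoritative NOERROR reply in time."""
    query_id = secrets.randbelow(65536)
    query = build_soa_query(zone, query_id)
    family = socket.AF_INET6 if ":" in server_ip else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _peer = sock.recvfrom(4096)
        except OSError:  # includes timeouts and unreachable hosts
            return False
    if len(reply) < 12:
        return False
    reply_id, flags = struct.unpack("!HH", reply[:4])
    is_response = bool(flags & 0x8000)       # QR bit
    is_authoritative = bool(flags & 0x0400)  # AA bit
    rcode = flags & 0x000F                   # 0 means NOERROR
    return reply_id == query_id and is_response and is_authoritative and rcode == 0


# Only the packet construction is exercised here; no network traffic is sent.
sample = build_soa_query("example.org.", 0x1234)
print(len(sample))  # 29
```

Calling `probe("192.0.2.53", "example.org.")` for each published nameserver (address and zone are placeholders) from both an on-premises and a remote vantage point would cover the local and remote monitoring options described above.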