                          COMPUTER AND NETWORK
                         SYSTEM  ADMINISTRATION
                         Summer 1996 - Lesson 15
                           Introduction to DNS
A. Introduction - The Domain Name System (DNS)
   1. host name to IP number mapping was originally done
      by downloading a static file
   2. the UNIX version of this file is /etc/hosts
   3. the central file was maintained by the Stanford Research
      Institute Network Information Center (SRI-NIC)
   4. as the Internet grew this scheme became unworkable
      - the size of the file became too large
      - the load on SRI-NIC site became too heavy
      - the file was always inconsistent with reality
      - hostname collisions became frequent (anyone could
        name their machine "whitehouse.gov" if they wanted to)
** Figure 1.1
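   The static-file scheme can be pictured with a small sketch. It assumes
   the usual /etc/hosts layout (address, canonical name, optional aliases,
   "#" comments); the sample text and the parse_hosts() helper are
   illustrative only, not part of any real tool.

```python
def parse_hosts(text):
    """Build a name -> address table from hosts-file text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # First entry wins; a later duplicate would be a name collision.
            table.setdefault(name.lower(), address)
    return table


# Illustrative sample; the nu.cs.fsu.edu address is the one used later in these notes.
SAMPLE = """\
# address        canonical name     aliases
127.0.0.1        localhost
128.186.121.10   nu.cs.fsu.edu      nu
"""

hosts = parse_hosts(SAMPLE)
print(hosts["nu"])
```

   Every host needs its own complete copy of the table, which is why the
   scheme stopped scaling.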
B. Overview
  1. In 1984 Paul Mockapetris of USC designed the architecture
     of DNS
  2. the Domain Name System is essentially a distributed database
     whose top-level domains are managed by InterNIC.
  3. Features
     - local control: each segment is updated locally
     - global access: each segment is available (almost) immediately 
       to the rest of the world upon update
     - robustness: achieved through replication
     - adequate performance: is achieved through caching
  4. software
     - servers: called name servers, contain information about
                some segment of the network and make it available to
                clients ("BIND" = "Berkeley Internet Name Domain",
                includes "named", libraries, "nslookup", "dig", "host")
     - client:  resolvers, a set of library routines that
                resolve names by accessing a server (originally
                a separate library, like libresolv.a, now
                usually part of libc.a)
C. Domain structure 
** Figure 1.2
   - similar to the structure of a hierarchical file system
   - the root's name is the null label "" but is written
     as a single dot "."
   - each node represents a 'domain'
   - every domain is named
   - the full domain name is the sequence of labels from the 
     domain to the root, separated by periods
   - unlike a file system pathname the name is read from leaf
     to root (right to left rather than left to right)
		delta.cs.fsu.edu
** Figure 1.3
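   A minimal sketch of the leaf-to-root reading order, using the example
   name above. The two helper functions are hypothetical names chosen for
   illustration.

```python
def labels_leaf_to_root(name):
    """Split a domain name into labels, leaf first, ending with the root's null label."""
    stripped = name.rstrip(".")
    labels = stripped.split(".") if stripped else []
    return labels + [""]          # "" is the root's null label


def path_root_to_leaf(name):
    """Same labels in file-system order (root first), for comparison."""
    return list(reversed(labels_leaf_to_root(name)))


leaf_first = labels_leaf_to_root("delta.cs.fsu.edu")
root_first = path_root_to_leaf("delta.cs.fsu.edu")
print(leaf_first)
print(root_first)
```

   The first list is how DNS writes the name; the second is how a file
   system would write the same path.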
D. Domain management
   - each domain may be managed by a different organization
   - the organization may divide itself into subdomains
   - then delegate responsibility for maintaining them
   - NIC manages the top-level domains
       
   - example: NIC delegates the "fsu.edu" domain to ACNS
** Figure 1.4
E. Host names
   1. each host on a network has a domain name
   2. the domain name points to information about the host
   3. this may include:
      -  an IP address
      -  mail routing information (different than for other
         services)
      - aliases which point to the real ("canonical") host name
** Figure 1.5
F. Name collisions
   1. the possibility for name collisions is now greatly
      reduced
** Figure 2.1
G. The domain name space
   - there may be any number of branches at a node
   - BIND's implementation limits the tree's depth
     to 127 levels
 "davidsun.kuncicky.faculty.cs.fsu.edu.us.northamerica.earth.solarsystem.milky way....."
   - each label (the name of a single node) may contain up to 63 characters
   - the suggested length is < 12
   - a domain name that is written relative to the root is
     called a 'fully-qualified domain name' - FQDN
   - names without a trailing dot are sometimes interpreted as
     relative to some domain other than root (the trailing dot
     plays the role of the leading slash in a file system path)
** Figure 2.2
   - sibling nodes must have unique names
** Figure 2.3, 2.4
   - the name of a domain is the domain name of the node at the
     top of the domain (example purdue.edu)
   - again, similar to a file system
** Figure 2.5
   - a node may be in multiple domains (delta.cs.fsu.edu is in
     cs.fsu.edu, fsu.edu, and edu)
   - so, a domain is just a subtree of the domain name space; a domain
     inside another domain is its 'subdomain'
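   The limits and terms in this section can be checked mechanically. The
   sketch below assumes the 63-character label limit and the 127-level
   depth limit quoted above; check_name() and enclosing_domains() are
   hypothetical helpers, not BIND routines.

```python
MAX_LABEL = 63    # characters per label, as quoted in the notes
MAX_DEPTH = 127   # levels, as quoted in the notes


def is_fqdn(name):
    """A name written relative to the root ends with a trailing dot."""
    return name.endswith(".")


def split_labels(name):
    """Labels of a name, leaf first, ignoring one trailing dot."""
    stripped = name[:-1] if name.endswith(".") else name
    return stripped.split(".") if stripped else []


def check_name(name):
    """Return a list of problems found in the name (empty list means it looks fine)."""
    labels = split_labels(name)
    problems = []
    if len(labels) > MAX_DEPTH:
        problems.append("too many levels")
    for label in labels:
        if not label:
            problems.append("empty label")
        elif len(label) > MAX_LABEL:
            problems.append("label too long: " + label[:10] + "...")
    return problems


def enclosing_domains(name):
    """Every domain that contains the node, from the node itself up to the root."""
    labels = split_labels(name)
    return [".".join(labels[i:]) for i in range(len(labels))] + ["."]


print(is_fqdn("delta.cs.fsu.edu."))
print(check_name("delta.cs.fsu.edu."))
print(enclosing_domains("delta.cs.fsu.edu"))
```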
H. Hosts
   - where are the hosts?
   - a domain name is just an index into the DNS database
   - the 'hosts' are domain names that point to individual
     machine information
   - the hosts are related 'logically', usually by geography
     or organization
   - they are NOT necessarily related by network or IP address
     or hardware type
   - you could have 10 different hosts on 10 different networks
     in ten different countries all in the same domain (hp.com)
   - nodes at the leaves of the tree usually represent individual hosts
** Figure 2.6
   - interior nodes may point to both:
     > host information
     > subdomain information
   - for example, "hp.com" is both the name of a domain and the name
     of a machine that routes mail
I. The Domain name space
  1. terms:
     top-level domain: a child of root (edu)
     first-level domain: another name for a top-level domain (edu)
     second-level domain: a child of 1st level domain
        (fsu.edu)
  2. naming rules
   - there are not many rules imposed on the naming of domains
   - there are some traditions at the top-level
   - the original 7 top-level domains are:
     com  - commercial organizations
     edu  - educational organizations
     gov  - government bodies
     mil  - military organizations
     net  - networking organizations (nsf.net)
     org  - non-commercial organizations
     int  - international organizations
   - what about the rest of the world?
     
   - the Internet began as ARPANET, funded by the Defense
     Advanced Research Projects Agency (the military)
   - it was later funded by the National Science Foundation
   - no one anticipated the international success of the Internet
  3. international names
   - 2-letter designations are reserved for each country
   - ex:  DE - Germany
          DK - Denmark
    
   - each country may organize its domain space however it wishes
     ex:  Australia uses:  edu.au and com.au
     ex:  Britain uses:    co.uk - corporations
                           ac.uk - academic community
     ex:  USA uses states: fl.us
              then cities: tlh.fl.us  
J. Name servers
  1. zones
   - a program that stores information about the domain name space
     is called a Domain Name Server
   - a name server generally has complete information about some part
     of the domain name space
   - the subspace is called a 'zone'
   - the server is said to have 'authority' for one or more zones
   - what is the difference between a zone and a domain?
     [ A domain may be composed of one or more zones, but not vice versa ]
** Figure 2.7
** Figure 2.8
** Figure 2.9
  2. types of name servers
   - primary master
     > gets the data for its zones from flat data (text) files
   - secondary master
     > gets the data for its zone from another server
     > it periodically updates its local data by copying the
       primary master's files
     > this is called a 'zone transfer'
   - generally keep more than one name server for any given zone
     > redundancy: fault tolerance
     > load: localize it as much as possible
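   A rough sketch of how a secondary might keep in step with its primary.
   It assumes the secondary compares zone serial numbers before copying
   (the "serial" and "refresh" fields appear in the nslookup output later
   in this lesson). ToyServer and refresh() are made-up names, and the
   records are a plain dictionary rather than real zone data.

```python
class ToyServer:
    """Holds one zone's data in memory; a stand-in for a real name server."""

    def __init__(self, zone_name, serial=0, records=None):
        self.zone_name = zone_name
        self.serial = serial
        self.records = dict(records or {})


def refresh(secondary, primary):
    """Copy the primary's zone if its serial is newer. Returns True if a transfer happened."""
    if primary.serial > secondary.serial:
        secondary.records = dict(primary.records)   # the 'zone transfer'
        secondary.serial = primary.serial
        return True
    return False


# Hypothetical zone contents; the serial follows the YYYYMMDDnn habit seen in the notes.
primary = ToyServer("cs.fsu.edu", serial=1996061302,
                    records={"nu.cs.fsu.edu": "128.186.121.10"})
secondary = ToyServer("cs.fsu.edu")

first = refresh(secondary, primary)    # secondary is behind, so it copies
second = refresh(secondary, primary)   # already current, nothing to do
print(first, second, secondary.records)
```

   A real secondary would run this check once per refresh interval.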
K. Resolvers
  1. name service clients
   - these are the clients that access name servers
   - in BIND these are a set of library routines
   - these are compiled into telnet, ftp, etc. so that
     these programs will use DNS to resolve names ("gethostbyname()" and others)
  2. duties of a simple resolver:
   - a simple resolver is called a 'stub resolver'
   - querying a name server
   - interpreting the response
   - resending the query if no response arrives
   - returning a reply to the program that it is servicing (ftp)
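   From a program's point of view the resolver is just a library call.
   The sketch below uses Python's socket.gethostbyname(), which wraps the
   system resolver; the resolve() wrapper is a hypothetical convenience,
   and a dotted address is used in the demonstration so that no name
   server has to be reachable.

```python
import socket


def resolve(name):
    """Ask the system resolver for an IPv4 address; return None on failure."""
    try:
        return socket.gethostbyname(name)
    except OSError:          # socket.gaierror and socket.herror are subclasses
        return None


# A dotted-quad string is returned unchanged, without any lookup.
print(resolve("127.0.0.1"))
```

   Passing a host name instead sends the query through whatever name
   service the local machine is configured to use.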
            
L. Resolution
  1. how does the name server resolve names
   - if the name is in the name server's zone then it can give
     the resolver an immediate 'authoritative' response
   - if not, then the name server must search the domain name space
     for an answer
   - it only needs one piece of information to get started: the location
     of a root-level server
 
  2. root name servers
   - the root name servers are authoritative for the top-level domains
     (edu, org, us, dk, etc.)
   - they can point you to the name servers for each of the top-level
     domains
   - they, in turn can point you to their subdomains, etc. until the
     name is resolved
   - this scheme puts a lot of importance on the root-level servers
** Figure 2.10
   - example of name resolution via "nslookup"
	nslookup
	> set debug
	> ftp.es.net
Server:  dns.scri.fsu.edu
Address:  144.174.128.17
------------
Got answer:
    HEADER:
        opcode = QUERY, id = 3, rcode = NXDOMAIN
        header flags:  response, authentic answer, want recursion, recursion available
        questions = 1,  answers = 0,  authority records = 1,  additional = 0
    QUESTIONS:
        ftp.es.net.scri.fsu.edu, type = A, class = IN
    AUTHORITY RECORDS:
    ->  scri.fsu.edu
        ttl = 86400 (1 day)
        origin = dns.scri.fsu.edu
        mail address = hays.mailer.scri.fsu.edu
        serial = 1996061302
        refresh = 3600 (1 hour)
        retry   = 900 (15 mins)
        expire  = 3600000 (41 days 16 hours)
        minimum ttl = 86400 (1 day)
------------
------------
Got answer:
    HEADER:
        opcode = QUERY, id = 4, rcode = NOERROR
        header flags:  response, want recursion, recursion available
        questions = 1,  answers = 2,  authority records = 4,  additional = 6
    QUESTIONS:
        ftp.es.net, type = A, class = IN
    ANSWERS:
    ->  ftp.es.net
        canonical name = nic3.es.net
        ttl = 3556 (59 mins 16 secs)
    ->  nic3.es.net
        internet address = 198.128.3.83
        ttl = 3600 (1 hour)
    AUTHORITY RECORDS:
    ->  ES.net
        nameserver = NS1.es.net
        ttl = 168588 (1 day 22 hours 49 mins 48 secs)
    ->  ES.net
        nameserver = DNS-EAST.es.net
        ttl = 168588 (1 day 22 hours 49 mins 48 secs)
    ->  ES.net
        nameserver = DNS-WEST.NERSC.GOV
        ttl = 168588 (1 day 22 hours 49 mins 48 secs)
    ->  ES.net
        nameserver = IGW.NERSC.GOV
        ttl = 168588 (1 day 22 hours 49 mins 48 secs)
    ADDITIONAL RECORDS:
    ->  NS1.es.net
        internet address = 198.128.2.10
        ttl = 168588 (1 day 22 hours 49 mins 48 secs)
    ->  DNS-EAST.es.net
        internet address = 134.55.6.130
        ttl = 168588 (1 day 22 hours 49 mins 48 secs)
    ->  DNS-WEST.NERSC.GOV
        internet address = 128.55.128.191
        ttl = 168588 (1 day 22 hours 49 mins 48 secs)
    ->  DNS-WEST.NERSC.GOV
        internet address = 128.55.32.12
        ttl = 168588 (1 day 22 hours 49 mins 48 secs)
    ->  IGW.NERSC.GOV
        internet address = 128.55.32.9
        ttl = 3600 (1 hour)
    ->  IGW.NERSC.GOV
        internet address = 128.55.128.151
        ttl = 3600 (1 hour)
------------
Non-authoritative answer:
Name:    nic3.es.net
Address:  198.128.3.83
Aliases:  ftp.es.net
  
  3. recursion
   - note that the first name server made multiple requests
   - the others simply referred the first server to another machine
   - the local server was responding to a 'recursive query'
   - this is because the local 'dumb' resolver is not smart enough to
     follow any referrals
   - a recursive query places most of the work on a single name server
   - when a recursive query is made the name server is obliged to go find the
     answer or return an error message
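   The referral-chasing done by the local server can be imitated with a
   toy, in-memory model. Everything here is an assumption made for
   illustration: the SERVERS table, the "root" and "net-server" names,
   and both functions. Only the nic3.es.net address and the ns1.es.net
   name come from the nslookup output above. No network traffic is
   involved.

```python
# Each toy server knows some answers and some referrals (zone -> next server).
SERVERS = {
    "root":       {"answers": {}, "referrals": {"net": "net-server"}},
    "net-server": {"answers": {}, "referrals": {"es.net": "ns1.es.net"}},
    "ns1.es.net": {"answers": {"nic3.es.net": "198.128.3.83"}, "referrals": {}},
}


def ask(server, name):
    """One query to one server: ('answer', address), ('referral', server) or ('error', None)."""
    data = SERVERS[server]
    if name in data["answers"]:
        return "answer", data["answers"][name]
    labels = name.split(".")
    for i in range(len(labels)):             # try the longest matching zone first
        zone = ".".join(labels[i:])
        if zone in data["referrals"]:
            return "referral", data["referrals"][zone]
    return "error", None


def resolve_recursively(name, start="root", max_hops=10):
    """Do all the work on behalf of a stub resolver: follow referrals until done."""
    server, trail = start, []
    for _ in range(max_hops):
        trail.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, trail
        if kind == "error":
            return None, trail
        server = value                       # follow the referral
    return None, trail


address, trail = resolve_recursively("nic3.es.net")
print(address, trail)
```

   The trail shows the single local server making all of the requests
   while the other servers only hand back referrals.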
  4. iteration
   - iterative resolution is the method used when a name server
     receives an 'iterative query'
   - in this case, if the server does not have the authoritative
     answer it returns:
     > a cached answer (we'll talk about this in a minute)
     > the names and addresses of several servers that are closer
       to the answer
** Figure 2.11
	nslookup
	> server ns1.es.net
	> ftp.es.net
Server:  ns1.es.net
Address:  198.128.2.10
Name:    nic3.es.net
Address:  198.128.3.83
Aliases:  ftp.es.net
M. Mapping addresses to names
  1. what if you have an IP number and want to find the host name?
     - this is useful to make output more readable
     - used for some security checks
  2. this was easy with the old /etc/hosts tables
  3. the DNS data is indexed by name
     - could do an exhaustive search
  4. the clever solution
     - create a part of the domain name space that uses
       addresses as names
     - this is: in-addr.arpa
** Figure 2.12
     - for example type:
       nslookup
       >set type=any
       >10.121.186.128.in-addr.arpa
       returns:
       nu.cs.fsu.edu
       As we saw earlier, though, newer "nslookup" versions will automatically
       build the reverse name lookup (like on Linux):
       nslookup
       > 128.186.121.10
       Server:  dns.scri.fsu.edu
       Address:  144.174.128.17
       Name:    nu.cs.fsu.edu
       Address:  128.186.121.10
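       The reverse name is built by reversing the octets, because domain
       names are read leaf to root while IP addresses put the most
       significant part first. A small sketch follows; reverse_name() is a
       hypothetical helper, and the standard ipaddress module is used only
       to cross-check it.

```python
import ipaddress


def reverse_name(ip):
    """Turn a dotted IPv4 address into its in-addr.arpa domain name."""
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError("not a dotted IPv4 address: " + repr(ip))
    return ".".join(reversed(octets)) + ".in-addr.arpa"


name = reverse_name("128.186.121.10")
print(name)

# Cross-check against the standard library's own construction.
assert name == ipaddress.ip_address("128.186.121.10").reverse_pointer
```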
N. Caching
   1. each time a local name server processes a recursive query
      it learns a lot of information
   2. this is cached which speeds up successive queries
** Figure 2.14
   3. example:  
      - say our server has already looked up the address of
        eecs.berkeley.edu
      - this means it has cached the name servers for both:
        eecs.berkeley.edu
        berkeley.edu
      - if we now make a query for baobab.cs.berkeley.edu
        the local server can skip the root-level query and
        go right to berkeley.edu
   4. time to live (TTL)
      - TTL is the amount of time that information is cached
        before it is discarded
      - the trade-off is between consistency and performance
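   A minimal sketch of a TTL-limited cache, including the "skip the root"
   shortcut from the berkeley.edu example. The TTLCache class, the
   injected clock, and the server names stored in it are all assumptions
   made for illustration; a real name server's cache is far more involved.

```python
import time


class TTLCache:
    """Remember values by name until their time to live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}                    # name -> (value, expiry time)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires = entry
        if self._clock() >= expires:          # stale: discard it
            del self._entries[name]
            return None
        return value

    def closest_zone(self, name):
        """Longest cached suffix of the name, or '.' if we must start at the root."""
        labels = name.rstrip(".").split(".")
        for i in range(len(labels)):
            zone = ".".join(labels[i:])
            if self.get(zone) is not None:
                return zone
        return "."


now = [0.0]                                   # fake clock so the demo is repeatable
cache = TTLCache(clock=lambda: now[0])
cache.put("berkeley.edu", ["hypothetical-ns-1"], ttl=86400)
cache.put("eecs.berkeley.edu", ["hypothetical-ns-2"], ttl=3600)

before = cache.closest_zone("baobab.cs.berkeley.edu")
now[0] = 90000.0                              # more than a day later
after = cache.closest_zone("baobab.cs.berkeley.edu")
print(before, after)
```

   A longer TTL saves more queries but keeps stale data around longer,
   which is the consistency versus performance trade-off.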