Project 4

 

Introduction

In this project, you are given a file with a list of random websites. You will run the “traceroute” command against these websites from the linprog servers and analyze the output. The goal is to study the Internet and measure properties such as the distribution of path length and the distribution of path delay. A side benefit is practice with a simple scripting language for analyzing data such as a log file.

The file with the list of websites is provided. It was obtained from http://www.metafilter.com/links.mefi. The list is meant to be purely random, and the appearance of any link in it does not reflect any preference of mine.

You can run a command such as traceroute www.google.com, and the output should contain the number of hops traveled, the names of the routers visited, and the delay to each router. You can use the provided perl script to automatically loop through the websites and write the output to a log file.
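
As an illustration of the kind of parsing involved, here is a minimal Python sketch that extracts the hop number, router name, and probe delays from one line of traceroute output. It assumes the usual Linux traceroute layout (hop number, router name, address in parentheses, then up to three times followed by "ms"); the actual log file may differ, and the name parse_hop_line and the sample router name are made up for this example.

```python
import re

# Assumed layout of a hop line (typical Linux traceroute output):
#  " 1  gw.example.net (192.0.2.1)  0.412 ms  0.398 ms  0.377 ms"
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)")       # hop number, then first token
RTT_RE = re.compile(r"(\d+(?:\.\d+)?)\s+ms\b")  # each "<number> ms" probe time


def parse_hop_line(line):
    """Return (hop_number, router_name, [delays in ms]) or None.

    None is returned for lines that do not start with a hop number,
    such as the "traceroute to ..." header line.
    """
    match = HOP_RE.match(line)
    if match is None:
        return None
    hop = int(match.group(1))
    name = match.group(2)
    if name == "*":
        # The router did not answer the first probe; no name is available.
        name = None
    delays = [float(x) for x in RTT_RE.findall(line)]
    return hop, name, delays
```

A hop whose probes all time out (a line such as " 3  * * *") comes back with an empty delay list, so the analysis code has to decide how to treat it.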

 

Requirements

You should write code, in any language you prefer, to analyze the log file. Your code should compute:

A.   (25 points) The average number of hops (number of routers visited) from a linprog server to the websites.

B.   (25 points) The average delay from a linprog server to the websites, measured in ms.

C.   (25 points) The average link delay between two adjacent routers on the path, measured in ms. Use only the positive delays as some delay measurements may be negative. 

D.   (25 points) The distribution (histogram) of the number of hops (number of routers visited) from a linprog server to the websites.

E.    (Extra 10 points) Provide a figure showing the histogram. The figure can be generated with any software you prefer, such as Matlab, gnuplot, or Excel.
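
The quantities in A through D can be sketched in Python as follows. This is only one possible reading of the requirements, not the reference solution. It assumes each trace has already been reduced to a list with one delay (in ms) per hop, takes the delay to a website to be the delay of the last hop, and takes a link delay to be the difference between the delays of two consecutive hops, keeping only positive differences. The name summarize is hypothetical.

```python
from collections import Counter


def summarize(paths):
    """Compute the statistics for requirements A-D.

    paths: list of traces; paths[i][k] is the delay in ms to hop k+1
    of trace i. Empty traces are ignored.
    Returns (avg_hops, avg_delay, avg_link_delay, histogram), where
    histogram maps a hop count to the number of traces with that count.
    """
    paths = [p for p in paths if p]
    if not paths:
        raise ValueError("no usable traces")

    hop_counts = [len(p) for p in paths]
    avg_hops = sum(hop_counts) / len(hop_counts)

    # Assumption: the delay to a website is the delay of its last hop.
    avg_delay = sum(p[-1] for p in paths) / len(paths)

    # Assumption: a link delay is the difference between consecutive hops;
    # only positive differences are kept, as requirement C asks.
    link_delays = [b - a
                   for p in paths
                   for a, b in zip(p, p[1:])
                   if b - a > 0]
    avg_link = sum(link_delays) / len(link_delays) if link_delays else 0.0

    histogram = Counter(hop_counts)
    return avg_hops, avg_delay, avg_link, histogram
```

For example, summarize([[1.0, 3.0, 2.0], [2.0, 6.0]]) gives an average of 2.5 hops, an average delay of 4.0 ms, and an average link delay of 3.0 ms, because the negative difference 2.0 - 3.0 is dropped. Whether the delay from linprog to the first router should also count as a link delay is left open here.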

 

Input/output of the executable

Your code should be compilable on the linprog servers into a single executable.

1.     It should take one argument which is the name of the log file to analyze.

2.     It should produce two output files:

·        A file named “avg” containing the answers to requirements A, B, and C, in the following format:

Avg number of hops: 11.5

Avg delay: 20.9 ms

Avg link delay: 3.9 ms

·        A file named “histo” containing the histogram of the number of hops, where each row has two numbers: the first is the number of hops, and the second is the number of websites that take this number of hops, such as:

1 0

2 0

3 0

4 0

5 0

6 0

7 0

8 2

9 6
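
The two output files described above could be produced along these lines. This Python sketch assumes one decimal place for the averages and a histogram that starts at 1 hop and lists zero counts, as the examples suggest; the function names and the out_dir parameter are hypothetical.

```python
import os


def format_avg(avg_hops, avg_delay, avg_link):
    """Build the contents of the "avg" file (one decimal place assumed)."""
    return ("Avg number of hops: %.1f\n"
            "Avg delay: %.1f ms\n"
            "Avg link delay: %.1f ms\n" % (avg_hops, avg_delay, avg_link))


def format_histo(histogram):
    """Build the contents of the "histo" file.

    histogram maps a hop count to the number of websites with that count.
    Rows run from 1 up to the largest hop count seen, with zero counts
    included, to match the sample rows above.
    """
    if not histogram:
        return ""
    max_hops = max(histogram)
    return "".join("%d %d\n" % (hops, histogram.get(hops, 0))
                   for hops in range(1, max_hops + 1))


def write_outputs(avg_hops, avg_delay, avg_link, histogram, out_dir="."):
    """Write the "avg" and "histo" files into out_dir."""
    with open(os.path.join(out_dir, "avg"), "w") as f:
        f.write(format_avg(avg_hops, avg_delay, avg_link))
    with open(os.path.join(out_dir, "histo"), "w") as f:
        f.write(format_histo(histogram))
```

The name of the log file would be read from the single command-line argument (sys.argv[1] in Python) before any of this is called.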

Submission

You should submit the following files in a tar file:

1.     The source file(s) that you used to analyze the log file.

2.     A README file that explains how to compile your source file(s) into the executable if needed.  

3.     The figure of the histogram generated according to the posted log file, if you have it.

 

Resource

Here is the tar ball with the two files needed in this project: the list of websites and the perl script. Here is the log file I collected on my machine, which can be used when developing your code. The results may look like:

·        avg

·        histo

There has been some confusion about the link delay. This is the processing result of my code; if you cannot find problems in it by April 25, 2012, we will use my results for grading. Extra points will be given to the first person to report a problem, if any.