Project 1: Word/letter usage statistics
Due: January 23, 2004
Educational Objectives:
Experience text processing techniques;
experience using makefiles to organize and compile
applications; experience using namespaces.
Statement of Work: Implement a program that collects
statistics on word/letter usage in a file (redirected to the
standard input). A word is defined as a sequence of letters or digits.
Words are separated by non-letter, non-digit characters. For example,
'aaa111:111bbb' contains two words, 'aaa111' and '111bbb'. Your program should
record the number of times each word/letter occurs in the file.
Deliverables: Turn in the files program1.cpp
and makefile using the project1submit.sh
script. You can copy the script from
scripts/project1submit.sh.
Requirements:
- Create a subdirectory called proj1.
At this point you should have six subdirectories of cop4530:
cpp, tcpp, tests, examples, partials, and proj1.
Make sure your code distribution directories are up to date by invoking your
"update" command.
- For this project you need to create two files: program1.cpp
and makefile.
Both files should be placed in the proj1 directory.
- The file program1.cpp should contain the main function,
int main(). In main(), the program should read the
input until it reaches the end, counting the number of times each word/letter is used.
The program should then output the ten most used
letters and the ten most used words, along with the number of times each is used.
The letters and words should be output in descending order of the number of times
they are used in the file. When two letters occur the same number of times in the
file, the letter with the smaller ASCII value should be considered as being used more
frequently. When two words occur the same number of times, the word that occurs
earlier in the file should be considered as being used more frequently.
An example executable (for the program machines), 'proj1.x', is given in the area51 directory.
Your program's output should match that of 'proj1.x'. When printing letters,
use '\t' for tab and '\n' for newline. All other letters should be output normally.
- Write a makefile for your project that compiles
an executable called program1.x.
- You are not allowed to use any C++ container class: You must maintain a simple
container (to record words and their corresponding counters) from scratch.
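
One possible way to meet the container and tie-breaking requirements is sketched below. It is not the design used by 'proj1.x'. The names WordTable, init, destroy, add, and top_k are hypothetical, and the linear search is chosen for simplicity rather than speed. The sketch relies on one observation: if entries are stored in order of first appearance (words) or indexed by character code (letters), then "smaller index wins ties" implements both tie-breaking rules, and a strict '>' comparison during selection achieves that.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical from-scratch table. Entries stay in order of first appearance,
// so a smaller index means the word occurred earlier in the input.
struct WordTable {
    char** words;           // owned copies of each distinct word
    unsigned long* counts;  // counts[i] belongs to words[i]
    std::size_t size;
    std::size_t capacity;
};

void init(WordTable& t) {
    t.words = nullptr;
    t.counts = nullptr;
    t.size = 0;
    t.capacity = 0;
}

void destroy(WordTable& t) {
    for (std::size_t i = 0; i < t.size; ++i) delete[] t.words[i];
    delete[] t.words;
    delete[] t.counts;
    init(t);
}

// Increments the count of word, appending it if it has not been seen yet.
void add(WordTable& t, const char* word) {
    for (std::size_t i = 0; i < t.size; ++i) {
        if (std::strcmp(t.words[i], word) == 0) { ++t.counts[i]; return; }
    }
    if (t.size == t.capacity) {  // grow both arrays by doubling
        std::size_t cap = (t.capacity == 0) ? 16 : t.capacity * 2;
        char** w = new char*[cap];
        unsigned long* c = new unsigned long[cap];
        for (std::size_t i = 0; i < t.size; ++i) {
            w[i] = t.words[i];
            c[i] = t.counts[i];
        }
        delete[] t.words;
        delete[] t.counts;
        t.words = w;
        t.counts = c;
        t.capacity = cap;
    }
    std::size_t len = std::strlen(word);
    t.words[t.size] = new char[len + 1];
    std::memcpy(t.words[t.size], word, len + 1);
    t.counts[t.size] = 1;
    ++t.size;
}

// Writes the indices of up to k nonzero counts into out, largest count first.
// Only a strictly larger count replaces the current best, so ties go to the
// smaller index. Returns the number of indices written.
std::size_t top_k(const unsigned long* counts, std::size_t n,
                  std::size_t k, std::size_t* out) {
    std::size_t written = 0;
    while (written < k) {
        bool found = false;
        std::size_t best = 0;
        for (std::size_t i = 0; i < n; ++i) {
            if (counts[i] == 0) continue;
            bool taken = false;  // skip indices selected in earlier rounds
            for (std::size_t j = 0; j < written; ++j) {
                if (out[j] == i) { taken = true; break; }
            }
            if (taken) continue;
            if (!found || counts[i] > counts[best]) { best = i; found = true; }
        }
        if (!found) break;
        out[written++] = best;
    }
    return written;
}
```

For letters, the same top_k can be applied to a plain array of 256 counters indexed by character code, with k set to 10. The exact output format is not specified here and should be taken from 'proj1.x'.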
Hints:
- You should copy the file examples/makefile.eg into your
proj1 directory, and then modify it to make it work for the project.
Extra credit:
- An extra 5 points will be awarded for each bug found in 'proj1.x'.