This program requires a call to the standard function acos(). To get access to this function, put "use Math::Trig;" at the top of the program.

Start by checking whether any file arguments have been supplied. If not, generate an error message and stop the program (with die()). Otherwise, apply the program for exercise 6.3 to the files by calling the function system(). The arguments of the function should be the same as what you normally type at the command line: "perl" followed by "-w" and "63.pl", all separated by commas. The final argument should be the file list @ARGV.

Next, open the tf-idf file for the first argument (generated by 63.pl), read its contents line by line and store them in a hash %tfidf. Do not process the input file $file itself but its tf-idf file "$file.tfidf". The hash represents a vector, and we also need the length of that vector. You might as well compute this length in the loop that processes the lines of the tf-idf file. The length of the vector (x,y,z) is equal to the square root (sqrt()) of x*x+y*y+z*z. Define a variable $length before opening the file and, in the line-reading loop, add $t*$t to this variable, where $t is the tf-idf score associated with a word. When the complete file has been processed, replace $length by its square root (sqrt()).

After this, start a loop over all remaining files. For each file, open its tf-idf file, read the lines and update two values: the sum of the products of the corresponding values in this file and the first file, and the length of the vector for the current file. Define two variables $product and $length2 before opening the new file. While reading the file, add $t*$tfidf{$w} to $product if $tfidf{$w} exists ($t is the tf-idf score of $w in the current file), and add $t*$t to $length2. When all the lines have been processed, compute the actual length of the current vector by taking the square root of $length2.
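The first steps described above could be sketched as follows. This is only an illustration, not the reference solution: the subroutine names run_63 and read_vector are made up for this sketch, and the code assumes that every line of a tf-idf file holds a word and its score separated by white space. The real format is whatever your 63.pl writes, so adapt the split() call if needed.

```perl
use strict;
use warnings;

# Hypothetical helper: stop if no files were given, otherwise
# run the program of exercise 6.3 on all of them.
sub run_63 {
    my (@files) = @_;
    die "usage: perl -w 64.pl file1 file2 ...\n" unless @files;
    # system() returns 0 when the command succeeded
    system("perl", "-w", "63.pl", @files) == 0
        or die "running 63.pl failed\n";
}

# Hypothetical helper: read one tf-idf file and return a reference
# to the hash with the scores plus the length of the vector.
# ASSUMPTION: each line looks like "word score".
sub read_vector {
    my ($path) = @_;
    my %tfidf;
    my $length = 0;
    open(my $in, "<", $path) or die "cannot open $path: $!\n";
    while (my $line = <$in>) {
        chomp($line);
        my ($w, $t) = split(' ', $line);
        next unless defined $t;    # skip empty or incomplete lines
        $tfidf{$w} = $t;
        $length += $t * $t;        # running sum of squares
    }
    close($in);
    return (\%tfidf, sqrt($length));
}
```

In the main program you would call run_63(@ARGV) once and then read_vector("$ARGV[0].tfidf") for the first file. Whether you use subroutines or write the loops inline, as the text suggests, is a matter of taste.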
Then you can compute the similarity of the current file and the first file by applying acos() to $product divided by the product of the two lengths ($length and $length2). Print the result of this computation together with the name of the current file and continue with the next file.

In the output, there are two extreme values: 0.000 means that the current file contains exactly the same words as the first file (the order of the words does not matter), while 1.571 (pi/2) means that the two files do not share any words. The lower the score of a file, the more similar it is to the first file.

Here are the results of an example run with three files, followed by the files that were involved. You can try running your program on the same files and check whether you obtain the same results.

erikt@stuwww:~ perl -w 64.pl lim1.txt lim2.txt lim3.txt
1.552 lim2.txt
1.547 lim3.txt

erikt@stuwww:~ cat lim1.txt
There was a young man from Japan
Whose limericks never would scan.
When asked why this was,
He answered 'because
I always try to fit as many syllables into the last line as ever possibly I can.'

erikt@stuwww:~ cat lim2.txt
There once was a man from the sticks
Who liked to compose limericks.
But he failed at the sport,
For he wrote 'em too short.

erikt@stuwww:~ cat lim3.txt
There was an old man of St. Bees,
Who was stung in the arm by a wasp;
When they asked, "Does it hurt?"
He replied, "No, it does n't,
But I thought all the while 't was a Hornet!"
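The similarity computation itself can be tried out without any files. The sketch below is an illustration under assumptions, not the official solution: the subroutine name angle and the example scores are invented, and the vectors are passed as hash references instead of being read from tf-idf files. It also adds two safeguards that the exercise text does not ask for: an empty vector is treated as sharing nothing, and the cosine is clipped to the range from -1 to 1, because rounding errors can push it slightly above 1, in which case acos() from Math::Trig would return a complex number.

```perl
use strict;
use warnings;
use Math::Trig;

# Hypothetical helper: angle between two tf-idf vectors,
# each stored as a hash from word to score.
sub angle {
    my ($first, $current) = @_;
    my ($product, $length, $length2) = (0, 0, 0);
    $length += $_ * $_ for values %$first;
    for my $w (keys %$current) {
        my $t = $current->{$w};
        # only words that occur in both vectors add to the product
        $product += $t * $first->{$w} if exists $first->{$w};
        $length2 += $t * $t;
    }
    $length  = sqrt($length);
    $length2 = sqrt($length2);
    # ASSUMPTION: an empty vector shares nothing with the other one
    return pi() / 2 if $length == 0 or $length2 == 0;
    my $cos = $product / ($length * $length2);
    $cos =  1 if $cos >  1;    # guard against rounding errors
    $cos = -1 if $cos < -1;
    return acos($cos);
}

# Invented example scores, only meant to show the two extremes
my %a = (man => 0.5, japan => 1.2);
my %b = (japan => 1.2, man => 0.5);
my %c = (wasp => 0.7);
printf "%.3f same words\n", angle(\%a, \%b);    # prints 0.000
printf "%.3f no shared words\n", angle(\%a, \%c);    # prints 1.571
```

The two printed values correspond to the extremes mentioned above. Real files such as lim1.txt and lim2.txt will end up somewhere in between.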