You may work with a partner (of your choosing) for this assignment, or you may work alone if you prefer (but don't expect me or one of the assistants to act as a surrogate partner for you). Your partner (if you choose to accept one) must be from your section. If you wish to work with a partner but do not find someone to work with on your own, let me know; if someone else is also looking, I will help you to get connected. I highly recommend finding someone with similar abilities, commitment to the course, etc.
Some of the later assignments will leave more of the design decisions to you. But it is very important to get practice implementing other people's specifications and designs.
Note that it is quite possible to do this assignment incrementally. Each of the required classes may be implemented and initially tested independent of the others.
Most of the credit for this assignment will be based on correctness. But quality counts. In particular, you should try to avoid writing the same code in multiple placesw if it is easy to avoid doing so.
I have provided an abstract class called WordListPrinter
that declares a constructor
public
WordListPrinter(BufferedReader _in, PrintStream _out)
and a single
public method:
public abstract void outputWordList();
When this method is called, it reads all of the words from the input (the
one provided as the first argument to the constructor), and prints to the output
(the one provided as the second argument of the constructor) a sorted list of
these words (with no duplicate words, but note that "sleep" and "sleep." are two
different words). The list should be sorted according to the normal ASCII
ordering provided by String's compareTo method.
For additional information on the WordListPrinter class, see the Javadoc documentation.
In this abstract class, I have provided the concrete method iterateAndOutput that your outputWordList method should call to actually produce the output (this should ensure that everyone's output is identical). Your outputWordList method should not produce any output other than that generated by its call to iterateAndOutput.
You will experiment with using different data structures and algorithms in four different extensions of WordListPrinter:
Finally, you will experiment with supplying different PrintStream objects (some of which you will create) as the second argument of outputWordList, in order to format the output in different ways.
All of the classes that you create should be in the default package. I.e., do not include a package declaration anywhere in your code.
You can use the approach of lines 23-24 of DS page 42 to read the input a line at a time, and then create a StringTokenizer (with space and tab as its delimiters) to process each line a word at a time.
You may want to define one or more additional methods in the abstract class, in order to avoid having to write identical code in each extension class.
In your ArrayListWordListPrinter class, simply add each word to the
end of the theList ArrayList if that word does not duplicate a
word that it already there. When all of the words are there, sort the list using
the Collections.sort method. An alternative is to add all of the input
words, sort the list, and then remove duplicates. Which approach is better? Can
you figure out an
In your SortedArrayListWordListPrinter class, you can check (rapidly, using Collections.binarySearch) to see if a new word is already in theList, and where it should be inserted if it is not there. In this way, the list can be always kept sorted.
In your SortedLinkedListWordListPrinter class, the process (and code) will be almost identical to the SortedArrayListWordListPrinter class. This class is only here for efficiency comparison with the previous class.
The last class, TreeSetWordListPrinter, should have the shortest code, since a TreeSet object takes care of keeping the set sorted and avoiding duplicates.
If the second argument to outputWordList is a "plain old" PrintStream object, e.g., System.out, then all of the words will be output as one line (unless a new line is written automatically when the buffer gets flushed; the JDK documentation does not seem to say whether this ever happens).
You are to write three classes that extend PrintStream:
There are a couple of special considerations. First, words are passed in to print with a space added at the end. You should chop off the space at the end of the last word in a line before you output that line. If the new word fits on the currrent line without its trailing space, but not fit if we include the space, you should include that word on the current line (without the final space, of course).
Second, what if the desired width is so small that a word (e.g. a URL) is longer than the line? In order to avoid making this program inordinately complicated, I specify that in this case you should output the long word on a line by itself, and let that line be longer than the desired width. The alternative would be to split the word somehow, and I choose not to have you do that.
Be careful that the last line gets printed.
The Text Justification algorithm will ensure that the output from your program is both left and right justified when displayed in a mono-spaced font such as Courier. This paragraph is an example of such justification. All the lines of the output from a given run of your program should have the same length, unless a line contains only one word. Then it will be the length of that word.Where should the spaces be added? Adjacent to where there is already one or more spaces. We do not always want to "super-size" the same space, because it will make the text look funny. What you are to do is to randomly choose a "space position" and add the additional space there.
You may want to consider having JustifyingPrintStream extend TextFillPrintStream.
Here is a simple approach to doing the justification: Create an ArrayList (I will call it blankIndex) that contains the positions of all spaces in the line. Choose a random number, use that number as an index into blankIndex, use the number stored there as an indication of where to add an extra space in the output line, add that space, and adjust the later numbers in blankIndex. Repeat this until the line has the correct length.
While this approach is simple, it is not very efficient, due to the overhead of inserting blanks into a string and adjusting blankIndex after each insertion. Perhaps it would be faster to copy the line into a linked list of characters, and let blankIndex be an ArrayList of references to the nodes in the list. Then when we want to insert a blank, we just insert a new node in the middle of the linked list (nothing else has to be moved), and no changes need to be made to blankListitself. After inserting the correct number of blanks, build a new String from the linked list.
You should run some tests in which you compare running times with various WordListPrinter objects. You should use a "generic" PrintWriter for output for all tests, so that you can focus on timing the list-building portion. Try different sized input files with each WordListPrinter class.
Present your data in tabular and/or graphical form. Does the data that you collected correlate with what was predicted by your analysis?
You may find two methods in the java.jang.System class helpful. They are currentTimeMillis and gc (can you see why the latter might be helpful?
If you work with a partner, only one of you should place code in your turnin directory. In the SortedWordLists directory in your turnin directory, place a plain taxt file called partner.txt that contains only one thing: the username of your partner.
You should also submit the printed "analysis and timing" document described in the previous section. This should be turned in during Wednesday's class.
Your code should not generate any superfluous output, such as a prompt for suer input or a headder message such as "Alphabetical Lists of words:" .
My script will:
* cd to your SortedWordList directory.
* Remove
all .class files that are there.
* javac *.java to compile
all source code.
* For each of my test classes
*
Copy my source file to your directory.
* Compile and
execute my code.
* Compare your output to the correct
output. In all cases except the justified
output, your output should exactly match mine.
* Perhaps do some timing tests.
PLEASE MAKE SURE THAT YOU DO NOT SUBMIT CODE THAT GOES INTO AN INFINITE LOOP.
sliderule 4:46pm > more TestArrayListGeneric.java import java.io.*; public class TestArrayListGeneric { public static void main(String args[]) { new TestArrayListGeneric(); } public TestArrayListGeneric() { try { WordListPrinter wl = new ArrayListWordListPrinter(new BufferedReader(new FileReader("words")), System.out); wl.outputWordList(); } catch (FileNotFoundException e) { System.err.println("File not found"); e.printStackTrace(); } } } sliderule 4:46pm > javac TestArrayListGeneric.java sliderule 4:47pm > java TestArrayListGeneric -- Call Cato I If Ishmael. It November Some There This Whenever With a about account ago all almost an and as ball. before bringing but can. cherish circulation. coffin damp, degree, deliberately driving drizzly especially every feelings find flourish for from funeral get grim growing hand hats have having high himself his how hypos in interest into involuntarily is it it, knew knocking little long me me, me. meet; men methodically mind money moral mouth; my myself nearly never no nothing ocean of off on or other, part particular pausing people's philosophical pistol precisely prevent principle purse, quietly rear regulating requires sail same sea see ship. shore, some soon soul; spleen, stepping street, strong substitute such surprising sword; take that the their then, they this. thought throws time to towards up upon upper very warehouses, watery way whenever with world. would years sliderule 4:47pm > more TestTreeSetJustifying.java import java.io.*; public class TestTreeSetJustifying { public static void main(String args[]) { new TestTreeSetJustifying(); } public TestTreeSetJustifying() { try { WordListPrinter wl = new TreeSetWordListPrinter(new BufferedReader(new FileReader("words")), new JustifyingPrintStream(50,System.out)); wl.outputWordList(); } catch (FileNotFoundException e) { System.err.println("File not found"); e.printStackTrace(); } } } sliderule 4:47pm > java TestTreeSetJustifying -- Call Cato I If Ishmael. It November Some There This Whenever With a about account ago all almost an and as ball. before bringing but can. cherish circulation. coffin damp, degree, deliberately driving drizzly especially every feelings find flourish for from funeral get grim growing hand hats have having high himself his how hypos in interest into involuntarily is it it, knew knocking little long me me, me. meet; men methodically mind money moral mouth; my myself nearly never no nothing ocean of off on or other, part particular pausing people's philosophical pistol precisely prevent principle purse, quietly rear regulating requires sail same sea see ship. shore, some soon soul; spleen, stepping street, strong substitute such surprising sword; take that the their then, they this. thought throws time to towards up upon upper very warehouses, watery way whenever with world. would years sliderule 4:50pm > javac TestTreeSetJustifying.java sliderule 4:57pm > java TestTreeSetJustifying -- Call Cato I If Ishmael. It November Some There This Whenever With a about account ago all almost an and as ball. before bringing but can. cherish circulation. coffin damp, degree, deliberately driving drizzly especially every feelings find flourish for from funeral get grim growing hand hats have having high himself his how hypos in interest into involuntarily is it it, knew knocking little long me me, me. meet; men methodically mind money moral mouth; my myself nearly never no nothing ocean of off on or other, part particular pausing people's philosophical pistol precisely prevent principle purse, quietly rear regulating requires sail same sea see ship. shore, some soon soul; spleen, stepping street, strong substitute such surprising sword; take that the their then, they this. thought throws time to towards up upon upper very warehouses, watery way whenever with world. would yearsNote that the extra spaces were inserted in different places in the two runs of the last program.