created 05/08/00


Programming Exercises

Exercise 1 — C-style Loop

Modify the first example in the chapter so that it uses the C-style input loop.
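The chapter's first example is not reproduced on this page, so the following is only a stand-in sketch of the C-style input loop idiom. The class name CStyleLoopDemo and the method echoLines() are made up for illustration. The point is the loop header, where the read and the end-of-input test happen in a single expression.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class CStyleLoopDemo {
    // Echoes every line from the reader and returns the number of lines seen.
    static int echoLines(BufferedReader in) throws IOException {
        int count = 0;
        String line;
        // C-style loop: assign the next line and test for end of input in one expression.
        while ((line = in.readLine()) != null) {
            System.out.println(line);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for a file so the sketch runs without any input file.
        BufferedReader in = new BufferedReader(new StringReader("first\nsecond\nthird"));
        System.out.println(echoLines(in) + " lines read");
    }
}
```

The parentheses around the assignment are required, because != binds more tightly than =.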

Exercise 2—Shorter File Copy Program

The file copy example program in the chapter is longer than it needs to be. Make it shorter by using fewer try{} blocks. (This will result in less specific error messages.)
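One possible shape for the shortened program is sketched below. It is not the chapter's program: the class name ShortCopy is hypothetical, and the sketch assumes a Java version that supports the try-with-resources statement (Java 7 or later), which closes both streams automatically. A single catch clause then handles every failure, which is exactly why the error message becomes less specific.

```java
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

class ShortCopy {
    // Copies the source file to the destination one character at a time.
    // Any failure (open, read, write, or close) surfaces as an IOException.
    static void copy(String source, String destination) throws IOException {
        try (FileReader in = new FileReader(source);
             FileWriter out = new FileWriter(destination)) {
            int ch;
            while ((ch = in.read()) != -1) {
                out.write(ch);
            }
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("Usage: java ShortCopy source destination");
            return;
        }
        try {
            copy(args[0], args[1]);
        } catch (IOException e) {
            // One handler for all failures, so the message cannot say which step failed.
            System.out.println("Copy failed: " + e.getMessage());
        }
    }
}
```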

Exercise 3—GUI File Copy

Write a file copy program that uses a graphical user interface. The user enters the name of the file to copy and the name of the destination file in TextFields and clicks a button to perform the copy. Write error messages into another TextField.

For a nicer program, look in your documentation for the FileDialog class. Include it in the GUI so the user can choose the source file graphically.

Exercise 4—Reading Random Integer Data

Write a class that reads data from a file containing integers in character format. This class will be a software tool that other programs can use to simplify their own input. The constructor for the class will use the name of the input file as a parameter, and will create the appropriate streams. Write a close() method and a method int getNextInt() that returns the value of the next integer from the stream. When an error is encountered, these methods will write an error message and stop the program.

The input file may have none, one, or several integers per line. A suitable file could be created by the program for Exercise 1 of the previous chapter. The class will work with any number of integers total, and a varying number of integers per line.

Use BufferedReader, FileReader, parseInt(), and StringTokenizer.
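The outline below is a minimal sketch of how these four pieces might fit together, not a finished solution. The class name IntFileReader and the helper methods hasNextInt() and fail() are assumptions added for illustration; the exercise itself only asks for the constructor, close(), and getNextInt(). The key idea is to keep one StringTokenizer per input line and refill it whenever it runs out, which copes with empty lines and with any number of integers per line.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;

class IntFileReader {
    private BufferedReader in;
    private StringTokenizer tokens = new StringTokenizer(""); // starts out empty

    IntFileReader(String fileName) {
        try {
            in = new BufferedReader(new FileReader(fileName));
        } catch (IOException e) {
            fail("Cannot open " + fileName);
        }
    }

    // Hypothetical helper: true if another token is available.
    // Reads further lines until one contains a token or the file ends.
    boolean hasNextInt() {
        try {
            while (!tokens.hasMoreTokens()) {
                String line = in.readLine();
                if (line == null) return false;
                tokens = new StringTokenizer(line);
            }
        } catch (IOException e) {
            fail("Read error: " + e.getMessage());
        }
        return true;
    }

    int getNextInt() {
        if (!hasNextInt()) fail("No more integers in the file");
        String token = tokens.nextToken();
        try {
            return Integer.parseInt(token);
        } catch (NumberFormatException e) {
            fail("Not an integer: " + token);
            return 0; // never executed, because fail() stops the program
        }
    }

    void close() {
        try {
            in.close();
        } catch (IOException e) {
            fail("Cannot close the input file");
        }
    }

    // Writes an error message and stops the program, as the exercise requires.
    private static void fail(String message) {
        System.err.println(message);
        System.exit(1);
    }

    public static void main(String[] args) {
        if (args.length != 1) fail("Usage: java IntFileReader inputFile");
        IntFileReader reader = new IntFileReader(args[0]);
        while (reader.hasNextInt()) {
            System.out.println(reader.getNextInt());
        }
        reader.close();
    }
}
```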

Test your class by using it in a program that writes the integers from the input file on the monitor, one per line. Once the class is debugged, there are many other programs you can write that use it.

Exercise 5 — HTML Java Source Code Reserved Word Highlighter

Write a program that inputs a Java source code file and outputs a copy of that file with Java keywords surrounded with HTML tags for bold type. For example, this input:

public class JavaSource
{
  public static void main ( String[] args )
  {
    if ( args.length == 3 )
      new BigObject();
    else
      System.out.println("Too few arguments.");
  }
}

will be transformed into:

<b>public</b> <b>class</b> JavaSource
{
  <b>public</b> <b>static</b> <b>void</b> main ( String[] args )
  {
    <b>if</b> ( args.length == 3 )
      <b>new</b> BigObject();
    <b>else</b>
      System.out.println("Too few arguments.");
  }
}

In a browser the code will look like this:

public class JavaSource
{
  public static void main ( String[] args )
  {
    if ( args.length == 3 )
      new BigObject();
    else
      System.out.println("Too few arguments.");
  }
}
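The per-line transformation shown above could be sketched as follows. This is an illustration under stated assumptions, not a complete solution: the class name KeywordHighlighter is hypothetical, the keyword set is deliberately partial, and the delimiter string is a guess at what separates words in Java source. The sketch asks StringTokenizer to return the delimiters as tokens so that the original spacing and punctuation are copied to the output unchanged.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;

class KeywordHighlighter {
    // Only a handful of reserved words; a real program would list all of them.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "public", "class", "static", "void", "if", "else", "new",
        "return", "int", "while", "for"));

    // Characters assumed to separate words in Java source text.
    private static final String DELIMS = " \t\n\r(){}[];,.=+-*/<>!&|\"'";

    // Returns the line with each keyword token wrapped in <b>...</b>.
    static String highlightLine(String line) {
        StringBuilder out = new StringBuilder();
        // returnDelims = true: delimiters come back as one-character tokens.
        StringTokenizer st = new StringTokenizer(line, DELIMS, true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (KEYWORDS.contains(token)) {
                out.append("<b>").append(token).append("</b>");
            } else {
                out.append(token);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlightLine("  public static void main ( String[] args )"));
    }
}
```

This simple approach also highlights keywords that appear inside string literals and comments, and it does not escape characters such as < that have a special meaning in HTML.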

Exercise 6 — HTML Filter

Any text editor, such as Notepad, can be used to create web pages. Unfortunately, these editors usually do not check spelling. Word processors can open a text file and check its spelling, but when a file is sprinkled with HTML tags the tags are all flagged as errors, and the real spelling errors are hard to see. This exercise is to write a utility that strips the HTML tags from a text file.

Write a program that reads in a text file and writes out another text file. The input file may have any number of HTML tags per line. The output file will be a copy of the input file but with spaces substituted for each HTML tag. The program will not check HTML syntax; it looks at the file as a stream of tokens and substitutes spaces for each token that is a tag. For this program, an HTML tag is any token that looks like one of these:

<Word>       </Word>

Assume that Word is a single word (perhaps just one letter or no letters) and that there are no spaces between the left and right angle brackets. With this definition, the following are tags:

<p>       </p>     <em>     </em>
<rats>       </1234>     <blockquote>     </>

With this definition, the following are NOT tags (although some of them are tags in real HTML):

< p>       </ p>     <em >     </e m>
<table border cellpadding=5>     <block quote>     < /em>
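The simple definition can be checked one token at a time. The sketch below is one way to do that, with hypothetical names (SimpleTagFilter, isTag(), filterLine()). It assumes that a Word consists only of letters and digits and that each tag is replaced by a single space. Whitespace is returned as tokens so the rest of the line keeps its spacing.

```java
import java.util.StringTokenizer;

class SimpleTagFilter {
    // True for tokens of the form <Word> or </Word>,
    // where Word is zero or more letters or digits.
    static boolean isTag(String token) {
        if (token.length() < 2 || !token.startsWith("<") || !token.endsWith(">")) {
            return false;
        }
        int start = token.startsWith("</") ? 2 : 1;
        for (int i = start; i < token.length() - 1; i++) {
            if (!Character.isLetterOrDigit(token.charAt(i))) return false;
        }
        return true;
    }

    // Copies the line, substituting one space for each token that is a tag.
    static String filterLine(String line) {
        StringBuilder out = new StringBuilder();
        // returnDelims = true: spaces and tabs come back as tokens and are copied as they are.
        StringTokenizer st = new StringTokenizer(line, " \t", true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            out.append(isTag(token) ? " " : token);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(filterLine("<p> Some <em> text </em> here </p>"));
    }
}
```

Because the program works on whitespace-separated tokens, a tag that touches a word, as in <em>text</em>, is a single token that does not match the definition and is left alone.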

Challenging Exercise: Write the program to filter out any tag that looks like one of these:

<Word .... >       </Word ... >

Now Word is a single word that immediately follows the left angle bracket, but may be followed by more text which may include spaces. A tag ends with a right angle bracket, which might or might not be preceded by a space. Assume that a tag starts and ends on the same line. With this definition, the following are tags:

<p>       </p>     <em >     </em >
<table border cellpadding=5>     <word another word>     </x y z>

Start by setting a boolean flag to false. Now look at the input stream of tokens one by one. When a token starts a tag, set the flag to true. While the flag is true, discard tokens until you encounter a tag end (either stuff> or >), discard that token too, and set the flag back to false. Note that a single token such as <p> both starts and ends a tag.
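The flag technique described above might look like the following sketch. The names FlagTagFilter, startsTag(), and filterLine() are hypothetical. The sketch assumes that a tag start is a token beginning with < or </ followed immediately by a letter or digit, that each discarded tag is replaced by one space, and that the flag is reset for every line because a tag starts and ends on the same line.

```java
import java.util.StringTokenizer;

class FlagTagFilter {
    // True if the token begins with < or </ followed immediately by a letter or digit.
    static boolean startsTag(String token) {
        int i = token.startsWith("</") ? 2 : 1;
        return token.startsWith("<")
            && token.length() > i
            && Character.isLetterOrDigit(token.charAt(i));
    }

    // Copies the line, replacing everything from a tag start through the
    // token that ends with > by a single space.
    static String filterLine(String line) {
        StringBuilder out = new StringBuilder();
        boolean inTag = false; // the flag: true while tokens are being discarded
        StringTokenizer st = new StringTokenizer(line, " \t", true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (!inTag && startsTag(token)) {
                inTag = true;
            }
            if (inTag) {
                // Discard the token; a token ending in > closes the tag.
                if (token.endsWith(">")) {
                    inTag = false;
                    out.append(' ');
                }
            } else {
                out.append(token);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(filterLine("See <table border cellpadding=5> the data </table >"));
    }
}
```

A token such as <p> sets the flag and clears it again in the same pass through the loop, so one-token tags need no special case.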

File IO Project

First Part:

Letters of the alphabet occur in text at different frequencies. Write a program that confirms this phenomenon. Your program will be invoked from the command line like this:

  C:\mydir> java freqCount avonlea.txt avonlea.rept -all
  

It will then read through the first text file on the command line (in this case "avonlea.txt") accumulating the counts for each letter. When it reaches the end of the file, it will write a report (in this case "avonlea.rept") that displays the total number of alphabetic characters "a-zA-Z" and, for each character, the number of times it occurred and the relative frequency with which it occurred. In counting characters, regard lower case "a-z" and upper case "A-Z" characters as identical.

You will need an array of 26 long integers, one per character. To increment the count for a particular character, you will have to convert it into an index in the range 0..25. Do this by first determining which range the character belongs in, "a-z" or "A-Z", and then subtracting 'a' or 'A' from it, as appropriate. For an upper case character:

       int inx = (int)ch - (int)'A' ;
       count[inx]++ ;
  

Discard characters not in either range without increasing any count.
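Putting the range test, the index calculation, and the discard rule together gives something like the sketch below. The class name LetterCount and the methods countLetters() and percent() are hypothetical, and the sketch counts the characters of a string rather than a file so that it stays short. Relative frequency is computed here as a percentage of the total number of letters.

```java
class LetterCount {
    // Returns 26 counts, index 0 for A/a through index 25 for Z/z.
    static long[] countLetters(String text) {
        long[] count = new long[26];
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch >= 'a' && ch <= 'z') {
                count[ch - 'a']++;          // lower case range
            } else if (ch >= 'A' && ch <= 'Z') {
                count[ch - 'A']++;          // upper case range
            }
            // Any other character is discarded without increasing a count.
        }
        return count;
    }

    // Relative frequency as a percentage; 0 when there are no letters at all.
    static double percent(long n, long total) {
        return total == 0 ? 0.0 : 100.0 * n / total;
    }

    public static void main(String[] args) {
        long[] count = countLetters("Letters of the alphabet occur in text at different frequencies.");
        long total = 0;
        for (long c : count) total += c;
        System.out.println("Total alphabetical characters: " + total);
        for (int i = 0; i < 26; i++) {
            System.out.println((char) ('A' + i) + ": " + count[i] + "  " + percent(count[i], total) + "%");
        }
    }
}
```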

Second Part:

Do the relative frequencies of the initial letters of words differ from the relative frequencies for all letters in a text? Add logic to the program so that it examines only the first character in each word. Allow the user to choose between the two options with a switch on the command line:

  C:\mydir>java freqCount avonlea.txt avonlea.rept -first
  

For this option it will be convenient to use the Java class StringTokenizer to deliver individual words one at a time. In the string of delimiters passed to StringTokenizer include whitespace and all punctuation that might be at the start or end of a word. This is not quite good enough for an accurate count because some words will be split between lines:

  It is often true that handling the an-
  noying details makes up the large maj-
  ority of the statements in a pro-
  gram.
  

So, if the last token in a line (returned by StringTokenizer) ends with '-', don't include the first letter of the first token on the next line in the count.
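The hyphen rule could be implemented roughly as in the sketch below. The names FirstLetterCount, countFirstLetters(), and tally() are hypothetical, and the delimiter string is an assumption. Two further choices are also assumptions: the hyphen is left out of the delimiters so that a trailing '-' stays attached to its token, and a blank line cancels a pending skip.

```java
import java.util.StringTokenizer;

class FirstLetterCount {
    // Whitespace plus punctuation assumed to surround words. The hyphen is
    // not a delimiter, and the apostrophe is left out so contractions stay whole.
    private static final String DELIMS = " \t\r\n.,;:!?\"()[]{}";

    // Counts the first letter of each word, one array slot per letter.
    static long[] countFirstLetters(String[] lines) {
        long[] count = new long[26];
        boolean skipFirst = false; // true when the previous line ended with a split word
        for (String line : lines) {
            StringTokenizer st = new StringTokenizer(line, DELIMS);
            boolean first = true;
            String last = null;
            while (st.hasMoreTokens()) {
                String word = st.nextToken();
                if (!(first && skipFirst)) {
                    tally(count, word.charAt(0));
                }
                first = false;
                last = word;
            }
            // The next line starts with the rest of a word if this one ended with '-'.
            skipFirst = last != null && last.endsWith("-");
        }
        return count;
    }

    // Increments the count for a letter; other characters are discarded.
    static void tally(long[] count, char ch) {
        if (ch >= 'a' && ch <= 'z') count[ch - 'a']++;
        else if (ch >= 'A' && ch <= 'Z') count[ch - 'A']++;
    }

    public static void main(String[] args) {
        String[] lines = { "handling the an-", "noying details" };
        long[] count = countFirstLetters(lines);
        for (int i = 0; i < 26; i++) {
            if (count[i] > 0) System.out.println((char) ('A' + i) + ": " + count[i]);
        }
    }
}
```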

Testing:

For testing, create some really simple files that demonstrate that your program is working. For instance:

  AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
  AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
  aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
  aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
  !*$#)#%$#) @##$%))!__ !#4241-432 !_#*%_@*(* !@%#*#.,?+
  

and:

  AAAAA AAAAA-
  BBBBB BBBBB-
  CCCCC CCCCC-

  DDDDD-DDDDD
  EEEEE-EEEEE
  

The first draft of your program will write its count to the monitor for easy debugging. Add text file output later. It is probably wise to write the first part of the program and debug it before moving on to the second option.

Download a text file of a novel of at least 400K bytes from Project Gutenberg. Use a file that does not use HTML formatting tags (which would confuse the count). Delete the text at the beginning of the file that is not part of the novel (the legalese and documentation). Run both options of the program on the text.

Example:

Here is a sample run of my program with the text "Anne of Avonlea" from Project Gutenberg.

  C:\mydir>java freqCount avonlea.txt avonlea.rept -all

  C:\mydir>type avonlea.rept
  Total alphabetical characters:  373267

   A:       31840       8.53%
   B:        5942       1.59%
   C:        7627       2.04%
   D:       17541       4.69%
   E:       45614      12.22%
   F:        7191       1.92%
   G:        7960       2.13%
   H:       22500       6.02%
   I:       25095       6.72%
   J:         733       0.19%
   K:        3443       0.92%
   L:       17534       4.69%
   M:        9324       2.49%
   N:       26516       7.1%
   O:       27344       7.32%
   P:        6083       1.62%
   Q:         275       0.07%
   R:       21285       5.7%
   S:       23398       6.26%
   T:       32579       8.72%
   U:       10720       2.87%
   V:        4201       1.12%
   W:        9063       2.42%
   X:         546       0.14%
   Y:        8745       2.34%
   Z:         168       0.04%
  

End of Exercises.
