
Chapter 11:

Operations on Files

11.0 Introduction

In a computer, a variety of data and program code is stored on hard disks
in the form of files. You have seen a paper file, where several pages containing records of
a similar category of information are held together inside a cover
that has a filename marked on it.

In a school, classwise student files are maintained, where information about each
student is recorded and all those records are placed together within a file-cover with a name
on it. For different classes, different student files are normally maintained. Again, all such
files can be stored together in the rack of a file-cabinet (like a folder in a computer). To
retrieve any information about a particular student of a specified class, the concerned class
file (identified by the filename) can be taken out and the records within that file can be
searched to get the details of the student. If the file records are sorted by
roll-number, then the roll-number of the particular student can act as a search key.

A computer works in the same fashion. Instead of paper files, a computer
stores electronic files on a floppy disk or a hard disk. A disk file is therefore nothing but
a collection of records. Each record can contain details about a student. The records can be
typed through a keyboard and saved as a file on a hard disk.

In chapter-8 we discussed the basic input/output operations
supported by the System class available within the java.lang package. In this chapter we
will first review that material and then learn how to create
application-specific files and how to read or update the information stored in such files by writing
java programs.

11.0.1 Use of Buffers for I/O Operations

A computer user makes use of a keyboard to enter data. The keyboard
controlling program first stores those data in a buffer (a temporary memory location to hold
those data) before transferring them as input to the processing program.

A computer program may use two buffers – one for input operations and another for
output operations. Input can be thought of as a character-type data stream that comes from a
keyboard and is placed in a temporary storage location called a buffer. Similarly, to read
some data from a stored disk file, that file is opened first, and the
desired data portion is then read by calling the read () method, which puts the retrieved data in a
buffer from where further processing or data transfer can take place. The transferred or
processed data is then placed in an output buffer connected to some output device like a
VDU-screen or printer.

Keyboard and VDU-screen are the most commonly used input/output
devices for computer interactions, so data streams from those devices are regarded as
Standard IO Streams. When data comes from a (source) disk file and moves to a
(destination) disk file, those data streams are regarded as File Streams.

11.0.2 Java IO-Streams

Java uses the java.io package to handle both standard I/O and file streams.
There are two types of io-stream classes – Byte stream and Character stream. Byte
oriented I/O operations are needed to handle binary files, which are processed 8 bits at a time. Character
oriented I/O operations are used to handle text files, for which java uses 16-bit Unicode characters.

The Byte Stream Classes can again be divided into –

• Input Stream Classes
• Output Stream Classes.

Similarly, Character Stream Classes can also be divided into –

• Reader Classes
• Writer Classes.

To understand Java’s I/O system, the Stream concept must be made clear to you.

11.1 Streams – byte & character stream

A stream is an abstraction (a conceptual entity) which helps in processing
input/output data or information coming from a file/device or going to a file/device. Input
devices put data into input streams and output devices accept data from
output streams. Therefore, appropriate I/O streams are to be attached to physical
devices like a keyboard, VDU, printer or a disk file.

Java’s I/O system makes use of I/O class methods that remain encapsulated in the
java.io package. Therefore, for any I/O operation with files/devices, you have to begin your
program with the statement

import java.io.*;

11.1.1 The Byte Stream Classes

Byte streams are defined with two abstract classes – InputStream and OutputStream.
Many useful concrete subclasses are derived from these two abstract classes:

BufferedInputStream (for buffered byte input),
BufferedOutputStream (for buffered byte output),
DataInputStream (for input of java’s allowable data types),
DataOutputStream (for output of java’s allowable data types),
ByteArrayInputStream (for byte array input),
ByteArrayOutputStream (for byte array output),
FileInputStream (to read byte stream from a file),
FileOutputStream (to write byte stream to a file),
PipedInputStream (for Pipe input),
PipedOutputStream (for Pipe output),
PrintStream (that contains print () and println () methods), etc.

The above list shows only a few examples of concrete subclasses of the byte stream
classes.

11.1.2 The Character Stream Classes

As mentioned before, Character streams are also defined by two abstract classes –
Reader and Writer. Examples of some of their most useful concrete
subclasses are given below –

BufferedReader (for buffered input Character stream),
BufferedWriter (for buffered output Character stream),
FileReader (to read from a file),
FileWriter (to write to a file),
InputStreamReader (input stream that translates bytes to characters),
OutputStreamWriter (output stream that translates characters to bytes),
PrintWriter (for character oriented print () and println() operations),
LineNumberReader (for line number count), etc.
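The difference between the two stream families can be seen by reading the same data both ways. The following is only a minimal sketch: the class name ByteVsCharDemo, its two helper methods and the choice of UTF-8 encoding are assumptions made for illustration. It reads a two-character string once through a byte stream and once through a Reader.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class ByteVsCharDemo {

    // Counts how many values a byte stream delivers before end of stream (-1).
    static int countBytes(byte[] data) throws IOException {
        InputStream in = new ByteArrayInputStream(data);
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    // Counts how many characters a Reader delivers for the same bytes.
    static int countChars(byte[] data) throws IOException {
        // InputStreamReader translates the UTF-8 bytes into characters.
        Reader in = new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // In UTF-8 the letter A needs one byte and the euro sign needs three.
        byte[] data = "A\u20AC".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes read      : " + countBytes(data));
        System.out.println("characters read : " + countChars(data));
    }
}
```

The byte stream reports four values while the Reader reports two characters, which is why text files are better handled with the Reader and Writer classes.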

11.1.3 The Predefined Streams

All Java programs automatically include the java.lang package, which
contains a class called System. The System class contains three predefined stream variables –
in, out, and err – for the standard input device, standard output device and standard error device
respectively. Any programmer can use those stream variables in his/her java program(s).

System.out refers to the standard output stream for the VDU-screen and
System.in refers to the standard input stream from the keyboard. System.err refers to the
standard error stream, which is again connected with the VDU-screen. For this reason any error
message automatically appears on the VDU-screen of your computer. One more point –
these three standard streams are automatically opened for use by the System, but any other
Input/Output devices are to be explicitly opened for use by the programmer.

11.1.4 Review of Console Input-Output operations

The PC that you are using has a keyboard, which is taken as the standard input device
for feeding data and instructions. The input part of the console is the keyboard and the
output part of the console is the VDU-screen (monitor). Using a byte stream to read
console input is quite possible but normally not preferred. The current versions of java
prefer to handle console IO using character-oriented streams.

Java does not have a generalized console input method (like scanf () of C language)
but takes input by reading from the System.in stream. To capture a character-oriented
stream from console input, System.in is to be wrapped in a BufferedReader object.
System.in refers to an object of the InputStream type, which delivers bytes, whereas a
BufferedReader needs characters. An InputStreamReader, which converts bytes into
characters, is therefore placed between the two. To logically connect a BufferedReader with
the console keyboard, we have to make use of two java statements as shown below –

InputStreamReader isr = new InputStreamReader (System.in);
BufferedReader conin = new BufferedReader (isr);

The above two statements can be written as a single statement as shown below –

BufferedReader conin = new BufferedReader (new InputStreamReader (System.in));

With the press of a key on a keyboard, a key-code gets generated by its
hardware. To convert the incoming bytes into understandable character codes, an
InputStreamReader object is used. The character thus keyed in is placed into the
conin (console-in) BufferedReader for the purpose of reading.

To read a character from a BufferedReader, the read () method is to be used. The read ()
method reads a character from the input stream and returns an integer value
corresponding to that character. It returns (–1) when the end of the stream is reached.

The version of read () that will be used here has the form –

int read() throws IOException

Please note that java uses its exception handling mechanism to capture
errors, and the IOException class used here is defined in the java.io package. Let us now
examine a few examples of console input/output operations.
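Before turning to the keyboard, the behaviour of read () can be tried on a fixed piece of text. The sketch below is illustrative only: the class ReadLoopDemo and the method readUntilQ are hypothetical names, and a StringReader stands in for the keyboard so that the run is repeatable.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadLoopDemo {

    // Reads characters one at a time until 'q' is seen or the stream ends.
    static String readUntilQ(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        StringBuilder seen = new StringBuilder();
        int code;   // kept as int so that -1 can be told apart from any character
        while ((code = in.read()) != -1) {
            char ch = (char) code;
            if (ch == 'q') {
                break;
            }
            seen.append(ch);
        }
        return seen.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stops at the letter q.
        System.out.println(readUntilQ(new StringReader("abcqxyz")));
        // Contains no q, so the loop ends when read () returns -1.
        System.out.println(readUntilQ(new StringReader("no terminator")));
    }
}
```

Keeping the return value in an int until it has been compared with -1 matters, because a cast to char would hide the end-of-stream marker.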

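Example 11.1 Reading Characters from Keyboard

// Refer to example-8.1
import java.io.*;   // needed for BufferedReader, InputStreamReader and IOException

public class BRchin
{
    public static void main(String[] args) throws IOException
    {
        int code;        // read () returns an int so that -1 (end of stream) can be detected
        char ch = ' ';
        BufferedReader conin = new BufferedReader (new InputStreamReader (System.in));

        System.out.println ("Enter characters and end with 'q':");

        do {
            code = conin.read();
            if (code == -1) break;      // input ended before 'q' was typed
            ch = (char) code;
            System.out.print (ch + " ");
        } while (ch != 'q');
        System.out.println (" \n Input operation stops as you typed 'q' to quit.");
    }
}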

Run this program and see what output you get by typing characters on the keyboard
[represented by the conin object of the BufferedReader class].

To read a string from the keyboard, the readLine () method of the BufferedReader
class is to be used. String input ends when you press the ‘return’ or ‘enter’ key of the
keyboard. The program shown in example-11.2 [refer to example-8.2] will allow you to
enter several text lines, and the data-entry process ends only when you type ‘stop’. Just
remember that the readLine () method returns a line only when you press the return-key.

Example 11.2 Reading Lines of Strings from Keyboard

import java.io.*;
public class BRLin {
    public static void main(String[] args) throws IOException
    {
        BufferedReader conin = new BufferedReader (new InputStreamReader (System.in));
        String str1 = " start";
        System.out.println ("Enter several text lines & end with return");
        System.out.println (" Type 'stop' to end further entry.");
        do {
            str1 = conin.readLine();    // returns null if the input ends
        } while (str1 != null && !str1.equals("stop"));
        System.out.println (str1);
    }
}

Run this program to see what happens when you type stop.
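The same readLine () loop can also keep the lines it reads. The following sketch makes a few assumptions for illustration: the class ReadLinesDemo and the method readUntilStop are hypothetical names, and the input comes from a StringReader rather than the keyboard.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class ReadLinesDemo {

    // Collects lines until the word "stop" is read or the input ends.
    static List<String> readUntilStop(BufferedReader in) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        // readLine () returns null at end of input, so that case is tested first
        while ((line = in.readLine()) != null && !line.equals("stop")) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
                new StringReader("first line\nsecond line\nstop\nignored"));
        System.out.println(readUntilStop(in));
    }
}
```

Replacing the StringReader with new InputStreamReader (System.in) gives the keyboard version.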

11.1.5 Console output with write () method

You have already seen console output using the print () and println () methods.
Those methods are defined in the PrintStream class, of which System.out is an
object. Although System.out deals with a byte stream, its use for simple programs is still
acceptable. PrintStream also provides the low-level write ()
method. The program below uses the write () method to output the character ‘G’ followed by a
new line (\n). The write () method uses the byte-value (lower 8 bits only of the 16-bit
Unicode) of the character G.

Example 11.3 Demo of Console output using write () method

public class ConWrite
{
    public static void main(String[] args)
    {
        int b;
        b = 'G';
        System.out.write (b);       // writes only the low-order 8 bits of b
        System.out.write ('\n');
    }
}

It will be much easier to write the above program using print () or println () methods.
Please try that as an exercise.

11.1.6 Console output using PrintWriter Class

Although use of System.out for console output operations is permissible, for real-life
programs use of a PrintWriter stream is recommended. PrintWriter supports both
the print () and println () methods. The most popular PrintWriter constructor is –

PrintWriter (OutputStream <OSname>, boolean flushOnNewline)

When flushOnNewline is true, the output buffer is flushed every time println () is called.

Example 11.4 Console output using methods of PrintWriter Class

//[Example-8.5 has been repeated here for review]
// In this example <OSname> is System.out and the PrintWriter object is named vdu.
import java.io.*;   // needed for PrintWriter
public class PrintWrite
{
    public static void main(String[] args)
    {
        PrintWriter vdu = new PrintWriter (System.out, true);
        vdu.println(" This shows the use of PrintWriter for write on console .");
        int i = 27;
        vdu.println(" Printing an integer value : " + i);
        double d = 41.3241;
        vdu.println(" Printing a double value : " + d);
        char ch = 'A';
        vdu.println(" Printing a Character : " + ch);
        String str = " Computer";
        vdu.println(" Printing a String : " + str);
    }
}

Output of this program has been shown in Picture 8.4.

11.2 Files & Operations on files

* A File can be regarded as a collection of records.
* A record can be regarded as a collection of data-items or fields.
* Each data-item should be conformable to some permissible data type.
* Data-items of dissimilar data types can be stored together in a record.

For example, a student record can have a structure (field name + data type) like –

Student-name   Address    Parent-Name   Age     Marks   Department
(String)       (String)   (String)      (int)   (int)   (char)

A document or a text file can be regarded as a collection of sentences of varying word
lengths.

We have already discussed the importance of files. Now we will see how reading
from and writing to a file can be done using java’s input/output facilities provided by the
java.io package.

Computer files may be of different varieties --a plain text file, a binary file (like an
executable file) or a data file (collection of structured data-types called records). In java, all
such files are byte-oriented but it allows a programmer to wrap a byte-oriented file stream
within a character-oriented object.

FileInputStream and FileOutputStream classes create byte streams linked to files.
The most common constructors of these classes are of the form –

FileInputStream (String fileName) throws FileNotFoundException
FileOutputStream (String fileName) throws FileNotFoundException

Here fileName is the name of the file opened for IO operations. If the input file is not
found, FileNotFoundException (a subclass of IOException) is thrown. Similarly, when the
output file cannot be created, FileNotFoundException is also thrown.

After opening and performing IO operations on an external file (i.e. a file stored
in secondary storage like a floppy disk, hard disk, etc.), that file has to be explicitly closed
by calling the close () method.
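In newer versions of java (Java 7 onwards) the call to close () can be left to the language by using a try-with-resources statement. This facility is not used in the examples of this chapter, so the following is only a sketch under that assumption. The class AutoCloseDemo and its helper methods are hypothetical names, and a temporary file is used so that no drive letter has to be assumed.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class AutoCloseDemo {

    // Writes the given bytes to a file; the stream is closed when the try block ends.
    static void writeBytes(File target, byte[] data) throws IOException {
        try (FileOutputStream fout = new FileOutputStream(target)) {
            fout.write(data);
        }
    }

    // Counts the bytes in a file, again relying on automatic closing.
    static int countBytes(File source) throws IOException {
        int count = 0;
        try (FileInputStream fin = new FileInputStream(source)) {
            while (fin.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        File temp = File.createTempFile("autoclose", ".txt");
        temp.deleteOnExit();
        writeBytes(temp, new byte[] {72, 101, 108, 108, 111});   // the five letters of "Hello"
        System.out.println("Bytes in file: " + countBytes(temp));
    }
}
```

The streams are closed even when an exception interrupts the try block, which is easy to forget when close () is called by hand.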

Example 11.5 Reading a stored text file and Display it on a screen

// Create first a text file, which is to be read and displayed by this program

import java.io.*;
public class TypeFile
{public static void main(String args[]) //file name passed as argument
throws IOException
{int i;
FileInputStream fin;
try {
fin = new FileInputStream(args[0]);
} catch (FileNotFoundException e )
{System.out.println (" File not found:");
return;
}catch (ArrayIndexOutOfBoundsException e)
{System.out.println (" Type File filename to show");
return;
}
//read the file characters until EOF (-1) is reached
do {
//To read from a file, the read () method of FileInputStream is used.
i = fin.read();
if (i != -1) System.out.print((char) i);
} while (i != -1);
fin.close(); // file closed
}
}

To run this program, you first have to create a file with some filename in any drive under any
directory. In this program the file name has been passed as an argument – args[0] – whose
value you have to supply as a String while running it [Picture 11.1(a)]. However, you could
directly specify the file-name with a specific drive and directory in the following fashion –

fin = new FileInputStream("d:\\mytestfile.txt");

when the file mytestfile.txt is stored in the root directory of drive d:
Another point you have to note is that two backslashes (\\) are necessary instead of one (\) to
inform the java compiler that \ is not to be taken as the start of an escape sequence (like \t, \n, etc.) but
as a plain \ that separates a directory or sub-directory.

Picture 11.1 (a) File name passed as argument

Picture 11.1(b) Contents of the file displayed

11.2.1 Exception Handling using Try/Catch blocks to trap errors

In example-11.5 you have observed the addition of the words ‘throws IOException’
to the main () declaration and thereafter the use of try {....} and catch (....) {....} blocks.
These try/catch blocks are used to catch any error(s) that occur during IO operations and
to handle them with user supplied code. The try block contains the normal processing code.
The catch block contains the code related to exception handling.

There are many exception classes, like IOException (defined in the java.io package) and
NumberFormatException, ArithmeticException, RuntimeException, etc. (defined in the
java.lang package).

The keyword throws is used in a method declaration to state that the method may pass an
exception of the named type on to its caller instead of handling it. Java treats exceptions as
objects, which are passed to an exception handler written by the programmer to enable
recovery or to report the nature of errors. The catch (...) {...} block is responsible for
handling exceptions.

The parent IOException has several child exceptions like FileNotFoundException,
EOFException, etc. Exception handling is a vast subject and further discussion of
it has been kept beyond the scope of this book. Let us now pay attention to a few more
IO examples.
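The parent and child relationship affects the order in which catch blocks are written: the child exception has to be caught before its parent. A small sketch follows. The class CatchOrderDemo, the method describe and the file name are hypothetical, and the try-with-resources form (Java 7 onwards) is used only to keep the listing short.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class CatchOrderDemo {

    // Tries to read the first byte of a file and reports what happened.
    static String describe(String fileName) {
        try (FileInputStream fin = new FileInputStream(fileName)) {
            int first = fin.read();
            return first == -1 ? "empty file" : "first byte = " + first;
        } catch (FileNotFoundException e) {   // the child exception is caught first
            return "file not found";
        } catch (IOException e) {             // any other IO failure
            return "IO error";
        }
    }

    public static void main(String[] args) {
        // The file name below is made up and is expected not to exist.
        System.out.println(describe("no-such-file-for-this-demo.txt"));
    }
}
```

If the two catch blocks were swapped, the compiler would reject the program, because the FileNotFoundException block could never be reached.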

Example- 11.6 Creating a Text File by Typing on a Key Board

// input – 4 records typed on the console keyboard; output goes to a disk file

import java.io.*;
public class KBtoFileIO
{
public static void main(String[] args) throws IOException
{
int k, size = 4;

String fileName = ("c:\\kbinput.txt");


BufferedReader KBin = new BufferedReader (new InputStreamReader
(System.in));

try {
BufferedWriter bufwr = new BufferedWriter (new FileWriter
(fileName));

PrintWriter fileOut = new PrintWriter (bufwr);


System.out.println (" Enter records through your key board.");

k =0;
do{
String record = KBin.readLine ();
fileOut.println (record);
k = k+1;} while (k != size);

System.out.println ("Completed records Entry -- Check the File now.");


fileOut.close();
}
catch (IOException e){
System.err.println (" Error in IO operation.");
}
}
}

Run this program to see how records entered through a keyboard get stored as a file named
kbinput.txt (a notepad file) in the C: drive at root [Picture 11.2 (a) & (b)].

Picture 11.2 (a) Text entered during program execution

Picture 11.2 (b) Content of the file kbinput.txt

Example 11.6 can be regarded as an entry-recording program (after increasing the value of
size as per requirements) that stores keyboard transactions in the form of a disk file. A
stored file can be copied to a new location with a new name, if required. Example-11.7 is an
example of such a file copy program.

For copying we make use of the write () method instead of the print () method:
write () transfers each byte unchanged, whereas print () converts values into their text
representation. Let us now examine the java code of such a
copy program. In the program, the source and destination file-names have been specified
directly within the code. Of course those names could have been collected interactively
using appropriate keyboard entry operations. As an exercise, you can try that.

Example-11.7 File Copy with a Different Name at a new Location

import java.io.*;   // needed for the file stream classes and IOException

public class CopyFile
{
    public static void main(String[] args)
    throws IOException
    {
        int i;
        FileInputStream fin;
        FileOutputStream fout;
        // try to open input file
        try {
            fin = new FileInputStream("d:\\mytestfile.txt");
        } catch (FileNotFoundException e){
            System.out.println ("Source File not found");
            return;
        }
        // try to open output file
        try {
            fout = new FileOutputStream("c:\\copy.txt");
        } catch (FileNotFoundException e) {
            System.out.println (" Error Opening Destination File");
            fin.close();    // release the input file that is already open
            return;
        }
        catch(ArrayIndexOutOfBoundsException e) {
            // can occur only if the file names are taken from args[] instead
            System.out.println ("CopyFile Sourcefile Destinationfile");
            fin.close();
            return;
        }
        // try to perform copy operation
        try {
            do {
                i = fin.read();
                if (i != -1) fout.write(i);
            } while (i != -1);
            System.out.println ("Copy operation successful.");
        } catch (IOException e) {
            System.out.println ("copy error");
        }
        fin.close();
        fout.close();
    }
}

Try also to copy a different file from one drive to another drive, keeping the same
name. Of course, don’t forget to make changes in the program wherever necessary.

In this program three try blocks are used – one for opening the input file for
reading, another for opening the output file for writing and the third one to perform the
copy operation. The first try has one catch block, the second try has two catch blocks and
the third try has only one catch block. Try to understand why it is so.

Remember, if multiple exceptions can arise from a try block, then one catch block will be
necessary for each exception that is to be handled. That is why, for the second try block, two
catch blocks – one for FileNotFoundException and another for
ArrayIndexOutOfBoundsException – have been used in this program. The second of these
becomes relevant only when the file names are read from the args[] array, as in example-11.5.

Also remember that all opened files must be closed before ending a program
involved in file IO operations.
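Copying one byte at a time is easy to follow but slow for large files. A common variation reads and writes a block of bytes on each pass. The sketch below is one possible form and not a replacement for example-11.7: the class BlockCopyDemo, the method copy and the block size of 4096 bytes are assumptions, and temporary files are used instead of fixed drive letters.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class BlockCopyDemo {

    // Copies source to target in blocks and returns the number of bytes copied.
    static long copy(File source, File target) throws IOException {
        byte[] block = new byte[4096];      // block size is an arbitrary choice
        long total = 0;
        try (FileInputStream fin = new FileInputStream(source);
             FileOutputStream fout = new FileOutputStream(target)) {
            int n;
            // read (byte[]) returns how many bytes were placed in the block, or -1 at EOF
            while ((n = fin.read(block)) != -1) {
                fout.write(block, 0, n);    // write only the bytes actually read
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        File source = File.createTempFile("copy-src", ".txt");
        File target = File.createTempFile("copy-dst", ".txt");
        source.deleteOnExit();
        target.deleteOnExit();
        try (FileOutputStream fout = new FileOutputStream(source)) {
            fout.write(new byte[10000]);    // ten thousand zero bytes as sample data
        }
        System.out.println("Bytes copied: " + copy(source, target));
    }
}
```

The third argument of write () is important: the last block is usually only partly filled, and writing the whole array would append stale bytes to the copy.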

11.3 The class File

The File class deals directly with files and the file system. It may not be out of place to say
a few words about the file system, which is the component that helps in managing all the files
stored in a computer system. To interact with a file, an operating system must know
some details about that file. The details are – the name of the file with the path to reach it,
date and time of creation, starting point of its physical storage location, length of the file in
KB or MB, access permissions, date of last access / modification, etc. Such details help
quick and correct retrieval of files. File objects that give access to such file-specific details can be
created from the class File.

New File objects can be initialized with any one of the following constructors: --

File (String <path>);
File (String <dir-path>, String <filename>);
File (File <dir-Obj>, String <filename>);

The File class has many defined methods like getName (), getParent (), exists (), list (), etc.
In java, a directory is also treated as a file, one that contains a list of file names and other
directories that can be examined by the list () method of the class File.
Now study example-11.8 to see the use of the class File.

Example-11.8 Demo of Use of the class File

import java.io.File; //File class included from java library


public class FileClassDemo
{
static void show(String s) {
System.out.println(s);
}
public static void main(String[] args) {

File f1 = new File ("c:\\MyFolder\\JavaHistory.txt");


show ("File name :: " + f1.getName());
show (" Path :: " + f1.getPath());
show (" Parent ::" + f1.getParent());
show (" File Size :: " + f1.length() + " in Bytes.");
show (" Directory contents ::"+ f1.list());
}
}

On my computer, a file named JavaHistory.txt was stored (Picture 11.3) in the MyFolder
directory under the root of the c: drive.

Picture 11.3 JavaHistory.txt File Contents

The outputs obtained from example-11.8: --

File name:: JavaHistory.txt
Path:: c:\MyFolder\JavaHistory.txt
Parent:: c:\MyFolder
File Size:: 350 in Bytes.
Directory contents:: null

The directory contents are shown as null because f1 refers to an ordinary file and not to a
directory.

Try to run the program using a different file already stored in your computer.
This example gives you an idea of how the class File methods can be used to know the details
about a stored file.
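The difference between a File object for an ordinary file and one for a directory can be tried out as follows. This is only a sketch: the class FileInfoDemo and the method kind are hypothetical names, and the system's temporary directory is used so that no particular folder has to exist.

```java
import java.io.File;
import java.io.IOException;

public class FileInfoDemo {

    // Describes whether a path is a directory, an ordinary file, or missing.
    static String kind(File f) {
        if (!f.exists()) {
            return "missing";
        }
        return f.isDirectory() ? "directory" : "file";
    }

    public static void main(String[] args) throws IOException {
        File temp = File.createTempFile("fileinfo", ".txt");
        temp.deleteOnExit();
        File dir = temp.getParentFile();    // the directory that holds the temporary file

        System.out.println(temp.getName() + " is a " + kind(temp));
        System.out.println(dir.getPath() + " is a " + kind(dir));

        // list () gives null for an ordinary file and an array of names for a directory
        System.out.println("list () on the file is null : " + (temp.list() == null));
        String[] names = dir.list();
        System.out.println("entries in the directory    : " + (names == null ? 0 : names.length));
    }
}
```

The names and the number of entries printed depend on the computer on which the program is run.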

11.4 Tokens & String Tokenizer

Tokens

In a sentence of any language, several words (like nouns, verbs, adjectives, etc.) are
placed side by side with blank spaces in between them. Again, to help reading and
easy understanding, commas or semi-colons are used within a sentence, which always ends
with a full stop. These blank or white spaces, punctuation symbols, etc. help in identifying
individual words distinctly and properly.

In a programming language, words can be regarded as “tokens” and they remain
separated by delimiters like white space and symbols such as +, -, *, %, ^, etc.
Therefore, an expression of Java or any other
programming language (equivalent to a sentence of a natural language) can be scanned
and analyzed to separate out the tokens used therein. This process of dividing an
expression string into a set of tokens is known as tokenizing and is the first step of Parsing.

The StringTokenizer class provides necessary support to the parsing process. That
means, given a programming statement, you can separate out all individual tokens
contained in it using the StringTokenizer class methods.

Delimiters are special characters that separate tokens. The default set of delimiters
consists of white space characters – space, tab, newline, carriage return and form feed. The
constructors used to initialize StringTokenizer objects are as follows: --

StringTokenizer (String <string_to_be_tokenized>); // using default delimiters
StringTokenizer (String <string_to_be_tokenized>, String <delimiters>); // new set of delimiters
StringTokenizer (String <string_to_be_tokenized>, String <delimiters>, boolean delimAsToken); // when delimiters are also to be returned

11.5 String Tokenizer Class Methods

The StringTokenizer class has methods like –

int countTokens();
boolean hasMoreElements();
boolean hasMoreTokens();
Object nextElement();
String nextToken(); etc.

Once a StringTokenizer object gets created, the nextToken () method is used to
extract consecutive tokens. The hasMoreTokens() method returns true so long as there are
more tokens to be extracted. Let us now examine example-11.9 that uses the methods of
StringTokenizer class.

Example-11.9 Use of the StringTokenizer class

// A typical java statement is parsed here to extract tokens

import java.util.StringTokenizer;
public class StringTokenDemo
{
static String exp = " BitSet bits = new BitSet(16);";
public static void main(String[] args) {
StringTokenizer st = new StringTokenizer (exp, " = ; ");
while (st.hasMoreTokens()) {
String val = st.nextToken();
System.out.println (val);
}
}
}

By running this program, you will see the output as shown below: --

BitSet
bits
new
BitSet(16)

All tokens used in the expression “exp” have been scanned.
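The three-argument constructor can be tried in the same way. In the sketch below the class DelimTokenDemo, the method split and the sample expression are assumptions chosen for illustration. It tokenizes one expression twice, first discarding the delimiters and then returning them as tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class DelimTokenDemo {

    // Splits text on the given delimiters, optionally returning the delimiters too.
    static List<String> split(String text, String delims, boolean delimAsToken) {
        StringTokenizer st = new StringTokenizer(text, delims, delimAsToken);
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String exp = "total=price*qty+tax";
        System.out.println(split(exp, "=*+", false));   // operands only
        System.out.println(split(exp, "=*+", true));    // operands and operators
    }
}
```

The first call prints [total, price, qty, tax] and the second prints [total, =, price, *, qty, +, tax], so the operators are kept only when delimAsToken is true.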

11.6 Stream Tokenizer

It is often required not only to identify the tokens but also to classify them according
to their types while scanning an input stream. StreamTokenizer can be used not only
for identifying token types but also for detecting the end of file (EOF) and end of lines
(EOL) and for counting the total number of tokens. The StreamTokenizer
class defines the following instance variables, constants and methods ---

public double nval
public String sval
public static final int TT_EOF
public static final int TT_EOL
public static final int TT_NUMBER
public static final int TT_WORD
public int ttype

public StreamTokenizer (Reader <inStream>)
public int nextToken() throws IOException

Once a StreamTokenizer object is created, one can go on using the nextToken() method to
read tokens from the input stream. The token’s type can be tested against integer constants like
TT_EOF, TT_EOL, TT_NUMBER, and TT_WORD. The nval variable holds numeric
values and sval holds string values. The ttype is a public integer indicating the
type of the token that has just been read by the nextToken () method. If the token is a number,
ttype equals TT_NUMBER; in the same way ttype can be TT_EOL, TT_WORD, etc.
Studying example-11.10 at this stage can make the concept clearer.

Example-11.10 Use of StreamTokenizer class for Type detection

import java.io.*;
public class StreamTokenDemo
{
    public static void main(String[] args) throws IOException
    {
        FileReader infile = new FileReader ("c:\\copy.txt");
        StreamTokenizer inpstream = new StreamTokenizer (infile);
        int toktype = 0;
        int noOftok = 0;
        do {
            toktype = inpstream.nextToken();
            outTType(toktype, inpstream);
            noOftok++;
        } while (toktype != StreamTokenizer.TT_EOF);
        System.out.println (" Number of tokens = " + noOftok);
        infile.close();     // file closed
    }

    static void outTType(int ttype, StreamTokenizer inpstream)
    {
        switch (ttype) {
        case StreamTokenizer.TT_EOL:
            System.out.println("TT_EOL");
            break;
        case StreamTokenizer.TT_WORD:
            System.out.println ("TT_WORD sval = " + inpstream.sval);
            break;
        case StreamTokenizer.TT_NUMBER:
            System.out.println("TT_NUMBER nval = " + inpstream.nval);
            break;
        case StreamTokenizer.TT_EOF:
            System.out.println("TT_EOF");
            break;
        default:    // any other single character
            System.out.println("TT_unknown nval = " + inpstream.nval + " sval = " + inpstream.sval);
            break;
        }
    }
}

If you run this program, the following output will appear on the terminal window --

TT_WORD sval = This
TT_WORD sval = is
TT_WORD sval = a
TT_WORD sval = test
TT_WORD sval = file
TT_WORD sval = written
TT_WORD sval = by
TT_WORD sval = A.M.Ghosh
TT_WORD sval = to
TT_WORD sval = test
TT_WORD sval = java
TT_WORD sval = fileIO
TT_WORD sval = program.
TT_WORD sval = It
TT_WORD sval = is
TT_WORD sval = stored
TT_WORD sval = in
TT_WORD sval = drive
TT_WORD sval = d
TT_unknown nval = 0.0 sval = null
TT_unknown nval = 0.0 sval = null
TT_WORD sval = jdk1.3
TT_EOF
Number of tokens = 23

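Please remember one important point: once a word has begun with a letter, StreamTokenizer
lets digits, dots and minus signs continue it, and the word ends only at whitespace or at some
other ordinary character. For this reason, you find
“A.M.Ghosh” and “program.” identified as TT_WORDs in this program, while ordinary
characters (here, presumably the ones that follow the drive letter d) are reported separately
as TT_unknown.

Program translators such as compilers make use of tokenizers of this kind during compilation. No further
discussion on this advanced topic will be made in this introductory book.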
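The use of nval and sval together can be tried without a disk file. The following sketch is illustrative only: the class NumberWordDemo, its two methods and the sample text are assumptions, and a StringReader replaces the FileReader of example-11.10.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class NumberWordDemo {

    // Adds up every numeric token in the text and ignores everything else.
    static double sumNumbers(String text) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        double sum = 0.0;
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_NUMBER) {
                sum += st.nval;     // nval holds the value of a number token
            }
        }
        return sum;
    }

    // Counts the word tokens in the text.
    static int countWords(String text) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        int words = 0;
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                words++;            // sval would hold the word itself
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        String text = "apples 3 pears 4.5 plums 2";
        System.out.println("Sum of numbers  = " + sumNumbers(text));
        System.out.println("Number of words = " + countWords(text));
    }
}
```

For the sample text the three numbers add up to 9.5 and three words are counted.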

11.7 Conclusions

This chapter has discussed input-output operations in general and file
operations in particular. Java being an object oriented programming language, all IO
operations are performed by using different methods of java’s IO classes. These methods
handle input and output streams, which may be of either byte or character type. Modern
versions of java prefer character streams, although converter classes that convert
byte streams to character streams are very often used.

The IO streams are properly buffered to take care of speed mismatches between IO
devices and the processor, which controls the entire program execution. However, a
programmer need not know these inner details. She or he
should only know which methods the IO classes support, and when and how to
use them. A good number of examples have been given here for study and use
by beginners.

The class File has also been discussed here. Finally, the concept of
tokens, and how to identify and extract them from a source code file or a text file using
the StringTokenizer and StreamTokenizer classes, has also been explained and demonstrated.
