Java Stream IO

When I first began my current programming efforts, I looked on my bookshelf for books on Java. I only found “Head First Java” 2nd Edition, which covers Java 5. I can’t remember when or why I bought it. It is a very friendly, chatty, 688-page book on elementary Java. Even with all those pages, its information on IO is still a little thin, but the web has many resources, not the least of which is the official documentation.

First, there are two sets of IO APIs obtained from

  • import java.io.*;
  • import java.nio.*;

where “nio” stands for “New Input Output”. You’ve got to learn both. The first set of APIs is about processing streams of data. It is blocking IO: a thread that issues a read() or a write() is blocked until data are available to read or until the write completes. The java.nio APIs are buffer-oriented: a thread reads from one or more “channels” into buffers, and a channel in non-blocking mode returns whatever data, if any, are available so the thread can move on. This allows a thread to look for input from one or more sources without blocking.

The next few posts will discuss stream IO, and subsequent posts will discuss channel IO.

A stream is just a sequence of data of some type, the most basic of which are byte and character streams. Unlike in the language C, these are subtly different in Java, and we’ll carefully go over the differences. Use byte IO for binary data such as images, multimedia, executables, zip files, etc. Use character IO for, well, characters, i.e., for text files and streams. Remember, Java characters are 16 bits, and of course bytes are 8 bits.
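To make the size difference concrete, here is a minimal sketch. The class name and sample text are mine, and it assumes Java 7 or later for StandardCharsets. It shows that one char is not always one byte once the text is encoded.

```java
import java.nio.charset.StandardCharsets;

class ByteVersusChar {
    // Number of bytes needed to store the text when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(Byte.SIZE);         // 8 bits per byte
        System.out.println(Character.SIZE);    // 16 bits per char
        String text = "caf\u00e9";             // four chars; the last is e-acute
        System.out.println(text.length());     // 4
        System.out.println(utf8Length(text));  // 5, since e-acute takes two bytes in UTF-8
    }
}
```

This is why copying text with byte IO works only as long as nobody needs to interpret the bytes as characters along the way.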

Byte stream IO classes are descended from abstract classes InputStream and OutputStream. Their key methods are

  • abstract int read()
  • abstract void write(int b)

which respectively read/write a byte from/to the input/output stream.

The method read() returns the byte it read as an int in the range 0 to 255, or -1 if the end of the input stream has been reached. It blocks until input data are available, the end of the stream is detected, or an IOException is thrown.
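A quick way to see these return values is to read from an in-memory stream. This is only a sketch: the class name is made up, and I use ByteArrayInputStream so no file is needed.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;

class ReadReturnValues {
    // Collects every value read() returns, including the final -1.
    static List<Integer> readAll(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        List<Integer> values = new ArrayList<Integer>();
        int b;
        do {
            b = in.read();
            values.add(b);
        } while (b != -1);
        return values;
    }

    public static void main(String[] args) {
        // (byte) 0xFF is -1 as a signed Java byte, but read() reports it as 255,
        // so it cannot be confused with the end-of-stream marker.
        byte[] data = { 0, 65, (byte) 0xFF };
        System.out.println(readAll(data)); // [0, 65, 255, -1]
    }
}
```

Returning an int rather than a byte is what leaves room for -1 as a distinct end-of-stream value.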

For write(b), the byte written is the eight low order bits of the argument b; the high order bits are ignored. This method can also throw an IOException.
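The truncation to eight bits can be checked with an in-memory output stream. Again a sketch with a made-up class name:

```java
import java.io.ByteArrayOutputStream;

class WriteLowBits {
    // Writes one int through write(int) and returns the single byte that was stored.
    static byte writeOne(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value);
        return out.toByteArray()[0];
    }

    public static void main(String[] args) {
        System.out.println(writeOne(0x41));   // 65
        System.out.println(writeOne(0x141));  // 65 again: bits above the low eight are dropped
        System.out.println(writeOne(0xFF));   // -1, because Java bytes are signed
    }
}
```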

Both classes have other methods, the most important of which is

  • void close()

which closes the input/output stream and releases its system resources. Always call close() when done with the stream.
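If you are on Java 7 or later (my book’s Java 5 predates it), the try-with-resources statement calls close() for you. The sketch below uses a hypothetical TrackedStream subclass, invented purely so we can observe that close() really was called.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class AutoClose {
    // Illustration only: a stream that records whether close() was called.
    static class TrackedStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackedStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Reads one byte inside try-with-resources, then reports whether the stream was closed.
    static boolean closedAfterUse() throws IOException {
        TrackedStream tracked = new TrackedStream(new byte[] { 1, 2, 3 });
        try (InputStream in = tracked) {
            in.read();
        } // close() is called here, even if read() had thrown
        return tracked.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(closedAfterUse()); // true
    }
}
```

On older Java versions the equivalent is a try/finally block with an explicit close() call.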

Byte stream file IO is done with

  • public class FileInputStream extends InputStream
  • public class FileOutputStream extends OutputStream

These have constructors

  • FileInputStream(String name)
  • FileOutputStream(String name, boolean append)

where name is the path name of the file to read/write from/to and append determines whether to append to or recreate an existing output file. There are variant constructor forms that use the File or the FileDescriptor types. The String and File forms throw FileNotFoundException if the file cannot be opened, and any of the constructors can throw SecurityException when a security manager denies access.
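The effect of the append flag is easy to check by watching the file length. This is a sketch under my own assumptions: the class name, the temp-file prefix, and the choice of a temporary file are all mine.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

class AppendFlag {
    // Writes a single byte to the named file, either appending to it or recreating it.
    static void writeByte(String name, int b, boolean append) throws IOException {
        FileOutputStream out = new FileOutputStream(name, append);
        try {
            out.write(b);
        } finally {
            out.close();
        }
    }

    // Returns the file length after three writes: recreate, append, recreate.
    static long[] lengths() throws IOException {
        File file = File.createTempFile("append-demo", ".bin");
        file.deleteOnExit();
        long[] lengths = new long[3];
        writeByte(file.getPath(), 'a', false);
        lengths[0] = file.length();   // 1: file holds "a"
        writeByte(file.getPath(), 'b', true);
        lengths[1] = file.length();   // 2: file holds "ab"
        writeByte(file.getPath(), 'c', false);
        lengths[2] = file.length();   // 1: file was recreated and holds "c"
        return lengths;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(java.util.Arrays.toString(lengths())); // [1, 2, 1]
    }
}
```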

Example from the official documentation:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class CopyBytes {
    public static void main(String[] args) throws IOException {

        FileInputStream in = null;
        FileOutputStream out = null;

        try {
            in = new FileInputStream("xanadu.txt");
            out = new FileOutputStream("outagain.txt", false);
            int c;

            while ((c = in.read()) != -1) {
                out.write(c);
            }
        } finally {
            if (in != null) {
                in.close();
            }
            if (out != null) {
                out.close();
            }
        }
    }
}

In the above example, note how main is defined with “throws IOException”.

Character stream IO classes are descended from abstract classes Reader and Writer which have methods

  • int read()
  • void write(int c)
  • abstract void close()

The method c = read() returns a single 16-bit character in the lower 16 bits of int c (a value from 0 to 65535), or -1 if the end of the stream is detected. The method write(c) writes the lower 16 bits of c, ignoring the high 16 bits. Both methods can throw an IOException. Both classes also have array-based read and write methods, which I won’t discuss. Both require close() when finished with the character stream.
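The same read-until-minus-one loop works for characters. In this sketch I use the in-memory StringReader and CharArrayWriter classes so no file is needed; the class and method names are mine.

```java
import java.io.CharArrayWriter;
import java.io.IOException;
import java.io.StringReader;

class CharReadWrite {
    // Copies characters one at a time until read() returns -1.
    static String copy(String text) throws IOException {
        StringReader in = new StringReader(text);
        CharArrayWriter out = new CharArrayWriter();
        int c;
        while ((c = in.read()) != -1) {
            out.write(c);
        }
        return out.toString();
    }

    // Writes one int through write(int) and returns the single char that was stored.
    static char writeOne(int value) {
        CharArrayWriter out = new CharArrayWriter();
        out.write(value);
        return out.toCharArray()[0];
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copy("xanadu"));     // xanadu
        System.out.println(writeOne(0x10041));  // A: only the low 16 bits (0x0041) are kept
    }
}
```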

Next we have:

  • public class InputStreamReader extends Reader
  • public class OutputStreamWriter extends Writer

For the input of characters, we need a bridge from byte streams to character streams: one that reads bytes and decodes them into characters using a charset. The charset may be specified by name, given explicitly as a Charset or CharsetDecoder, or left implicit, in which case the platform’s default charset is used. Each invocation of one of InputStreamReader’s read() methods may cause one or more bytes to be read from the underlying byte input stream. To enable the efficient conversion of bytes into characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read() operation. We’ll consider later wrapping an InputStreamReader within a BufferedReader for greater efficiency.
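Here is a sketch of why the charset matters when decoding. The same five bytes come out as different text depending on the charset handed to InputStreamReader. The class name and sample bytes are mine, and StandardCharsets assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeBytes {
    // Decodes the bytes into a String, one character per read() call.
    static String decode(byte[] bytes, Charset charset) throws IOException {
        Reader in = new InputStreamReader(new ByteArrayInputStream(bytes), charset);
        StringBuilder text = new StringBuilder();
        try {
            int c;
            while ((c = in.read()) != -1) {
                text.append((char) c);
            }
        } finally {
            in.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "cafe" with an e-acute, encoded as UTF-8: the accent takes two bytes.
        byte[] bytes = { 0x63, 0x61, 0x66, (byte) 0xC3, (byte) 0xA9 };
        System.out.println(decode(bytes, StandardCharsets.UTF_8).length());       // 4
        System.out.println(decode(bytes, StandardCharsets.ISO_8859_1).length());  // 5
    }
}
```

With the wrong charset nothing fails; you just get two wrong characters where one was intended, which is what makes these bugs easy to miss.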

For the output of characters, we need a bridge from character streams to byte streams. Characters are encoded into bytes using a charset chosen as above. Each invocation of a write() method runs the encoder on the given characters, and the resulting bytes are accumulated in a buffer before being written to the underlying output stream. The characters passed to the write() methods are not themselves buffered, but we’ll see later how to add that buffering by wrapping an OutputStreamWriter within a BufferedWriter.
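Going the other way, a single character can become one byte or several depending on the charset. A minimal sketch, with names of my own choosing and assuming Java 7 or later for StandardCharsets:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodeChars {
    // Encodes the text one character at a time and returns the resulting bytes.
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Writer out = new OutputStreamWriter(bytes, charset);
        try {
            for (int i = 0; i < text.length(); i++) {
                out.write(text.charAt(i));
            }
        } finally {
            out.close(); // closing flushes any bytes still held in the writer's buffer
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String eAcute = "\u00e9";
        System.out.println(encode(eAcute, StandardCharsets.UTF_8).length);       // 2
        System.out.println(encode(eAcute, StandardCharsets.ISO_8859_1).length);  // 1
    }
}
```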

The class OutputStreamWriter replaces malformed surrogate pairs with the charset’s default substitution sequence. Use the CharsetEncoder class should more control be needed.
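As one example of that extra control, a CharsetEncoder can be told to report malformed input instead of silently substituting for it. This is a sketch, not the only way to do it; the class and method names are mine.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictEncoding {
    // Returns true if the text encodes cleanly to UTF-8,
    // false if the encoder reports malformed or unmappable input.
    static boolean encodesCleanly(String text) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            encoder.encode(CharBuffer.wrap(text));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(encodesCleanly("\uD83D\uDE00")); // true: a complete surrogate pair
        System.out.println(encodesCleanly("\uD83D"));       // false: a high surrogate on its own
    }
}
```

OutputStreamWriter also has a constructor that accepts a CharsetEncoder, so an encoder configured this way can be plugged straight into the stream.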

The file IO versions of Reader and Writer are:

  • public class FileReader extends InputStreamReader
  • public class FileWriter extends OutputStreamWriter

FileReader and FileWriter are “convenience classes” that assume the default character encoding and buffer size are appropriate. To specify these yourself, which may be necessary for internationalization, construct an InputStreamReader on a FileInputStream or an OutputStreamWriter on a FileOutputStream. The convenience classes do exactly this internally, so nothing is lost by building the pair by hand.
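A sketch of building the file reader and writer by hand with an explicit charset follows. The class name, method names, and the use of a temporary file are my own choices for illustration, and StandardCharsets assumes Java 7 or later.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetFile {
    // Like FileWriter, but with the charset chosen by the caller.
    static void writeText(File file, String text, Charset charset) throws IOException {
        Writer out = new OutputStreamWriter(new FileOutputStream(file), charset);
        try {
            out.write(text);
        } finally {
            out.close();
        }
    }

    // Like FileReader, but with the charset chosen by the caller.
    static String readText(File file, Charset charset) throws IOException {
        Reader in = new InputStreamReader(new FileInputStream(file), charset);
        StringBuilder text = new StringBuilder();
        try {
            int c;
            while ((c = in.read()) != -1) {
                text.append((char) c);
            }
        } finally {
            in.close();
        }
        return text.toString();
    }

    // Writes the text to a temporary file as UTF-8 and reads it back the same way.
    static String roundTrip(String text) throws IOException {
        File file = File.createTempFile("charset-demo", ".txt");
        file.deleteOnExit();
        writeText(file, text, StandardCharsets.UTF_8);
        return readText(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("caf\u00e9").equals("caf\u00e9")); // true
    }
}
```

Because both sides name the same charset, the result does not depend on the platform default.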

Constructors:

  • FileReader(String name)
  • FileWriter(String name, boolean append)

There are File and FileDescriptor versions of these constructors as well.

Example from the official documentation:

import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class CopyCharacters {
    public static void main(String[] args) throws IOException {

        FileReader inputStream = null;
        FileWriter outputStream = null;

        try {
            inputStream = new FileReader("xanadu.txt");
            outputStream = new FileWriter("characteroutput.txt", false);

            int c;
            while ((c = inputStream.read()) != -1) {
                outputStream.write(c);
            }
        } finally {
            if (inputStream != null) {
                inputStream.close();
            }
            if (outputStream != null) {
                outputStream.close();
            }
        }
    }
}

Again note how main is defined with “throws IOException”. The key differences between these two examples are the use of the lower 8 bits of the output parameter for byte IO and the use of the lower 16 bits for character IO. Also for characters, the byte to/from character conversions are important.

My next posts will cover the IO buffering already hinted at, along with its parent, the filter streams. These have to precede the posts on line-oriented IO.
