Java Buffered Stream IO

Doing IO one byte or one character at a time is almost always extremely inefficient, since the IO call usually forces the OS to make a disk or network access. It is better to buffer such access and work from/to the buffer. There are four Java classes for this:

  • public class BufferedInputStream extends FilterInputStream
  • public class BufferedOutputStream extends FilterOutputStream
  • public class BufferedReader extends Reader
  • public class BufferedWriter extends Writer

Now I don’t have a clue as to why the byte operations BufferedInputStream and BufferedOutputStream are subclasses of the filtered IO streams while BufferedReader and BufferedWriter are not subclasses of FilterReader and FilterWriter. This does give some credence to my earlier observation that the filtered classes are more for clarity of exposition than for technical necessity.

Each class has a second constructor that sets the buffer size (in bytes for the streams, in characters for the reader and writer) instead of using the default. I’d recommend doing some performance testing with various buffer sizes and your typical data streams to get the best performance. For disk IO use a multiple of 4K = 4*1024 = 4096. Here are the constructors:

  • public BufferedInputStream(InputStream in)
  • public BufferedInputStream(InputStream in, int size)
  • public BufferedOutputStream(OutputStream out)
  • public BufferedOutputStream(OutputStream out, int size)
  • public BufferedReader(Reader in)
  • public BufferedReader(Reader in, int size)
  • public BufferedWriter(Writer out)
  • public BufferedWriter(Writer out, int size)

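None of the above constructors declares a checked exception. The two-argument versions throw an IllegalArgumentException if size is not positive. A FileNotFoundException, when it occurs, comes from the constructor of the underlying file stream (for example FileInputStream or FileOutputStream), not from the buffered wrapper.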
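To act on the advice about testing buffer sizes, here is a rough sketch of a timing loop. The class name BufferSizeTiming, the method countBytes, the temporary file and the list of sizes are all my own choices for illustration. A single timed run like this is crude (JIT warm-up and the OS file cache both distort it), so treat the numbers as a hint rather than a benchmark.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferSizeTiming {

    // Reads the whole file one byte at a time through a buffer of the given
    // size and returns the number of bytes read.
    static long countBytes(Path file, int bufferSize) throws IOException {
        long count = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(file.toFile()), bufferSize)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical test data: a 4 MiB temporary file of zero bytes.
        Path file = Files.createTempFile("buffer-timing", ".bin");
        Files.write(file, new byte[4 * 1024 * 1024]);
        try {
            for (int size : new int[] {512, 4096, 8192, 65536}) {
                long start = System.nanoTime();
                long bytes = countBytes(file, size);
                long micros = (System.nanoTime() - start) / 1_000;
                System.out.println(size + "-byte buffer: " + bytes + " bytes in " + micros + " us");
            }
        } finally {
            Files.delete(file);
        }
    }
}
```

For a real measurement, substitute your own data file and repeat each size several times.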

As before, the first constructor parameter is an abstract class; hence you need to use a subclass which you can instantiate with a file or file name, for example,

  • BufferedWriter bw = new BufferedWriter(new FileWriter("c:\\mydata\\test.txt", append), 2*4096);

Since FileWriter is a subclass of Writer all is good. Note the optional boolean append for appending rather than overwriting an existing file. The class BufferedWriter inherits all the methods of Writer. In particular, bw.close() will close bw which includes the (File)Writer on which it is based.
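Here is a minimal sketch of that pattern using try-with-resources, so close() is called even if a write fails. The class name BufferedWriterDemo, the helper writeText and the temporary file (standing in for c:\mydata\test.txt) are assumptions of mine.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferedWriterDemo {

    // Writes text to the named file through a BufferedWriter with an
    // 8192-character buffer. If append is true the text is added to the
    // end of an existing file; otherwise the file is overwritten.
    static void writeText(String fileName, String text, boolean append) throws IOException {
        // Leaving the try block calls bw.close(), which flushes the buffer
        // and then closes the underlying FileWriter.
        try (BufferedWriter bw = new BufferedWriter(new FileWriter(fileName, append), 2 * 4096)) {
            bw.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("buffered-demo", ".txt");
        try {
            writeText(tmp.toString(), "first", false);  // overwrite
            writeText(tmp.toString(), " second", true); // append
            System.out.println(new String(Files.readAllBytes(tmp))); // prints "first second"
        } finally {
            Files.delete(tmp);
        }
    }
}
```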

These classes manage the buffer, filling it as necessary for reads and emptying it as necessary for writes. You can also intervene yourself: flush() forces buffered output to the underlying stream, and mark() and reset() let you reread buffered input. Most of the time, however, you can ignore the buffer and simply read() or write() a byte or character at a time, or use the array versions of read and write:

  • public int read(byte[] b, int off, int len) which reads at most len bytes into the array b starting at offset off. It returns the number of bytes read, which may be fewer than len, or -1 for an end of stream.
  • public void write(byte[] b, int off, int len) which writes len bytes from array b starting at offset off.
  • public int read(char[] c, int off, int len) which reads at most len characters into array c starting at offset off. It returns the number of characters read, which may be fewer than len, or -1 for an end of stream.
  • public void write(char[] c, int off, int len) which writes len characters from array c starting at offset off.

Each of the above methods can raise the IOException.
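To see what the buffer does on the output side, here is a small sketch in which two single-byte writes stay in the buffer until flush() pushes them to the underlying stream. The class name FlushDemo and the method sizesAroundFlush are my own; a ByteArrayOutputStream stands in for a file so the effect can be observed.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class FlushDemo {

    // Returns how many bytes have reached the underlying stream
    // before and after an explicit flush().
    static int[] sizesAroundFlush() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedOutputStream out = new BufferedOutputStream(sink, 4096)) {
            out.write('A');            // held in the 4096-byte buffer
            out.write('B');            // still held in the buffer
            int before = sink.size();  // nothing has reached the sink yet
            out.flush();               // empty the buffer into the sink
            int after = sink.size();   // both bytes have now arrived
            return new int[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

close() also flushes, so an explicit flush() is only needed when the data must reach its destination while the stream stays open.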

Note that for small len the array versions may gain little over doing IO a byte or character at a time, because the data is still copied through the IO buffer on its way to or from your array. These methods are mainly useful if your program is already manipulating such arrays. Java IO is, however, intelligent enough not to fill its buffer needlessly: when len is at least as large as the buffer, it can do IO directly to/from the array provided, bypassing the buffer. In these cases, using the arrays is MORE EFFICIENT than doing byte or character at a time IO.
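A typical use of the array versions is a copy loop, sketched below. The class name ArrayCopyDemo, the method copy and the 1024-byte chunk size are assumptions for illustration. Note that the loop writes only the n bytes actually read, since read may return fewer than requested.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class ArrayCopyDemo {

    // Copies everything from in to out through buffered streams, moving the
    // data in chunks with the array versions of read and write.
    // Closes both streams and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[1024];
        long total = 0;
        try (BufferedInputStream bin = new BufferedInputStream(in);
             BufferedOutputStream bout = new BufferedOutputStream(out)) {
            int n;
            // read returns the number of bytes placed in chunk, or -1 at end of stream
            while ((n = bin.read(chunk, 0, chunk.length)) != -1) {
                bout.write(chunk, 0, n); // write only the bytes actually read
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[5000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println("copied " + copied + " bytes");
    }
}
```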

The classes BufferedReader and BufferedWriter have special facilities for handling line by line character IO. A "line" is defined as a sequence of characters terminated by a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a line feed ("\r\n"). Windows uses "\r\n", Unix variants (including macOS) use '\n', and the classic Mac OS (before OS X) used '\r'.

  • public String readLine() returns a line, not including its termination characters, or null if end of stream.
  • public void newLine() writes the platform's line separator (the line.separator system property), which is one of the terminators defined above and not necessarily '\n'.

Each of the above methods can raise an IOException.
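Here is a short sketch of both methods working on in-memory text. The class name LineDemo and the helpers readLines and joinLines are hypothetical; StringReader and StringWriter stand in for file-based readers and writers.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LineDemo {

    // Splits text into lines with readLine(), which accepts
    // '\n', '\r' and "\r\n" as terminators and strips them.
    static List<String> readLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) { // null signals end of stream
                lines.add(line);
            }
        }
        return lines;
    }

    // Writes each line followed by the platform's line separator.
    static String joinLines(List<String> lines) throws IOException {
        StringWriter sw = new StringWriter();
        try (BufferedWriter bw = new BufferedWriter(sw)) {
            for (String line : lines) {
                bw.write(line);
                bw.newLine();
            }
        } // closing bw flushes its buffer into sw
        return sw.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readLines("a\nb\r\nc\rd")); // prints [a, b, c, d]
        System.out.print(joinLines(Arrays.asList("x", "y")));
    }
}
```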

