Archive for September, 2013

Console IO

2013/09/24

Some devices don’t have a console, for example a variable speed pump, and others have very minimal ones, for example your microwave oven. A console is typically connected to a keyboard of sorts. In general, consoles are very platform dependent in how they are implemented and how one deals with them. The class java.io.Console is Java’s way of abstracting this concept to do IO to the console in a platform independent way. System.in and System.out will also work, although Console has a nice method for reading and not echoing passwords.

  • public final class Console extends Object implements Flushable

If a virtual machine has a console, it will be represented by a unique object of type Console, and a reference to this object can be obtained by invoking the method System.console(). If no console is available, this method will return null. There is no constructor for Console.

  • Console console = System.console()

Console itself has a number of IO methods

  • void flush() //console.flush() flushes console and forces buffered output to be written immediately
  • Reader reader() //console.reader() returns the unique Reader associated with this console
  • PrintWriter writer() //console.writer() returns the unique PrintWriter associated with this console
  • String readLine(String fmt, Object... args) //console.readLine(…) writes a formatted prompt to console, then reads a single line of text; console.readLine() does the same without the prompt
  • char[] readPassword(String fmt, Object... args) //console.readPassword(…) writes a formatted prompt and reads a passphrase without echoing it; console.readPassword() does the same without the prompt
  • Console format(String fmt, Object... args) //console.format(…) writes a formatted string to the console’s output stream and returns the console itself; printf is a synonym

Methods readLine, readPassword, format, and printf can throw IllegalFormatException if the format has illegal syntax or is incompatible with its arguments. There is no close() method. If close() is issued on console.reader() or console.writer() then it does not close the underlying console stream. Similarly, when the end of console input is signaled, for example by ^D on Unix or ^Z on Windows, the read methods return null but the input stream of console is not closed, and subsequent characters may be read or written.

The reader() method is useful when one wants to use Scanner on the console via

  • Scanner scanner = new Scanner(console.reader())

Of course, one could just as easily use

  • String line = console.readLine(…)

and then use

  • String[] buf = line.split(regex)

or set up a Scanner on the String line via

  • Scanner scanner = new Scanner(line)
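
As a rough sketch of the two options above, the following compares split with a Scanner on the same line. The class name LineParsingSketch, its method names, and the whitespace regex are my own choices for illustration; any other regex would work the same way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class LineParsingSketch {
    // Split on runs of whitespace. Trim first, because split() would
    // otherwise return an empty first token for a line with leading blanks.
    public static String[] splitTokens(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Collect the same tokens with a Scanner set up on the String.
    // Scanner's default delimiter is whitespace.
    public static List<String> scanTokens(String line) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(line)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "  copy  a.txt b.txt ";
        System.out.println(String.join("|", splitTokens(line)));
        System.out.println(String.join("|", scanTokens(line)));
    }
}
```

The Scanner version is handier when tokens need converting, since it also offers nextInt(), nextDouble(), and so on.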

One other detail: System.console() returns null from within an IDE such as Eclipse. A workaround by abstracting console IO is described by McDowell.
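
To make the idea of abstracting console IO concrete, here is a minimal sketch of one possible wrapper. It is my own illustration and not necessarily McDowell’s approach; the class name TextTerminal and its methods are hypothetical. It uses the Console when one exists and falls back to System.in and System.out otherwise.

```java
import java.io.BufferedReader;
import java.io.Console;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical wrapper: line-oriented IO that works with or without a Console.
public class TextTerminal {
    private final BufferedReader in;
    private final PrintWriter out;

    public TextTerminal(Reader in, Writer out) {
        this.in = new BufferedReader(in);
        this.out = new PrintWriter(out, true);
    }

    // Prefer the real console; otherwise fall back to the standard streams.
    public static TextTerminal open() {
        Console console = System.console();
        if (console != null) {
            return new TextTerminal(console.reader(), console.writer());
        }
        return new TextTerminal(new InputStreamReader(System.in),
                                new PrintWriter(System.out, true));
    }

    // Writes a formatted prompt, then reads one line (null at end of input).
    public String readLine(String fmt, Object... args) {
        out.printf(fmt, args);
        out.flush();
        try {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public void printf(String fmt, Object... args) {
        out.printf(fmt, args);
        out.flush();
    }

    public static void main(String[] args) {
        TextTerminal terminal = TextTerminal.open();
        String name = terminal.readLine("Name: ");
        terminal.printf("Hello, %s%n", name);
    }
}
```

Note that this sketch does not cover readPassword: the fallback streams cannot suppress echo, so password entry still needs a real Console.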

The typical startup code using Console looks like:

import java.io.*;
import java.util.Arrays;
public class ConsoleTest {
  public static void main(String[] args) {
    Console console = System.console();
    if (console != null) {
      boolean check = true;
      while (check) {
        String user = console.readLine("Username:  ");
        char[] pswd = console.readPassword("Password:  ");
        if (user == null || pswd == null) return;  //end of input, give up
        check = !checkPassword(user, pswd);
        Arrays.fill(pswd, ' ');
        //overwrite the array so that pswd doesn't stay around very long
      }  //end while
    }  //end if
    //now continue. Either no console exists
    // or username/password is verified
    //more code here...
  }  //end main

  //hypothetical placeholder so the example compiles;
  //choose any verification method...
  static boolean checkPassword(String user, char[] pswd) {
    return pswd.length > 0;
  }  //end checkPassword
}  //end ConsoleTest

This code will work in an IDE, since the username/password test will be skipped.

Java PrintStream and PrintWriter

2013/09/19

The purpose of these two classes is to write formatted data to byte and character output streams using familiar format, printf, print, and println methods.

  • public class PrintStream extends FilterOutputStream implements Appendable, Closeable, Flushable
  • public class PrintWriter extends Writer implements Appendable, AutoCloseable, Closeable, Flushable

To another output stream, PrintStream adds the ability to print representations of various data values. Its methods never throw an IOException, but rather set a flag that can be tested via the checkError() method. Optionally, a PrintStream can be constructed to flush automatically whenever a byte array is written, a println method is invoked, or a newline character is written. All characters printed by a PrintStream method are converted into bytes using the default character set for the platform. To write characters rather than bytes, use PrintWriter.

PrintWriter prints formatted representations of objects to a text-output stream. It implements all print methods of PrintStream, but does not contain methods for writing raw bytes. Its methods never throw an IOException, and a client should use checkError() to discover errors. I have no idea why PrintWriter isn’t a subclass of FilterWriter, which would be nicely symmetric with PrintStream being a subclass of FilterOutputStream.

Constructors:

  • public PrintStream(OutputStream out, boolean autoFlush)
  • public PrintStream(String fileName) autoFlush assumed false
  • public PrintWriter(OutputStream out, boolean autoFlush)
  • public PrintWriter(String fileName) autoFlush assumed false
  • public PrintWriter(Writer out, boolean autoFlush)

For each type T, there are new methods

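  • void print(T t) prints String.valueOf(t), i.e., t.toString() for a non-null object
  • void println(T t) prints the same text followed by EOL
  • printf(String format, Object... args) prints the list of args of various types T according to the format, and returns the stream or writer itself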

In addition, the following methods are overridden so that they also don’t throw an IOException.

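  • void write(int c) to write a character c (PrintWriter) or a byte (PrintStream)
  • void write(char[ ] buf, int off, int len) for PrintWriter; PrintStream has the equivalent taking byte[ ] buf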
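
Here is a small sketch of both points: formatted output through a PrintWriter, and error detection through checkError() instead of exceptions. The class PrintWriterSketch and the always-failing FailingWriter are hypothetical, made up only to simulate a broken destination.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Locale;

public class PrintWriterSketch {
    // Hypothetical Writer whose writes always fail, to simulate an IO error.
    static class FailingWriter extends Writer {
        @Override
        public void write(char[] buf, int off, int len) throws IOException {
            throw new IOException("simulated failure");
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Formats one row of text into a String via PrintWriter.printf.
    public static String formatRow(String name, int count, double ratio) {
        StringWriter sink = new StringWriter();
        PrintWriter pw = new PrintWriter(sink, true);
        // Locale.ROOT keeps the decimal point independent of the platform.
        pw.printf(Locale.ROOT, "%-6s%4d%7.2f", name, count, ratio);
        pw.flush();
        return sink.toString();
    }

    // println does not throw even though the underlying write fails;
    // the failure is only visible through checkError().
    public static boolean failsQuietly() {
        PrintWriter pw = new PrintWriter(new FailingWriter());
        pw.println("lost");
        return pw.checkError();
    }

    public static void main(String[] args) {
        System.out.println(formatRow("pump", 3, 0.5));
        System.out.println("error flag set: " + failsQuietly());
    }
}
```

In real code it is worth calling checkError() at least once before discarding the writer, since otherwise a failed write goes unnoticed.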

Java Buffered Stream IO

2013/09/18

Doing IO one byte or one character at a time is almost always extremely inefficient, since the IO call usually forces the OS to make a disk or network access. It is better to buffer such access and work from/to the buffer. There are four Java classes for this:

  • public class BufferedInputStream extends FilterInputStream
  • public class BufferedOutputStream extends FilterOutputStream
  • public class BufferedReader extends Reader
  • public class BufferedWriter extends Writer

Now I don’t have a clue as to why the byte operations BufferedInputStream and BufferedOutputStream are subclasses of the filtered IO streams while BufferedReader and BufferedWriter are not subclasses of FilterReader and FilterWriter. This does give some credence to my earlier observation that the filtered classes are more for clarity of exposition than for technical necessity.

All the constructors have versions to change the default buffer size.  I’d recommend doing some performance testing with various buffer sizes and your typical data streams to get the best performance. For disk IO use a multiple of 4K = 4*1024 = 4096. Here are the constructors:

  • public BufferedInputStream(InputStream in)
  • public BufferedInputStream(InputStream in, int size)
  • public BufferedOutputStream(OutputStream out)
  • public BufferedOutputStream(OutputStream out, int size)
  • public BufferedReader(Reader in)
  • public BufferedReader(Reader in, int size)
  • public BufferedWriter(Writer out)
  • public BufferedWriter(Writer out, int size)

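None of the above constructors throws a checked exception; the versions with a size parameter throw IllegalArgumentException if size is not positive. A FileNotFoundException or IOException can instead come from constructing the underlying file stream, for example a FileInputStream or FileWriter, that is passed in.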

As before, the first constructor parameter is an abstract class; hence you need to use a subclass which you can instantiate with a file or file name, for example,

  • BufferedWriter bw = new BufferedWriter(new FileWriter("c:\\mydata\\test.txt", append), 2*4096);

Since FileWriter is a subclass of Writer all is good. Note the optional boolean append for appending rather than overwriting an existing file. The class BufferedWriter inherits all the methods of Writer. In particular, bw.close() will close bw which includes the (File)Writer on which it is based.

These classes manage the buffer, for reads filling it as necessary, and for writes emptying it as necessary. If you want, you can influence the buffer yourself, for example by calling flush() on a buffered output stream or writer to force its contents out. Most of the time, however, one can ignore the buffer operation and read() or write() a byte or character at a time, or one can use array versions of read and write:

  • public int read(byte[] b, int off, int len) which reads into the array b at most len bytes starting at position off. This read returns the number of bytes read or -1 for an end of stream.
  • public void write(byte[] b, int off, int len) which writes len bytes from array b starting at offset off.
  • public int read(char[] c, int off, int len) which reads len characters into array c starting at offset off. It returns the number of characters read, or -1 for an end of stream.
  • public void write(char[] c, int off, int len) writes len characters from array c starting at offset off.

Each of the above methods can raise the IOException.

Note that using the arrays b or c may be LESS EFFICIENT than doing IO byte or character at a time, because the latter fills or depletes the IO buffer directly. These methods are useful if your program is already manipulating such arrays. Note that Java IO is intelligent enough not to fill its buffer needlessly and can, in such cases, do IO directly to/from the array provided. In these cases, using the arrays is MORE EFFICIENT than doing byte or character at a time IO. Example cases of the latter typically occur when len is larger than the buffer size.
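
For reference, the usual shape of an array-based read loop is sketched below. The class ArrayReadSketch and the 4096 buffer size are my own choices; the point to notice is that read may return fewer than len bytes, so only the first n bytes of the chunk are valid on each pass.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ArrayReadSketch {
    // Drains the source through a BufferedInputStream using the array
    // version of read, and returns everything that was read.
    public static byte[] readAll(InputStream source, int chunkSize) {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] chunk = new byte[chunkSize];
        try (InputStream in = new BufferedInputStream(source, 4096)) {
            int n;
            // n is the number of bytes actually read, or -1 at end of stream.
            while ((n = in.read(chunk, 0, chunk.length)) != -1) {
                collected.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = new byte[10000];
        byte[] copy = readAll(new ByteArrayInputStream(data), 1000);
        System.out.println(copy.length);
    }
}
```

Which chunk size performs best depends on the data source, so this is something to measure rather than assume.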

The classes BufferedReader and BufferedWriter have special facilities for handling line by line character IO. A “line” is defined by a sequence of characters terminated by a line feed (‘\n’), a carriage return (‘\r’), or a carriage return followed immediately by a line feed (“\r\n”). Windows uses “\r\n”, Unix variants (including modern macOS) use ‘\n’, and the classic Mac OS used ‘\r’.

  • public String readLine() returns a line, not including its termination characters, or null if end of stream.
  • public void newLine() writes a system specific line terminator as defined above.

Each of the above methods can raise an IOException.
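
A short sketch of these two methods, using in-memory StringReader and StringWriter so that it runs without any files. The class and method names are hypothetical. It shows readLine accepting all three terminators and newLine writing the platform’s own.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class LineIoSketch {
    // Reads every line; the terminators themselves are not returned.
    public static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Writes each line followed by the platform line terminator.
    public static String writeLines(List<String> lines) {
        StringWriter sink = new StringWriter();
        try (BufferedWriter bw = new BufferedWriter(sink)) {
            for (String line : lines) {
                bw.write(line);
                bw.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        // Mixed terminators: \n, \r\n, and \r all end a line.
        System.out.println(readLines("one\ntwo\r\nthree\rfour"));
    }
}
```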

Java Filter Streams

2013/09/16

Unix popularized small programs that read from an input stream, processed it a bit, and wrote the result to an output stream. This was usually done with pipes, where A | B | C had B reading from the output of A and writing to the input of C, with the obvious process overhead. Java provides something different, and perhaps less flexible, with filter stream classes.

  • public class FilterInputStream extends InputStream
  • public class FilterOutputStream extends OutputStream
  • public abstract class FilterReader extends Reader
  • public abstract class FilterWriter extends Writer

whose constructors are, respectively,

  • protected FilterInputStream(InputStream in)
  • public FilterOutputStream(OutputStream out)
  • protected FilterReader(Reader in)
  • protected FilterWriter(Writer out)

Note that the constructor parameters to FilterInputStream, FilterOutputStream, FilterReader, and FilterWriter are the abstract classes. This is a bit of a pain; it forces you to use subclasses of these parameters in order to construct a subclass of FilterInputStream, FilterOutputStream, FilterReader or FilterWriter. We’ll see this in the next example set.

All the methods of these classes simply override the corresponding methods of the parent classes by delegating to in.read(), out.write(), etc. To use them, one subclasses one of them and once again overrides the relevant methods. Other than providing sort of a template for such overrides, one could just as easily subclass InputStream, OutputStream, Reader, and Writer; however, using the filtered streams indicates intention and thus makes program code more understandable. If your intention is to filter, then it probably isn’t to change low level details of any read() or write() method.

Examples: filtering out successive blanks in a file stream

import java.io.*;
public class FilterMultipleBlanks extends FilterReader {
  //accept any Reader, for example a FileReader
  public FilterMultipleBlanks(Reader in) {
    super(in);
  } // end constructor
  char blank = ' ';
  boolean lastCharBlank = false;
  //only the single character read() is filtered here;
  //read(char[], int, int) is inherited unchanged
  @Override
  public int read() throws IOException {
    int chr = super.read();
    if (lastCharBlank) {
      //skip the rest of a run of blanks; -1 (end of stream) also ends the loop
      while (chr == blank) chr = super.read();
      lastCharBlank = false;
      return chr;
    } //end if lastCharBlank
    if (chr == blank) lastCharBlank = true;
    return chr;
  } // end read
} //end filter stream class

And next a test program for FilterMultipleBlanks:

import java.io.*;
public class FilterMultipleBlanksTest {
  public static void main (String[ ] args) throws IOException {
    FileReader in = new FileReader("test.txt");
    FilterMultipleBlanks fmb = new FilterMultipleBlanks(in);
    FileWriter out = new FileWriter("testresults.txt");
    int chr = fmb.read();
    while (chr != -1) {
      out.write(chr);
      chr = fmb.read();
      }; //end while
    fmb.close();
    out.close();
    }; //end main
  }; //end class

Note that FileReader in is passed to the constructor for FilterMultipleBlanks. Since FileReader is a subclass of Reader, this is ok.

The logic which skips over repeated blanks could be replaced by logic for logging, counting, or any other input filtering.
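
As an illustration of the counting variant, here is a minimal sketch of a FilterReader that passes characters through unchanged and counts them. The class name CountingReader and the helper countChars are hypothetical. Unlike the blank filter above, it overrides both the single character and the array read, so the count stays correct whichever one the caller uses.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class CountingReader extends FilterReader {
    private long count = 0;

    public CountingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c != -1) {
            count++;            // one character delivered
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;         // n characters delivered
        }
        return n;
    }

    public long getCount() {
        return count;
    }

    // Convenience helper: drains a String through the filter, mixing both
    // read styles, and returns the number of characters seen.
    public static long countChars(String text) {
        try (CountingReader r = new CountingReader(new StringReader(text))) {
            char[] buf = new char[4];
            r.read();
            while (r.read(buf, 0, buf.length) != -1) {
                // nothing to do; the filter does the counting
            }
            return r.getCount();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countChars("hello world"));
    }
}
```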

Java Stream IO

2013/09/15

When I first began my current programming efforts, I looked up on my book shelf for books on Java. I only found “Head First Java” 2nd Edition, which covers Java 5. I can’t remember when or why I bought it. It is a very friendly, chatty, 688 page book on elementary Java. Even with all those pages, its information on IO is still a little thin, but the web has many resources, not the least of which is the official documentation.

First, there are two sets of IO APIs obtained from

  • import java.io.*;
  • import java.nio.*;

where “nio” stands for “New Input Output”. You’ve got to learn both. The first set of APIs is about processing streams of data. It is blocking IO in that a thread that issues a read() or a write() is blocked until data are available to read or until the write completes. The java.nio APIs are buffer oriented and enable a thread to read from one or more “channels”, get what, if any, data are available, and move on. This allows a thread to look for input from one or more sources without blocking.

The next few posts will discuss stream IO, and subsequent posts will discuss channel IO.

A stream is just a sequence of data of some type, the most basic of which are byte and character streams. Unlike in the language C, these are subtly different in Java, and we’ll carefully go over the differences. Use byte IO for binary data such as images, multimedia, executables, zip files, etc. Use character IO for, well, characters, i.e., for text files and streams. Remember, Java characters are 16 bits, and of course bytes are 8 bits.

Byte stream IO classes are descended from abstract classes InputStream and OutputStream. Their key methods are

  • abstract int read()
  • abstract void write(int b)

which respectively read/write a byte from/to the input/output stream.

For read(), the byte read is returned as an int in the range 0 to 255, unless the end of the input stream has been reached, in which case -1 is returned. The method read() blocks until input data are available, the end of the stream is detected, or an IOException is thrown.

For write(b), the byte written is the eight low order bits of the argument b; the high order bits are ignored. This method can also throw an IOException.
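
The ranges described above are easy to check with in-memory streams. The following sketch (the class name ByteRangeSketch is my own) writes an int with write(b) and reads it back with read().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

public class ByteRangeSketch {
    // Writes value as a single byte, then reads that byte back.
    public static int roundTrip(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value);                        // only the low 8 bits are kept
        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        return in.read();                        // comes back as 0 to 255
    }

    // Reading from an exhausted stream returns -1.
    public static int readFromEmpty() {
        return new ByteArrayInputStream(new byte[0]).read();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(0x141));    // 0x141 & 0xFF is 0x41, i.e. 65
        System.out.println(roundTrip(-1));       // low 8 bits are all ones: 255
        System.out.println(readFromEmpty());     // -1 for end of stream
    }
}
```

This is also why read() returns an int rather than a byte: the value -1 has to be distinguishable from the byte 255.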

Both classes have other methods, the most important of which is

  • void close()

which closes the input/output stream and releases its system resources. Always call close() when done with the stream.

Byte stream file IO is done with

  • public class FileInputStream extends InputStream
  • public class FileOutputStream extends OutputStream

These have constructors

  • FileInputStream(String name)
  • FileOutputStream(String name, boolean append)

where name is the path name of the file to read/write from/to and append determines whether to append-to or recreate an existing output file. There are variant constructor forms that use the File or the FileDescriptor types. The String and File forms throw FileNotFoundException if the file cannot be opened, and any form may throw SecurityException if a security manager denies access.

Example from the official documentation:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class CopyBytes {
    public static void main(String[] args) throws IOException {

        FileInputStream in = null;
        FileOutputStream out = null;

        try {
            in = new FileInputStream("xanadu.txt");
            out = new FileOutputStream("outagain.txt", false);
            int c;

            while ((c = in.read()) != -1) {
                out.write(c);
            }
        } finally {
            if (in != null) {
                in.close();
            }
            if (out != null) {
                out.close();
            }
        }
    }
}

In the above example, note how main is defined with “throws IOException”.
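
Since Java 7 the same copy can be written with try-with-resources, which closes both streams automatically, even when an exception is thrown. This is my own variant of the documentation example, not part of it; the class name CopyBytesSketch and the byte counter are additions for illustration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class CopyBytesSketch {
    // Copies the file byte by byte and returns the number of bytes copied.
    public static long copy(String from, String to) throws IOException {
        long copied = 0;
        // Both streams are closed automatically, in reverse order of opening.
        try (FileInputStream in = new FileInputStream(from);
             FileOutputStream out = new FileOutputStream(to, false)) {
            int c;
            while ((c = in.read()) != -1) {
                out.write(c);
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            System.out.println(copy(args[0], args[1]) + " bytes copied");
        }
    }
}
```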

Character stream IO classes are descended from abstract classes Reader and Writer which have methods

  • int read()
  • void write(int c)
  • abstract void close()

The method c = read() returns a single 16 bit character as the lower 16 bits of int c or -1 if the end of the stream is detected. The method write(c) writes the lower 16 bits of c ignoring the high 16 bits. Both methods can throw an IOException. Both also have array based methods for read and write respectively, which I won’t discuss. Both require close() when finished with the character stream.

Next we have:

  • public class InputStreamReader extends Reader
  • public class OutputStreamWriter extends Writer

For the input of characters, we need a bridge from byte streams to character streams that reads bytes and decodes them into characters using the default or a specified charset. The charset used may be specified by name, given explicitly, or given implicitly by the platform’s default charset. Each invocation of one of InputStreamReader’s read() methods may cause one or more bytes to be read from the underlying byte input stream. To enable the efficient conversion of bytes into characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read() operation. We’ll consider later wrapping an InputStreamReader within a BufferedReader for greater efficiency.

For the output of characters, we need a bridge from character streams to byte streams. Characters are encoded into bytes using a specified charset as above. Each invocation of write() encodes the given characters into bytes, which are accumulated in a buffer before being written to the underlying output stream. The characters passed to the write() methods are not buffered, but we’ll see later how to do this buffering by wrapping an OutputStreamWriter within a BufferedWriter.

The class OutputStreamWriter replaces malformed surrogate pairs with the charset’s default substitution sequence. Use the CharsetEncoder class should more control be needed.

The file IO versions of Reader and Writer are:

  • public class FileReader extends InputStreamReader
  • public class FileWriter extends OutputStreamWriter

FileReader and FileWriter are “convenience classes” that assume the default character encodings and buffer sizes are appropriate. To change these, one must build the equivalent objects oneself by constructing an InputStreamReader on a FileInputStream or an OutputStreamWriter on a FileOutputStream. This may be necessary for internationalization. The convenience classes use FileInputStream and FileOutputStream internally anyway; hence little extra code is needed.
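
A minimal sketch of choosing the charset explicitly is given below. To keep it runnable without files it wraps in-memory byte streams; for file IO, a FileInputStream or FileOutputStream would take the place of the ByteArray streams. The class name CharsetBridgeSketch and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetBridgeSketch {
    // Characters to bytes through an OutputStreamWriter with a given charset.
    public static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Bytes to characters through an InputStreamReader with a given charset.
    public static String decode(byte[] data, Charset charset) {
        StringBuilder text = new StringBuilder();
        try (Reader reader =
                 new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String word = "caf\u00e9";   // ends in e with an acute accent
        System.out.println(encode(word, StandardCharsets.UTF_8).length);
        System.out.println(encode(word, StandardCharsets.ISO_8859_1).length);
    }
}
```

The same four characters take five bytes in UTF-8 and four in ISO-8859-1, which is exactly why the charset has to match on both ends.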

Constructors:

  • FileReader(String name)
  • FileWriter(String name, boolean append)

There are File and FileDescriptor versions of these constructors as well.

Example from the official documentation:

import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class CopyCharacters {
    public static void main(String[] args) throws IOException {

        FileReader inputStream = null;
        FileWriter outputStream = null;

        try {
            inputStream = new FileReader("xanadu.txt");
            outputStream = new FileWriter("characteroutput.txt", false);

            int c;
            while ((c = inputStream.read()) != -1) {
                outputStream.write(c);
            }
        } finally {
            if (inputStream != null) {
                inputStream.close();
            }
            if (outputStream != null) {
                outputStream.close();
            }
        }
    }
}

Again note how main is defined with “throws IOException”. The key differences between these two examples are the use of the lower 8 bits of the output parameter for byte IO and the use of the lower 16 bits for character IO. Also for characters, the byte to/from character conversions are important.

My next posts will consider the already hinted at IO buffering and its parent, Filter Streams. These have to precede the posts on line oriented IO.