| Exploring Java:By Patrick Niemeyer & Joshua Peck source ref: ebookjava.html |
In this chapter, we'll introduce the framework of the Java language and some of its fundamental tools. I'm not going to try to provide a full language reference here. Instead, I'll lay out the basic structures of Java with special attention to how it differs from other languages. For example, we'll take a close look at arrays in Java, because they are significantly different from those in some other languages. We won't, on the other hand, spend much time explaining basic language constructs like loops and control structures. We won't talk much about Java's object-oriented features here, as that's covered in Chapter 5, Objects in Java.
As always, we'll try to provide meaningful examples to illustrate how to use Java in everyday programming tasks.
Java is a language for the Internet. Since the people of the Net speak and write in many different human languages, Java must be able to handle a number of languages as well. One of the ways in which Java supports international access is through Unicode character encoding. Unicode uses a 16-bit character encoding; it's a worldwide standard that supports the scripts (character sets) of most languages.[1]
[1] For more information about Unicode, see the following URL: http://www.unicode.org/. Ironically, one of the listed "obsolete and archaic" scripts not currently supported by the Unicode standard is Javanese--a historical language of the people of the Island of Java.
Java source code can be written using the Unicode character encoding and stored either in its full form or with ASCII-encoded Unicode character values. This makes Java a friendly language for non-English speaking programmers, as these programmers can use their native alphabet for class, method, and variable names in Java code.
The Java char type and String objects also support Unicode. But if you're concerned about having to labor with two-byte characters, you can relax. The String API makes the character encoding transparent to you. Unicode is also ASCII-friendly; the first 256 characters are identical to the first 256 characters in the ISO8859-1 (Latin-1) encoding and if you stick with these values, there's really no distinction between the two.
Most platforms can't display all currently defined Unicode characters. As a result, Java programs can be written with special Unicode escape sequences. A Unicode character can be represented with the escape sequence:
\uxxxx
xxxx is a sequence of four hexadecimal digits. The escape sequence indicates an ASCII-encoded Unicode character. This is also the form Java uses to output a Unicode character in an environment that doesn't otherwise support it.
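As a small sketch of how these escapes behave, the following class writes characters using escape sequences and checks what they turn into. The class and method names are hypothetical, chosen only for illustration. The compiler translates each escape before it looks at the rest of the source, so an escaped character and its literal form are indistinguishable at run-time.

```java
// Hypothetical example class; the names are ours, for illustration only.
class UnicodeEscapes {
    // The escape for hex value 0041 is the letter 'A'.
    static char escapedA() {
        return '\u0041';
    }

    // Hex value 00e9 is a lowercase e with an acute accent (decimal 233).
    static String cafe() {
        return "caf\u00e9";
    }

    public static void main(String[] args) {
        System.out.println(escapedA() == 'A');        // true
        System.out.println(cafe().length());          // 4: the escape is a single char
        System.out.println((int) cafe().charAt(3));   // 233
    }
}
```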
Java stores and manipulates characters and strings internally as Unicode values. Java also comes with classes to read and write Unicode-formatted character streams, as you'll see in Chapter 8, Input/Output Facilities.
Java supports both C-style block comments delimited by /* and */ and C++-style line comments indicated by //:
/* This is a
multiline
comment. */
// This is a single line comment
// and so // is this
As in C, block comments can't be nested. Single-line comments are delimited by the end of a line; extra // indicators inside a single line have no effect. Line comments are useful for short comments within methods because you can still wrap block comments around large chunks of code during development.
By convention, a block comment beginning with /** indicates a special "doc comment." A doc comment is commentary that is extracted by automated documentation generators, such as Sun's javadoc program that comes with the Java Development Kit. A doc comment is terminated by the next */, just as with a regular block comment. Leading spacing up to a * on each line is ignored; lines beginning with @ are interpreted as special tags for the documentation generator:
/**
 * I think this class is possibly the most amazing thing you will
 * ever see. Let me tell you about my own personal vision and
 * motivation in creating it.
 * <p>
 * It all began when I was a small child, growing up on the
 * streets of Idaho. Potatoes were the rage, and life was good...
 *
 * @see PotatoPeeler
 * @see PotatoMasher
 * @author John 'Spuds' Smith
 * @version 1.00, 19 Dec 1996
 */
javadoc creates HTML class documentation by reading the source code and the embedded comments. The author and version information is presented in the output and the @see tags make hypertext links to the appropriate class documentation. The compiler also looks at the doc comments; in particular, it is interested in the @deprecated tag, which means that the method has been declared obsolete and should be avoided in new programs. The compiler generates a warning message whenever it sees you use a deprecated feature in your code.
Doc comments can appear above class, method, and variable definitions, but some tags may not be applicable to all. For example, a variable declaration can contain only @see and @deprecated tags. Table 4.1 summarizes the tags used in doc comments.
| Tag | Description | Applies to |
|---|---|---|
| @see | Associated class name | Class, method, or variable |
| @author | Author name | Class |
| @version | Version string | Class |
| @param | Parameter name and description | Method |
| @return | Description of return value | Method |
| @exception | Exception name and description | Method |
| @deprecated | Declares an item obsolete | Class, method, or variable |
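To illustrate the method-level tags, here is a minimal sketch of a documented method. The PotatoPeeler class body and its peel() and peelOne() methods are invented for this example; only the tag names come from the table.

```java
// Hypothetical class, borrowing its name from the @see example; not a real API.
class PotatoPeeler {
    /**
     * Peels a batch of potatoes.
     *
     * @param count number of potatoes to peel
     * @return the number of potatoes actually peeled
     * @exception IllegalArgumentException if count is negative
     */
    static int peel(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must not be negative");
        }
        return count;
    }

    /**
     * Peels a single potato.
     *
     * @deprecated use peel(int) instead
     */
    static int peelOne() {
        return peel(1);
    }
}
```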
The type system of a programming language describes how its data elements (variables and constants) are associated with actual storage. In a statically typed language, such as C or C++, the type of a data element is a simple, unchanging attribute that often corresponds directly to some underlying hardware phenomenon, like a register value or a pointer indirection. In a more dynamic language like Smalltalk or Lisp, variables can be assigned arbitrary elements and can effectively change their type throughout their lifetime. A considerable amount of overhead goes into validating what happens in these languages at run-time. Scripting languages like Tcl and awk achieve ease of use by providing drastically simplified type systems in which only certain data elements can be stored in variables, and values are unified into a common representation such as strings.
As I described in Chapter 1, Yet Another Language?, Java combines the best features of both statically and dynamically typed languages. As in a statically typed language, every variable and programming element in Java has a type that is known at compile-time, so the interpreter doesn't normally have to check the type validity of assignments while the code is executing. Unlike C or C++ though, Java also maintains run-time information about objects and uses this to allow safe run-time polymorphism.
Java data types fall into two categories. Primitive types represent simple values that have built-in functionality in the language; they are fixed elements like literal constants and numeric expressions. Reference types (or class types) include objects and arrays; they are called reference types because they are passed "by reference" as I'll explain shortly.
Numbers, characters, and boolean values are fundamental elements in Java. Unlike some other (perhaps more pure) object-oriented languages, they are not objects. For those situations where it's desirable to treat a primitive value as an object, Java provides "wrapper" classes (see Chapter 7, Basic Utility Classes). One major advantage of treating primitive values as such is that the Java compiler can more readily optimize their usage.
Another advantage of working with the Java virtual-machine architecture is that primitive types are precisely defined. For example, you never have to worry about the size of an int on a particular platform; it's always a 32-bit, signed, two's complement number. Table 4.2 summarizes Java's primitive types.
| Type | Definition |
|---|---|
| boolean | true or false |
| char | 16-bit Unicode character |
| byte | 8-bit signed two's complement integer |
| short | 16-bit signed two's complement integer |
| int | 32-bit signed two's complement integer |
| long | 64-bit signed two's complement integer |
| float | 32-bit IEEE 754 floating-point value |
| double | 64-bit IEEE 754 floating-point value |
If you think the primitive types look like an idealization of C scalar types on a byte-oriented 32-bit machine, you're absolutely right. That's how they're supposed to look. The 16-bit characters were forced by Unicode, and generic pointers were deleted for other reasons we'll touch on later, but in general the syntax and semantics of Java primitive types are meant to fit a C programmer's mental habits. If you're like most of this book's readers, you'll probably find this saves you a lot of mental effort in learning the language.
Variables are declared inside of methods or classes in C style. For example:
int foo;
double d1, d2;
boolean isFun;
Variables can optionally be initialized with an appropriate expression when they are declared:
int foo = 42;
double d1 = 3.14, d2 = 2 * 3.14;
boolean isFun = true;
Variables that are declared as instance variables in a class are set to default values if they are not initialized. In this case, they act much like static variables in C or C++. Numeric types default to the appropriate flavor of zero, characters are set to the null character "\0," and boolean variables have the value false. Local variables declared in methods, on the other hand, must be explicitly initialized before they can be used.
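The difference between instance variables and local variables can be sketched as follows. The Defaults class and its field names are hypothetical; the point is only that the fields are never assigned, yet have well-defined values.

```java
// Hypothetical example class; field and method names are ours.
class Defaults {
    // Instance variables: never explicitly initialized, so they get default values.
    int count;
    double ratio;
    char letter;
    boolean flag;

    static int localExample() {
        int total;      // local variable: no default value
        total = 10;     // must be assigned before it is read
        return total;
    }

    public static void main(String[] args) {
        Defaults d = new Defaults();
        System.out.println(d.count);            // 0
        System.out.println(d.ratio);            // 0.0
        System.out.println(d.letter == '\0');   // true
        System.out.println(d.flag);             // false
        System.out.println(localExample());     // 10
    }
}
```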
Integer literals can be specified in octal (base 8), decimal (base 10), or hexadecimal (base 16). A decimal integer is specified by a sequence of digits beginning with one of the characters 1-9:
int i = 1230;
Octal numbers are distinguished from decimal by a leading zero:
int i = 01230; // i = 664 decimal
(An interesting, but meaningless, observation is that this would make the number 0 an octal value in the eyes of the compiler.)
As in C, a hexadecimal number is denoted by the leading characters 0x or 0X (zero "x"), followed by digits and the characters a-f or A-F, which represent the decimal values 10-15 respectively:
int i = 0xFFFF; // i = 65535 decimal
Integer literals are of type int unless they are suffixed with an L, denoting that they are to be produced as a long value:
long l = 13L;
long l = 13; // equivalent--13 is converted from type int
(The lowercase character l ("el") is also acceptable, but should be avoided because it often looks like the numeral 1).
When a numeric type is used in an assignment or an expression involving a type with a larger range, it can be promoted to the larger type. For example, in the second line of the above example, the number 13 has the default type of int, but it's promoted to type long for assignment to the long variable. Certain other numeric and comparison operations also cause this kind of arithmetic promotion. A numeric value can never be assigned to a type with a smaller range without an explicit (C-style) cast, however:
int i = 13;
byte b = i; // Compile time error--explicit cast needed
byte b = (byte) i; // Okay
Conversions from floating point to integer types always require an explicit cast because of the potential loss of precision.
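The following sketch pulls the literal forms and conversion rules together. The class and method names are hypothetical. Note what a cast actually does when the value doesn't fit: narrowing to byte keeps only the low 8 bits, and converting floating point to int drops the fractional part.

```java
// Hypothetical demonstration class; the method names are ours.
class NumericConversions {
    // Narrowing an int to a byte keeps only the low 8 bits.
    static byte narrow(int i) {
        return (byte) i;
    }

    // Converting floating point to int discards the fractional part.
    static int truncate(double d) {
        return (int) d;
    }

    public static void main(String[] args) {
        long widened = 13;                     // int literal promoted to long, no cast
        System.out.println(widened);           // 13
        System.out.println(01230);             // 664: a leading zero means octal
        System.out.println(0xFFFF);            // 65535
        System.out.println(narrow(200));       // -56: 200 doesn't fit in a byte
        System.out.println(truncate(3.99));    // 3
    }
}
```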
Floating-point values can be specified in decimal or scientific notation. Floating-point literals are of type double unless they are suffixed with an f or F, denoting that they are to be produced as a float value:
double d = 8.31;
double e = 3.00e+8;
float f = 8.31F;
float g = 3.00e+8F;
A literal character value can be specified either as a single-quoted character or as an escaped ASCII or Unicode sequence:
char a = 'a';
char newline = '\n';
char octalff = '\u00ff';
In C, you can make a new, complex data type by creating a structure. In Java (and other object-oriented languages), you instead create a class that defines a new type in the language. For instance, if we create a new class called Foo in Java, we are also implicitly creating a new type called Foo. The type of an item governs how it's used and where it's assigned. An item of type Foo can, in general, be assigned to a variable of type Foo or passed as an argument to a method that accepts a Foo value.
In an object-oriented language like Java, a type is not necessarily just a simple attribute. Reference types are related in the same way as the classes they represent. Classes exist in a hierarchy, where a subclass is a specialized kind of its parent class. The corresponding types have a similar relationship, where the type of the child class is considered a subtype of the parent class. Because child classes always extend their parents and have, at a minimum, the same functionality, an object of the child's type can be used in place of an object of the parent's type. For example, if I create a new class, Bar, that extends Foo, there is a new type Bar that is considered a subtype of Foo. Objects of type Bar can then be used anywhere an object of type Foo could be used; an object of type Bar is said to be assignable to a variable of type Foo. This is called subtype polymorphism and is one of the primary features of an object-oriented language. We'll look more closely at classes and objects in Chapter 5, Objects in Java.
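Here is a minimal sketch of that Foo and Bar relationship. The describe() method and the SubtypeDemo wrapper class are assumptions made for the example; the text above says nothing about what Foo and Bar actually contain.

```java
// Hypothetical wrapper class; Foo and Bar are given an invented describe() method.
class SubtypeDemo {
    static class Foo {
        String describe() { return "Foo"; }
    }

    static class Bar extends Foo {
        String describe() { return "Bar"; }
    }

    // Accepts a Foo, and therefore any subtype of Foo as well.
    static String describeIt(Foo f) {
        return f.describe();
    }

    public static void main(String[] args) {
        Foo f = new Bar();                            // a Bar is assignable to a Foo variable
        System.out.println(describeIt(f));            // Bar
        System.out.println(describeIt(new Foo()));    // Foo
    }
}
```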
Primitive types in Java are used and passed "by value." In other words, when a primitive value is assigned or passed as an argument to a method, it's simply copied. Reference types, on the other hand, are always accessed "by reference." A reference is simply a handle or a name for an object. What a variable of a reference type holds is a reference to an object of its type (or of a subtype). A reference is like a pointer in C or C++, except that its type is strictly enforced and the reference value itself is a primitive entity that can't be examined directly. A reference value can't be created or changed other than through assignment to an appropriate object. When references are assigned or passed to methods, they are copied by value. You can think of a reference as a pointer type that is automatically dereferenced whenever it's mentioned.
Let's run through an example. We specify a variable of type Foo, called myFoo, and assign it an appropriate object:
Foo myFoo = new Foo();
Foo anotherFoo = myFoo;
myFoo is a reference type variable that holds a reference to the newly constructed Foo object. For now, don't worry about the details of creating an object; we'll cover that in Chapter 5, Objects in Java. We designate a second Foo type variable, anotherFoo, and assign it to the same object. There are now two identical references: myFoo and anotherFoo. If we change things in the state of the Foo object itself, we will see the same effect by looking at it with either reference. The comparable code in C++ would be:
// C++
Foo& myFoo = *(new Foo());
Foo& anotherFoo = myFoo;
We can pass one of the variables to a method, as in:
myMethod( myFoo );
An important, but sometimes confusing distinction to make at this point is that the reference itself is passed by value. That is, the argument passed to the method (a local variable from the method's point of view) is actually a third copy of the reference. The method can alter the state of the Foo object itself through that reference, but it can't change the caller's reference to myFoo. That is, the method can't change the caller's myFoo to point to a different Foo object. For the times we want a method to change a reference for us, we have to pass a reference to the object that contains it, as shown in Chapter 5, Objects in Java.
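The distinction is easier to see in running code. In this sketch, Foo is given a single invented field, value, and the change() method stands in for myMethod(); both are assumptions for illustration.

```java
// Hypothetical wrapper class; Foo's value field is invented for the example.
class ReferenceDemo {
    static class Foo {
        int value;
    }

    static void change(Foo f) {
        f.value = 42;      // alters the object the caller also refers to
        f = new Foo();     // rebinds only the method's own copy of the reference
        f.value = -1;      // affects the new object, which the caller never sees
    }

    public static void main(String[] args) {
        Foo myFoo = new Foo();
        Foo anotherFoo = myFoo;
        change(myFoo);
        System.out.println(myFoo.value);            // 42
        System.out.println(anotherFoo.value);       // 42: same object
        System.out.println(anotherFoo == myFoo);    // true: the caller's reference is unchanged
    }
}
```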
Reference types always point to objects, and objects are always defined by classes. However, there are two special kinds of reference types that specify the type of object they point to in a slightly different way. Arrays in Java have a special place in the type system. They are a special kind of object automatically created to hold a number of some other type of object, known as the base type. Declaring an array-type reference implicitly creates the new class type, as you'll see in the next section.
Interfaces are a bit sneakier. An interface defines a set of methods and a corresponding type. Any object that implements all methods of the interface can be treated as an object of that type. Variables and method arguments can be declared to be of interface types, just like class types, and any object that implements the interface can be assigned to them. This allows Java to cross the lines of the class hierarchy in a type safe way, as you'll see in Chapter 5, Objects in Java.
Strings in Java are objects; they are therefore a reference type. String objects do, however, have some special help from the Java compiler that makes them look more primitive. Literal string values in Java source code are turned into String objects by the compiler. They can be used directly, passed as arguments to methods, or assigned to String type variables:
System.out.println( "Hello World..." );
String s = "I am the walrus...";
String t = "John said: \"I am the walrus...\"";
The + symbol in Java is overloaded to provide string concatenation; this is the only overloaded operator in Java:
String quote = "Four score and " + "seven years ago,";
String more = quote + " our" + " fathers" + " brought...";
Java builds a single String object from the concatenated strings and provides it as the result of the expression. We will discuss the String class in Chapter 7, Basic Utility Classes.
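A short sketch of concatenation in action follows; the ConcatDemo class is hypothetical. It also shows a detail worth knowing: when a number meets a String, the number is converted to text, and because the expression is evaluated left to right, the position of the first String operand matters.

```java
// Hypothetical example class; names are ours.
class ConcatDemo {
    static String quote() {
        String quote = "Four score and " + "seven years ago,";
        return quote + " our" + " fathers";
    }

    public static void main(String[] args) {
        System.out.println(quote());          // Four score and seven years ago, our fathers
        System.out.println("a" + 1 + 2);      // a12: each number is appended as text
        System.out.println(1 + 2 + "a");      // 3a: the integers are added first
    }
}
```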
Although the method declaration syntax of Java is quite different from that of C++, Java statement and expression syntax is very much like that of C. Again, the design intention was to make the low-level details of Java easily accessible to C programmers, so that they can concentrate on learning the parts of the language that are really different. Java statements appear inside of methods and class and variable initializers; they describe all activities of a Java program. Variable declarations and initializations like those in the previous section are statements, as are the basic language structures like conditionals and loops. Expressions are statements that produce a result that can be used as part of another statement. Method calls, object allocations, and, of course, mathematical expressions are examples of expressions.
One of the tenets of Java is to keep things simple and consistent. To that end, when there are no other constraints, evaluations and initializations in Java always occur in the order in which they appear in the code--from left to right. We'll see this rule used in the evaluation of assignment expressions, method calls, and array indexes, to name a few cases. In some other languages, the order of evaluation is more complicated or even implementation dependent. Java removes this element of danger by precisely and simply defining how the code is evaluated. This doesn't, however, mean you should start writing obscure and convoluted statements. Relying on the order of evaluation of expressions is a bad programming habit, even when it works. It produces code that is hard to read and harder to modify. Real programmers, however, are not made of stone, and you may catch me doing this once or twice when I can't resist the urge to write terse code.
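For the curious, here is a sketch of two expressions whose results depend on left-to-right evaluation. The class and method names are hypothetical, and this is exactly the kind of code you shouldn't write in practice; it's shown only to make the rule concrete.

```java
// Hypothetical example class; deliberately terse to expose evaluation order.
class EvalOrder {
    static int postIncrementSum() {
        int i = 1;
        return i++ + i;    // left operand yields 1, then i is 2, so the sum is 3
    }

    static int[] indexThenAssign() {
        int[] a = new int[3];
        int k = 0;
        a[k] = k = 2;      // the index k (still 0) is evaluated before k becomes 2
        return a;
    }

    public static void main(String[] args) {
        System.out.println(postIncrementSum());      // 3
        int[] a = indexThenAssign();
        System.out.println(a[0] + " " + a[2]);       // 2 0
    }
}
```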
As in C or C++, statements and expressions in Java appear within a code block. A code block is syntactically just a number of statements surrounded by an open curly brace ({) and a close curly brace (}). The statements in a code block can contain variable declarations:
{
int size = 5;
setName("Max");
...
}
Methods, which look like C functions, are in a sense code blocks that take parameters and can be called by name.
void setupDog( String name ) {
int size = 5;
setName( name );
...
}
Variable declarations are limited in scope to their enclosing code block. That is, they can't be seen outside of the nearest set of braces:
{
int i = 5;
}
i = 6; // compile time error, no such variable i
In this way, code blocks can be used to arbitrarily group other statements and variables. The most common use of code blocks, however, is to define a group of statements for use in a conditional or iterative statement.
Since a code block is itself collectively treated as a statement, we define a conditional like an if/else clause as follows:
if ( condition )
statement;
[ else
statement; ]
Thus, if/else in Java has the familiar functionality of taking either of the forms:
if ( condition )
statement;
or:
if ( condition ) {
[ statement; ]
[ statement; ]
[ ... ]
}
Here the condition is a boolean expression. In the second form, the statement is a code block, and all of its enclosed statements are executed if the conditional succeeds. Any variables declared within that block are visible only to the statements within the successful branch of the condition. Like the if/else conditional, most of the remaining Java statements are concerned with controlling the flow of execution. They act for the most part like their namesakes in C or C++.
The do and while iterative statements have the familiar functionality, except that their conditional test must also be a boolean expression. You can't use an integer expression or a reference type; you must explicitly test your value. In other words, while ( i == 0 ) is legitimate, but while ( i ) is not, unless i is boolean. Here are the forms of these two statements:
while ( conditional )
statement;
do
statement;
while ( conditional );
The for statement also looks like it does in C:
for ( initialization; conditional; incrementor )
statement;
The variable initialization expression can declare a new variable; this variable is limited to the scope of the for statement:
for (int i = 0; i < 100; i++ ) {
System.out.println( i );
int j = i;
...
}
Java doesn't support the C comma operator, which groups multiple expressions into a single expression. However, you can use multiple, comma-separated expressions in the initialization and increment sections of the for loop. For example:
for (int i = 0, j = 10; i < j; i++, j-- ) {
...
}
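As a concrete sketch of that two-variable loop, the following hypothetical method counts how many times the body runs while i and j walk toward each other.

```java
// Hypothetical example class; names are ours.
class ForDemo {
    static int countSteps() {
        int steps = 0;
        // i and j are both scoped to the for statement
        for (int i = 0, j = 10; i < j; i++, j--) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(countSteps());    // 5: the loop stops when i and j meet at 5
    }
}
```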
The Java switch statement takes an integer type (or an argument that can be promoted to an integer type) and selects among a number of alternative case branches[2] :
[2] An object-based switch statement is desirable and could find its way into the language someday.
switch ( int expression ) {
case int expression :
statement;
[ case int expression :
statement;
...
default :
statement; ]
}
No two of the case expressions can evaluate to the same value. As in C, an optional default case can be specified to catch unmatched conditions. Normally, the special statement break is used to terminate a branch of the switch:
switch ( retVal ) {
case myClass.GOOD :
// something good
break;
case myClass.BAD :
// something bad
break;
default :
// neither one
break;
}
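Here is a runnable sketch along the same lines. The SwitchDemo class and its GOOD and BAD constants are hypothetical stand-ins for myClass. The second method shows why break matters: as in C, a branch without one falls through into the next case.

```java
// Hypothetical example class; GOOD and BAD are invented constants.
class SwitchDemo {
    static final int GOOD = 0;
    static final int BAD = 1;

    static String describe(int retVal) {
        String result;
        switch (retVal) {
            case GOOD:
                result = "good";
                break;
            case BAD:
                result = "bad";
                break;
            default:
                result = "neither";
                break;
        }
        return result;
    }

    static int fallThrough(int n) {
        int count = 0;
        switch (n) {
            case 1:
                count++;    // no break: execution continues into case 2
            case 2:
                count++;
                break;
            default:
                break;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(describe(BAD));     // bad
        System.out.println(fallThrough(1));    // 2
    }
}
```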
The Java break statement and its friend continue perform unconditional jumps out of a loop or conditional statement. They differ from the corresponding statements in C by taking an optional label as an argument. Enclosing statements, like code blocks and iterators, can be labeled with identifier statements:
one:
while ( condition ) {
...
two:
while ( condition ) {
...
// break or continue point
}
// after two
}
// after one
In the above example, a break or continue without argument at the indicated position would have the normal, C-style effect. A break would cause processing to resume at the point labeled "after two"; a continue would immediately cause the two loop to return to its condition test.
The statement break two at the indicated point would have the same effect as an ordinary break, but break one would break two levels and resume at the point labeled "after one." Similarly, continue two would serve as a normal continue, but continue one would return to the test of the one loop. Multilevel break and continue statements remove much of the need for the evil goto statement in C and C++.
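The following sketch puts labeled break and continue to work on a two-dimensional array of int values. The class, method, and label names are all hypothetical, and for loops are used in place of the while loops above purely for brevity.

```java
// Hypothetical example class; method and label names are ours.
class LabelDemo {
    // Returns the index of the first row containing target, or -1 if none does.
    static int findRow(int[][] grid, int target) {
        int found = -1;
        search:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] == target) {
                    found = row;
                    break search;    // leaves both loops at once
                }
            }
        }
        return found;
    }

    // Counts the rows that contain no negative numbers.
    static int countCleanRows(int[][] grid) {
        int clean = 0;
        rows:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] < 0) {
                    continue rows;   // abandon this row and move to the next one
                }
            }
            clean++;
        }
        return clean;
    }

    public static void main(String[] args) {
        int[][] grid = { { 1, 2 }, { 3, -4 }, { 5, 6 } };
        System.out.println(findRow(grid, -4));       // 1
        System.out.println(countCleanRows(grid));    // 2
    }
}
```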
There are a few Java statements we aren't going to discuss right now. The try, catch, and finally statements are used in exception handling, as we'll discuss later in this chapter. The synchronized statement in Java is used to coordinate access to statements among multiple threads of execution; see Chapter 6, Threads for a discussion of thread synchronization.
On a final note, I should mention that the Java compiler flags "unreachable" statements as compile-time errors. Of course, when I say unreachable, I mean those statements the compiler determines won't be called by a static look at compile-time.
As I said earlier, expressions are statements that produce a result when they are evaluated. The value of an expression can be a numeric type, as in an arithmetic expression; a reference type, as in an object allocation; or the special type void, which results from a call to a method that doesn't return a value. In the last case, the expression is evaluated only for its side effects (i.e., the work it does aside from producing a value). The type of an expression is known at compile-time. The value produced at run-time is either of this type or, in the case of a reference type, a compatible (assignable) type.
Java supports almost all standard C operators. These operators also have the same precedence in Java as they do in C, as you can see in Table 4.3.
| Precedence | Operator | Operand Type | Description |
|---|---|---|---|
| 1 | ++, -- | Arithmetic | Increment and decrement |
| 1 | +, - | Arithmetic | Unary plus and minus |
| 1 | ~ | Integral | Bitwise complement |
| 1 | ! | Boolean | Logical complement |
| 1 | ( type ) | Any | Cast |
| 2 | *, /, % | Arithmetic | Multiplication, division, remainder |
| 3 | +, - | Arithmetic | Addition and subtraction |
| 3 | + | String | String concatenation |
| 4 | << | Integral | Left shift |
| 4 | >> | Integral | Right shift with sign extension |
| 4 | >>> | Integral | Right shift with no extension |
| 5 | <, <=, >, >= | Arithmetic | Numeric comparison |
| 5 | instanceof | Object | Type comparison |
| 6 | ==, != | Primitive | Equality and inequality of value |
| 6 | ==, != | Object | Equality and inequality of reference |
| 7 | & | Integral | Bitwise AND |
| 7 | & | Boolean | Boolean AND |
| 8 | ^ | Integral | Bitwise XOR |
| 8 | ^ | Boolean | Boolean XOR |
| 9 | \| | Integral | Bitwise OR |
| 9 | \| | Boolean | Boolean OR |
| 10 | && | Boolean | Conditional AND |
| 11 | \|\| | Boolean | Conditional OR |
| 12 | ?: | NA | Conditional ternary operator |
| 13 | = | Any | Assignment |
| 13 | *=, /=, %=, +=, -=, <<=, >>=, >>>=, &=, ^=, \|= | Any | Assignment with operation |
There are a few operators missing from the standard C collection. For example, Java doesn't support the comma operator for combining expressions, although the for statement allows you to use it in the initialization and increment sections. Java doesn't allow direct pointer manipulation, so it does not support the dereference (*), address-of (&), and sizeof operators.
Java also adds some new operators. As we've seen, the + operator can be used with String values to perform string concatenation. Because all integral types in Java are signed values, the >> operator performs a right-shift operation with sign extension. The >>> operator treats the operand as an unsigned number and performs a right shift with no extension. The new operator is used to create objects; we will discuss it in detail shortly.
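The difference between the two right-shift operators shows up only with negative values, as this sketch illustrates. The ShiftDemo class and its method names are hypothetical.

```java
// Hypothetical example class; method names are ours.
class ShiftDemo {
    // Right shift that copies the sign bit into the vacated positions.
    static int signExtended(int x, int n) {
        return x >> n;
    }

    // Right shift that fills the vacated positions with zeros.
    static int zeroFilled(int x, int n) {
        return x >>> n;
    }

    public static void main(String[] args) {
        System.out.println(signExtended(-8, 1));    // -4: the sign is preserved
        System.out.println(zeroFilled(-8, 28));     // 15: only the top four bits remain
        System.out.println(zeroFilled(8, 1));       // 4: same as >> for positive values
    }
}
```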
While variable initialization (i.e., declaration and assignment together) is considered a statement, variable assignment alone is an expression:
int i, j;
i = 5; // expression
Normally, we rely on assignment for its side effects alone, but, as in C, an assignment can be used as a value in another part of an expression:
j = ( i = 5 );
Again, relying on order of evaluation extensively (in this case, using compound assignments in complex expressions) can make code very obscure and hard to read. Do so at your own peril.
The expression null can be assigned to any reference type. It has the meaning of "no reference." A null reference can't be used to select a method or variable and attempting to do so generates a NullPointerException at run-time.
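A minimal sketch of what happens when a null reference is used follows. The NullDemo class is hypothetical, and it borrows the try and catch statements that are covered later in this chapter.

```java
// Hypothetical example class; names are ours.
class NullDemo {
    // Returns true if calling a method through a null reference throws NullPointerException.
    static boolean lengthOfNullThrows() {
        String s = null;
        try {
            s.length();        // no object to select the method from
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthOfNullThrows());    // true
    }
}
```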
Using the dot (.) to access a variable in an object is a type of expression that results in the value of the variable accessed. This can be either a numeric type or a reference type:
int i;
String s;
i = myObject.length;
s = myObject.name;
A reference type expression can be used in further evaluations, by selecting variables or calling methods within it:
int len = myObject.name.length();
int initialLen = myObject.name.substring(5, 10).length();
Here we have found the length of our name variable by invoking the length() method of the String object. In the second case, we took an intermediate step and asked for a substring of the name string. The substring method of the String class also returns a String reference, for which we ask the length. (Chapter 7, Basic Utility Classes describes all of these String methods in detail.)
A method invocation is basically a function call, or in other words, an expression that results in a value, the type of which is the return type of the method. Thus far, we have seen methods invoked via their name in simple cases:
System.out.println( "Hello World..." );
int myLength = myString.length();
When we talk about Java's object-oriented features in Chapter 5, Objects in Java, we'll look at some rules that govern the selection of methods.
Like the result of any expression, the result of a method invocation can be used in further evaluations, as we saw above. Whether to allocate intermediate variables and make it absolutely clear what your code is doing or to opt for brevity where it's appropriate is a matter of coding style.
Objects in Java are allocated with the new operator:
Object o = new Object();
The argument to new is a constructor that specifies the type of object and any required parameters to create it. The return type of the expression is a reference type for the created object.
We'll look at object creation in detail in Chapter 5, Objects in Java. For now, I just want to point out that object creation is a type of expression, and that the resulting object reference can be used in general expressions. In fact, because the binding of new is "tighter" than that of the dot-field selector, you can easily allocate a new object and invoke a method in it for the resulting expression:
int hours = new Date().getHours();
The Date class is a utility class that represents the current time. Here we create a new instance of Date with the new operator and call its getHours() method to retrieve the current hour as an integer value. The Date object reference lives long enough to service the method call and is then cut loose and garbage collected at some point in the future.
Calling methods in object references in this way is, again, a matter of style. It would certainly be clearer to allocate an intermediate variable of type Date to hold the new object and then call its getHours() method. However, some of us still find the need to be terse in our code.
The instanceof operator can be used to determine the type of an object at run-time. instanceof returns a boolean value that indicates whether an object is an instance of a particular class or a subclass of that class:
boolean b;
Object str = "foo"; // an Object reference that holds a String
b = ( str instanceof String ); // true
b = ( str instanceof Object ); // also true
b = ( str instanceof Date ); // false--not a Date or subclass
instanceof also correctly reports whether an object is of an array type or implements a specified interface:
if ( foo instanceof byte[] )
...
(See Chapter 5, Objects in Java for a full discussion of interfaces.)
It is also important to note that the value null is not considered an instance of any class. So the following test will return false, no matter what the declared type of the variable:
String s = null;
if ( s instanceof String )
// won't happen
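These cases can be collected into one runnable sketch. The InstanceofDemo class and its helper methods are hypothetical; each takes an Object so that any reference, including null, can be tested.

```java
// Hypothetical example class; helper method names are ours.
class InstanceofDemo {
    static boolean isString(Object o) {
        return o instanceof String;
    }

    static boolean isByteArray(Object o) {
        return o instanceof byte[];
    }

    static boolean isRunnable(Object o) {
        return o instanceof Runnable;    // Runnable is an interface
    }

    public static void main(String[] args) {
        System.out.println(isString("foo"));             // true
        System.out.println(isString(null));              // false: null is never an instance
        System.out.println(isByteArray(new byte[4]));    // true
        System.out.println(isByteArray(new int[4]));     // false
        System.out.println(isRunnable(new Thread()));    // true: Thread implements Runnable
    }
}
```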
Java's roots are in embedded systems--software that runs inside specialized devices like hand-held computers, cellular phones, and fancy toasters. In those kinds of applications, it's especially important that software errors be handled properly. Most users would agree that it's unacceptable for their phone to simply crash or for their toast (and perhaps their house) to burn because their software failed. Given that we can't eliminate the possibility of software errors, a step in the right direction is to at least try to recognize and deal with the application-level errors that we can anticipate in a methodical and systematic way.
Dealing with errors in a language like C is the responsibility of the programmer. There is no help from the language itself in identifying error types, and there are no tools for dealing with them easily. In C and C++, a routine generally indicates a failure by returning an "unreasonable" value (e.g., the idiomatic -1 or null). As the programmer, you must know what constitutes a bad result, and what it means. It's often awkward to work around the limitations of passing error values in the normal path of data flow.[3] An even worse problem is that certain types of errors can legitimately occur almost anywhere, and it's prohibitive and unreasonable to explicitly test for them at every point in the software.
[3] The somewhat obscure setjmp() and longjmp() functions in C can save a point in the execution of code and later return to it unconditionally from a deeply buried location. In a limited sense, this is the functionality of exceptions in Java.
Java offers an elegant solution to these problems with exception handling. (Java exception handling is similar to, but not quite the same as, exception handling in C++.) An exception indicates an unusual condition or an error condition. Program control becomes unconditionally transferred or thrown to a specially designated section of code where it's caught and handled. In this way, error handling is somewhat orthogonal to the normal flow of the program. We don't have to have special return values for all our methods; errors are handled by a separate mechanism. Control can be passed long distance from a deeply nested routine and handled in a single location when that is desirable, or an error can be handled immediately at its source. There are still a few methods that return -1 as a special value, but these are limited to situations in which there isn't really an error.[4]
[4] For example, the getHeight() method of the Image class returns -1 if the height isn't known yet. No error has occurred; the height will be available in the future. In this situation, throwing an exception would be inappropriate.
A Java method is required to specify the exceptions it can throw (i.e., the ones that it doesn't catch itself); this means that the compiler can make sure we handle them. In this way, the information about what errors a method can produce is promoted to the same level of importance as its argument and return types. You may still decide to punt and ignore obvious errors, but in Java you must do so explicitly.
Exceptions are represented by instances of the class java.lang.Exception and its subclasses. Subclasses of Exception can hold specialized information (and possibly behavior) for different kinds of exceptional conditions. However, more often they are simply "logical" subclasses that exist only to serve as a new exception type (more on that later). Figure 4.1 shows the subclasses of Exception; these classes are defined in various packages in the Java API, as indicated in the diagram.
An Exception object is created by the code at the point where the error condition arises. It can hold whatever information is necessary to describe the exceptional condition, including a full stack trace for debugging. The exception object is passed, along with the flow of control, to the handling block of code. This is where the terms "throw" and "catch" come from: the Exception object is thrown from one point in the code and caught by the other, where execution resumes.
The Java API also defines the java.lang.Error class for egregious or unrecoverable errors. The subclasses of Error are shown in Figure 4.2. You needn't worry about these errors (i.e., you do not have to catch them); they normally indicate linkage problems or virtual machine errors. An error of this kind usually causes the Java interpreter to display a message and exit.
The try/catch guarding statements wrap a block of code and catch designated types of exceptions that occur within it:
try {
readFromFile("foo");
...
}
catch ( Exception e ) {
// Handle error
System.out.println( "Exception while reading file: " + e );
...
}
In the above example, exceptions that occur within the body of the try statement are directed to the catch clause for possible handling. The catch clause acts like a method; it specifies an argument of the type of exception it wants to handle, and, if it's invoked, the Exception object is passed into its body as an argument. Here we receive the object in the variable e and print it along with a message.
A try statement can have multiple catch clauses that specify different specific types (subclasses) of Exception:
try {
readFromFile("foo");
...
}
catch ( FileNotFoundException e ) {
// Handle file not found
...
}
catch ( IOException e ) {
// Handle read error
...
}
catch ( Exception e ) {
// Handle all other errors
...
}
The catch clauses are evaluated in order, and the first possible (assignable) match is taken. At most one catch clause is executed, which means that the exceptions should be listed from most specific to least. In the above example, we'll assume that the hypothetical readFromFile() can throw two different kinds of exceptions: one that indicates the file is not found; the other indicates a more general read error. Any subclass of Exception is assignable to the parent type Exception, so the third catch clause acts like the default clause in a switch statement and handles any remaining possibilities.
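To see the "first assignable match" rule in action, here is a minimal sketch. CatchOrderDemo and handle() are hypothetical names; the method simply throws whatever exception it is given and reports which clause caught it.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical demo class; not part of the readFromFile() example above.
class CatchOrderDemo {
    // Throws the supplied exception and reports which catch clause handled it.
    static String handle(Exception toThrow) {
        try {
            throw toThrow;
        }
        catch (FileNotFoundException e) {   // most specific type first
            return "file not found";
        }
        catch (IOException e) {             // parent class of FileNotFoundException
            return "read error";
        }
        catch (Exception e) {               // acts as the default case
            return "other";
        }
    }
}
```

A FileNotFoundException is also assignable to IOException and Exception, but only the first matching clause runs. If the clauses were listed in the opposite order, the compiler would reject the later ones as unreachable.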
It should be obvious, but one beauty of the try/catch statement is that any statement in the try block can assume that all previous statements in the block succeeded. A problem won't arise suddenly because a programmer forgot to check the return value from some method. If an earlier statement fails, execution jumps immediately to the catch clause; later statements are never executed.
What if we hadn't caught the exception? Where would it have gone? Well, if there is no enclosing try/catch statement, the exception pops to the top of the method in which it appeared and is, in turn, thrown from that method. In this way, the exception bubbles up until it's caught, or until it pops out of the top of the program, terminating it with a run-time error message. There's a bit more to it than that because, in this case, the compiler would have reminded us to deal with it, but we'll get back to that in a moment.
Let's look at another example. In Figure 4.3, the method getContent() invokes the method openConnection() from within a try/catch statement. openConnection(), in turn, invokes the method sendRequest(), which calls the method write() to send some data.
In this figure, the second call to write() throws an IOException. Since sendRequest() doesn't contain a try/catch statement to handle the exception, it's thrown again, from the point that it was called in the method openConnection(). Since openConnection() doesn't catch the exception either, it's thrown once more. Finally it's caught by the try statement in getContent() and handled by its catch clause.
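The call chain just described might be sketched as follows. The method names come from the text, but their bodies, the writes counter, and the PropagationDemo class are assumptions made up for illustration; the real methods would do actual network I/O.

```java
import java.io.IOException;

// Hypothetical stand-in for the methods discussed above.
class PropagationDemo {
    static int writes;

    // Simulated write(): the second call fails.
    static void write(String data) throws IOException {
        writes++;
        if (writes == 2) throw new IOException("write failed");
    }

    // No try/catch here, so the exception propagates to the caller.
    static void sendRequest() throws IOException {
        write("header");
        write("body");
    }

    // No try/catch here either; the exception is thrown once more.
    static void openConnection() throws IOException {
        sendRequest();
    }

    // The exception is finally caught and handled here.
    static String getContent() {
        writes = 0;
        try {
            openConnection();
            return "ok";
        } catch (IOException e) {
            return "caught: " + e.getMessage();
        }
    }
}
```

Neither sendRequest() nor openConnection() contains any error-handling code of its own; each only declares that it can throw an IOException.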
Since an exception can bubble up quite a distance before it is caught and handled, we may need a way to determine exactly where it was thrown. All exceptions can dump a stack trace that lists their method of origin and all of the nested method calls that it took to arrive there, using the printStackTrace() method.
try {
// complex task
} catch ( Exception e ) {
// dump information about exactly where the exception occurred
e.printStackTrace( System.err );
...
}
I mentioned earlier that Java makes us be explicit about our error handling. But Java is programmer-friendly, and it's not possible to require that every conceivable type of error be handled in every situation. So, Java exceptions are divided into two categories: checked exceptions and unchecked exceptions. Most application level exceptions are checked, which means that any method that throws one, either by generating it itself (as we'll discuss below) or by passively ignoring one that occurs within it, must declare that it can throw that type of exception in a special throws clause in its method declaration. We haven't yet talked in detail about declaring methods; we'll cover that in Chapter 5, Objects in Java. For now all you need know is that methods have to declare the checked exceptions they can throw or allow to be thrown.
Again in Figure 4.3, notice that the methods openConnection() and sendRequest() both specify that they can throw an IOException. If we had to throw multiple types of exceptions we could declare them separated with commas:
void readFile( String s ) throws IOException, InterruptedException {
...
}
The throws clause tells the compiler that a method is a possible source of that type of checked exception and that anyone calling that method must be prepared to deal with it. The caller may use a try/catch block to catch it, or it may, itself, declare that it can throw the exception.
Exceptions that are subclasses of the java.lang.RuntimeException class are unchecked. See Figure 4.1 for the subclasses of RuntimeException. It's not a compile-time error to ignore the possibility of these exceptions being thrown; additionally, methods don't have to declare they can throw them. In all other respects, run-time exceptions behave the same as other exceptions. We are perfectly free to catch them if we wish; we simply aren't required to.
Checked exceptions: Exceptions a reasonable application should try to handle gracefully.
Unchecked exceptions: Exceptions from which we would not normally expect our software to try to recover.
The category of checked exceptions includes application-level problems like missing files and unavailable hosts. As good programmers (and upstanding citizens), we should design software to recover gracefully from these kinds of conditions. The category of unchecked exceptions includes problems such as "out of memory" and "array index out of bounds." While these may indicate application-level programming errors, they can occur almost anywhere and aren't generally easy to recover from. Fortunately, because there are unchecked exceptions, you don't have to wrap every one of your array-index operations in a try/catch statement.
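The difference shows up directly in method declarations. In this sketch, CheckedDemo and its methods are hypothetical: load() must declare its checked IOException, while third() can fail with an unchecked ArrayIndexOutOfBoundsException without declaring anything.

```java
import java.io.IOException;

// Hypothetical demo class; names and behavior are illustrative only.
class CheckedDemo {
    // Checked: the compiler insists on the throws clause.
    static void load(String name) throws IOException {
        if (name.isEmpty()) throw new IOException("no file name");
    }

    // Unchecked: may throw ArrayIndexOutOfBoundsException, no declaration needed.
    static int third(int[] values) {
        return values[2];
    }

    // Callers of load() must catch the exception or declare it themselves.
    static String tryLoad(String name) {
        try {
            load(name);
            return "loaded";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }
}
```

Removing the try/catch from tryLoad() without adding a throws clause would be a compile-time error; third() needs no such treatment.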
We can throw our own exceptions: either instances of Exception or one of its predefined subclasses, or our own specialized subclasses. All we have to do is create an instance of the Exception and throw it with the throw statement:
throw new Exception();
Execution stops and is transferred to the nearest enclosing try/catch statement. (Note that there is little point in keeping a reference to the Exception object we've created here.) An alternative constructor of the Exception class lets us specify a string with an error message:
throw new Exception("Something really bad happened");
By convention, all types of Exception have a String constructor like this. Note that the String message above is somewhat facetious and vague. Normally you won't be throwing a plain old Exception, but a more specific subclass. For example:
public void checkRead( String s ) {
if ( new File(s).isAbsolute() || (s.indexOf("..") != -1) )
throw new SecurityException(
"Access to file : "+ s +" denied.");
}
In the above, we partially implement a method to check for an illegal path. If we find one, we throw a SecurityException, with some information about the transgression.
Of course, we could include whatever other information is useful in our own specialized subclasses of Exception (or SecurityException). Often though, just having a new type of exception is good enough, because it's sufficient to help direct the flow of control. For example, if we are building a parser, we might want to make our own kind of exception to indicate a particular kind of failure.
class ParseException extends Exception {
ParseException() {
super(); }
ParseException( String desc ) {
super( desc ); }
}
See Chapter 5, Objects in Java for a full description of classes and class constructors. The body of our exception class here simply allows a ParseException to be created in the conventional ways that we have created exceptions above. Now that we have our new exception type, we might guard for it in the following kind of situation:
// Somewhere in our code
...
try {
parseStream( input );
} catch ( ParseException pe ) {
// Bad input...
} catch ( IOException ioe ) {
// Low level communications problem
}
As you can see, although our new exception doesn't currently hold any specialized information about the problem (it certainly could), it does let us distinguish a parse error from an arbitrary communications error in the same chunk of code. You might call this kind of specialization creating a "logical" exception.
Sometimes you'll want to take some action based on an exception and then turn around and throw a new exception in its place. For example, suppose that we want to handle an IOException by freeing up some resources before allowing the failure to pass on to the rest of the application. You can do this in the obvious way, by simply catching the exception and then throwing it again or throwing a new one.
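A minimal sketch of both variations follows. RethrowDemo, the resourceHeld flag, and the method names are hypothetical; the flag stands in for whatever resource a real application would need to free.

```java
import java.io.IOException;

// Hypothetical sketch; class, field, and method names are illustrative only.
class RethrowDemo {
    static boolean resourceHeld = false;

    // Frees the resource on failure, then rethrows the same exception.
    static void transfer(boolean fail) throws IOException {
        resourceHeld = true;
        try {
            if (fail) throw new IOException("connection lost");
        } catch (IOException e) {
            resourceHeld = false;   // free up resources first
            throw e;                // then let the failure pass on to the caller
        }
        resourceHeld = false;       // normal release when nothing went wrong
    }

    // Alternative: replace the low-level failure with a new exception.
    static void transferOrFail(boolean fail) {
        try {
            transfer(fail);
        } catch (IOException e) {
            throw new IllegalStateException("transfer failed: " + e.getMessage());
        }
    }
}
```

Because transfer() rethrows a checked exception, it still has to declare IOException; transferOrFail() substitutes an unchecked exception and so declares nothing.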
The try statement imposes a condition on the statements it guards. It says that if an exception occurs within it, the remaining statements will be abandoned. This has consequences for local variable initialization. If the compiler can't determine whether a local variable assignment we placed inside a try/catch block will happen, it won't let us use the variable:
void myMethod() {
int foo;
try {
foo = getResults();
}
catch ( Exception e ) {
...
}
int bar = foo; // Compile time error--foo may not
// have been initialized
}
In the above example, we can't use foo in the indicated place because there's a chance it was never assigned a value. One obvious option is to move the statement that uses foo inside the try statement:
try {
foo = getResults();
int bar = foo; // Okay because we only get here
// if previous assignment succeeds
}
catch ( Exception e ) {
...
}
Sometimes this works just fine. However, now we have the same problem if we want to use bar later in myMethod(). If we're not careful, we might end up pulling everything into the try statement. The situation changes if we transfer control out of the method in the catch clause:
try {
foo = getResults();
}
catch ( Exception e ) {
...
return;
}
int bar = foo; // Okay because we only get here
// if previous assignment succeeds
Your code will dictate its own needs; you should just be aware of the options.
What if we have some clean up to do before we exit our method from one of the catch clauses? To avoid duplicating the code in each catch branch and to make the cleanup more explicit, Java supplies the finally clause. A finally clause can be added after a try and any associated catch clauses. Any statements in the body of the finally clause are guaranteed to be executed, no matter why control leaves the try body:
try {
// Do something here
}
catch ( FileNotFoundException e ) {
...
}
catch ( IOException e ) {
...
}
catch ( Exception e ) {
...
}
finally {
// Cleanup here
}
In the above example the statements at the cleanup point will be executed eventually, no matter how control leaves the try. If control transfers to one of the catch clauses, the statements in finally are executed after the catch completes. If none of the catch clauses handles the exception, the finally statements are executed before the exception propagates to the next level.
If the statements in the try execute cleanly, or even if we perform a return, break, or continue, the statements in the finally clause are executed. To perform cleanup operations, we can even use try and finally without any catch clauses:
try {
// Do something here
return;
}
finally {
System.out.println("Whoo-hoo!");
}
Exceptions that occur in a catch or finally clause are handled normally; the search for an enclosing try/catch begins outside the offending try statement.
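The order of events can be made visible by recording each step in a list. This is only a sketch; FinallyDemo, run(), and attempt() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class that logs the order in which blocks execute.
class FinallyDemo {
    // Calls attempt() and records whether an exception escaped from it.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try {
            attempt(fail, log);
        } catch (RuntimeException e) {
            log.add("outer catch");
        }
        return log;
    }

    // try/finally with no catch clause: cleanup runs on return and on failure.
    static void attempt(boolean fail, List<String> log) {
        try {
            log.add("try");
            if (fail) throw new IllegalStateException("boom");
            return;                 // finally still runs before we leave
        } finally {
            log.add("finally");
        }
    }
}
```

With fail set to false the log holds "try" and then "finally". With fail set to true it holds "try", "finally", and "outer catch", showing that the finally clause runs before the exception reaches the enclosing handler.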
An array is a special type of object that can hold an ordered collection of elements. The type of the elements of the array is called the base type of the array; the number of elements it holds is a fixed attribute called its length. (For a collection with a variable length, see the discussion of Vector objects in Chapter 7, Basic Utility Classes.) Java supports arrays of all primitive and reference types.
The basic syntax of arrays looks much like that of C or C++. We create an array of a specified length and access the elements with the special index operator, []. Unlike other languages, however, arrays in Java are true, first-class objects, which means they are real objects within the Java language. An array is an instance of a special Java array class and has a corresponding type in the type system. This means that to use an array, as with any other object, we first declare a variable of the appropriate type and then use the new operator to create an instance of it.
Array objects differ from other objects in Java in three respects:
An array type variable is denoted by a base type followed by empty brackets []. Alternatively, Java accepts a C-style declaration, with the brackets placed after the array name. The following are equivalent:
int [] arrayOfInts;
int arrayOfInts [];
In each case, arrayOfInts is declared as an array of integers. The size of the array is not yet an issue, because we are declaring only the array type variable. We have not yet created an actual instance of the array class, with its associated storage. It's not even possible to specify the length of an array as part of its type.
An array of objects can be created in the same way:
String [] someStrings;
Button someButtons [];
Having declared an array type variable, we can now use the new operator to create an instance of the array. After the new operator, we specify the base type of the array and its length, with a bracketed integer expression:
arrayOfInts = new int [42];
someStrings = new String [ number + 2 ];
We can, of course, combine the steps of declaring and allocating the array:
double [] someNumbers = new double [20];
Component widgets [] = new Component [12];
As in C, array indices start with zero. Thus, the index of the first element of someNumbers [] is 0 and the index of the last element is 19. After creation, the array elements are initialized to the default values for their type. For numeric types, this means the elements are initially zero:
int [] grades = new int [30];
grades[0] = 99;
grades[1] = 72;
// grades[2] == 0
The elements of an array of objects are references to the objects, not actual instances of the objects. The default value of each element is therefore null, until we assign instances of appropriate objects:
String names [] = new String [4];
names [0] = new String();
names [1] = "Boofa";
names [2] = someObject.toString();
// names[3] == null
This is an important distinction that can cause confusion. In many other languages, the act of creating an array is the same as allocating storage for its elements. In Java, an array of objects actually contains only reference variables, and those variables have the value null until they are assigned to real objects.[5] Figure 4.4 illustrates the names array of the previous example:
[5] The analog in C or C++ would be an array of pointers to objects. However, pointers in C or C++ are themselves two- or four-byte values. Allocating an array of pointers is, in actuality, allocating the storage for some number of those pointer objects. An array of references is conceptually similar, although references are not themselves objects. We can't manipulate references or parts of references other than by assignment, and their storage requirements (or lack thereof) are not part of the high-level language specification.
names is a variable of type String[] (i.e., a string array). The String[] object can be thought of as containing four String type variables. We have assigned String objects to the first three array elements. The fourth has the default value null.
Java supports the C-style curly braces {} construct for creating an array and initializing its elements when it is declared:
int [] primes = { 1, 2, 3, 5, 7, 7+4 }; // primes[2] == 3
An array object of the proper type and length is implicitly created and the values of the comma-separated list of expressions are assigned to its elements.
We can use the {} syntax with an array of objects. In this case, each of the expressions must evaluate to an object that can be assigned to a variable of the base type of the array, or the value null. Here are some examples:
String [] verbs = { "run", "jump", someWord.toString() };
Button [] controls = { stopButton, new Button("Forwards"),
new Button("Backwards") };
// all types are subtypes of Object
Object [] objects = { stopButton, "A word", null };
You should create and initialize arrays in whatever manner is appropriate for your application. The following are equivalent:
Button [] threeButtons = new Button [3];
Button [] threeButtons = { null, null, null };
The size of an array object is available in the public variable length:
char [] alphabet = new char [26];
int alphaLen = alphabet.length; // alphaLen == 26
String [] musketeers = { "one", "two", "three" };
int num = musketeers.length; // num == 3
length is the only accessible field of an array; it is a variable, not a method.
Array access in Java is just like array access in C; you access an element by putting an integer-valued expression between brackets after the name of the array. The following example creates an array of Button objects called keyPad and then fills the array with Button objects:
Button [] keyPad = new Button [ 10 ];
for ( int i=0; i < keyPad.length; i++ )
keyPad[ i ] = new Button( Integer.toString( i ) );
Attempting to access an element that is outside the range of the array generates an ArrayIndexOutOfBoundsException. This is a type of RuntimeException, so you can either catch it and handle it yourself, or ignore it, as we already discussed:
String [] states = new String [50];
try {
states[0] = "California";
states[1] = "Oregon";
...
states[50] = "McDonald's Land"; // Error--array out of bounds
}
catch ( ArrayIndexOutOfBoundsException err ) {
System.out.println( "Handled error: " + err.getMessage() );
}
It's a common task to copy a range of elements from one array into another. Java supplies the arraycopy() method for this purpose; it's a utility method of the System class:
System.arraycopy( source, sourceStart, destination,
destStart, length );
The following example doubles the size of the names array from an earlier example:
String [] tmpVar = new String [ 2 * names.length ];
System.arraycopy( names, 0, tmpVar, 0, names.length );
names = tmpVar;
A new array, twice the size of names, is allocated and assigned to a temporary variable tmpVar. arraycopy() is used to copy the elements of names to the new array. Finally, the new array is assigned to names. If there are no remaining references to the old array object after names has been copied, it will be garbage collected on the next pass.
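The start-position arguments of arraycopy() also make it easy to copy just part of an array. As a sketch, here is a hypothetical helper (ArrayCopyDemo and insertAt() are invented names) that builds a new array with one extra element inserted in the middle, using two partial copies.

```java
// Hypothetical helper class; not part of the Java API.
class ArrayCopyDemo {
    // Returns a new array containing src with value inserted at position index.
    static String[] insertAt(String[] src, int index, String value) {
        String[] out = new String[src.length + 1];
        // Copy the elements before the insertion point.
        System.arraycopy(src, 0, out, 0, index);
        out[index] = value;
        // Copy the remaining elements, shifted one slot to the right.
        System.arraycopy(src, index, out, index + 1, src.length - index);
        return out;
    }
}
```

For example, inserting "b" at index 1 of { "a", "c" } yields a three-element array holding "a", "b", and "c"; the original array is left untouched.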
You often want to create "throw-away" arrays: arrays that are only used in one place, and never referenced anywhere else. Such arrays don't need to have a name, because you never need to refer to them again in that context. For example, you may want to create a collection of objects to pass as an argument to some method. It's easy enough to create a normal, named array--but if you don't actually work with the array (if you use the array only as a holder for some collection), you shouldn't have to. Java makes it easy to create "anonymous" (i.e., unnamed) arrays.
Let's say you need to call a method named setPets(), which takes an array of Animal objects as its argument. Cat and Dog are subclasses of Animal. Here's how to call setPets() using an anonymous array:
Dog pokey = new Dog ("gray");
Cat squiggles = new Cat ("black");
Cat jasmine = new Cat ("orange");
setPets ( new Animal [] { pokey, squiggles, jasmine });
The syntax looks just like the initialization of an array in a variable declaration. We implicitly define the size of the array and fill in its elements using the curly brace notation. However, since this is not a variable declaration we have to explicitly use the new operator to create the array object.
You can use anonymous arrays to simulate variable-length argument lists (often called varargs), a feature of many programming languages that Java, as described here, doesn't provide (later releases, starting with Java 5, added a varargs syntax). The advantage of anonymous arrays over variable-length argument lists is that they allow stricter type checking; the compiler always knows exactly what arguments are expected, and therefore can verify that method calls are correct.
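Here is a self-contained sketch of the idea. The Animal, Dog, and Cat classes are minimal stand-ins for the ones assumed in the text, and describePets() is a hypothetical method that accepts any number of animals packaged in an array.

```java
// Hypothetical demo; these classes are simplified stand-ins, not a real API.
class AnonymousArrayDemo {
    static class Animal {
        final String color;
        Animal(String color) { this.color = color; }
    }
    static class Dog extends Animal { Dog(String color) { super(color); } }
    static class Cat extends Animal { Cat(String color) { super(color); } }

    // Accepts a variable number of animals by taking them as an array.
    static String describePets(Animal[] pets) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < pets.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append(pets[i].getClass().getSimpleName())
              .append(":").append(pets[i].color);
        }
        return sb.toString();
    }

    // The array exists only for the duration of the call; it never gets a name.
    static String example() {
        return describePets(new Animal[] { new Dog("gray"), new Cat("black") });
    }
}
```

The compiler checks every element of the anonymous array against the base type Animal, so passing, say, a String in the list would be rejected at compile time.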
Java supports multidimensional arrays in the form of arrays of array type objects. You create a multidimensional array with C-like syntax, using multiple bracket pairs, one for each dimension. You also use this syntax to access elements at various positions within the array. Here's an example of a multidimensional array that represents a chess board:
ChessPiece [][] chessBoard;
chessBoard = new ChessPiece [8][8];
chessBoard[0][0] = new ChessPiece( "Rook" );
chessBoard[1][0] = new ChessPiece( "Pawn" );
...
Here chessBoard is declared as a variable of type ChessPiece[][] (i.e., an array of ChessPiece arrays). This declaration implicitly creates the type ChessPiece[] as well. The example illustrates the special form of the new operator used to create a multidimensional array. It creates an array of ChessPiece[] objects and then, in turn, creates each array of ChessPiece objects. We then index chessBoard to specify values for particular ChessPiece elements. (We'll neglect the color of the pieces here.)
Of course, you can create arrays with more than two dimensions. Here's a slightly impractical example:
Color [][][] rgbCube = new Color [256][256][256];
rgbCube[0][0][0] = Color.black;
rgbCube[255][255][0] = Color.yellow;
...
As in C, we can specify the initial index of a multidimensional array to get an array type object with fewer dimensions. In our example, the variable chessBoard is of type ChessPiece[][]. The expression chessBoard[0] is valid and refers to the first element of chessBoard, which is of type ChessPiece[]. For example, we can create a row for our chess board:
ChessPiece [] startRow = {
new ChessPiece("Rook"), new ChessPiece("Knight"),
new ChessPiece("Bishop"), new ChessPiece("King"),
new ChessPiece("Queen"), new ChessPiece("Bishop"),
new ChessPiece("Knight"), new ChessPiece("Rook")
};
chessBoard[0] = startRow;
We don't necessarily have to specify the dimension sizes of a multidimensional array with a single new operation. The syntax of the new operator lets us leave the sizes of some dimensions unspecified. The size of at least the first dimension (the most significant dimension of the array) has to be specified, but the sizes of any number of the less significant array dimensions may be left undefined. We can assign appropriate array type values later.
We can create a checkerboard of boolean values (which is not quite sufficient for a real game of checkers) using this technique:
boolean [][] checkerBoard;
checkerBoard = new boolean [8][];
Here, checkerBoard is declared and created, but its elements, the eight boolean[] objects of the next level, are left empty. Thus, for example, checkerBoard[0] is null until we explicitly create an array and assign it, as follows:
checkerBoard[0] = new boolean [8];
checkerBoard[1] = new boolean [8];
...
checkerBoard[7] = new boolean [8];
The code of the previous two examples is equivalent to:
boolean [][] checkerBoard = new boolean [8][8];
One reason we might want to leave dimensions of an array unspecified is so that we can store arrays given to us by another method.
Note that since the length of the array is not part of its type, the arrays in the checkerboard do not necessarily have to be of the same length. Here's a defective (but perfectly legal) checkerboard:
checkerBoard[2] = new boolean [3];
checkerBoard[3] = new boolean [10];
Since Java implements multidimensional arrays as arrays of arrays, multidimensional arrays do not have to be rectangular. For example, here's how you could create and initialize a triangular array:
int [][] triangle = new int [5][];
for (int i = 0; i < triangle.length; i++) {
triangle[i] = new int [i + 1];
for (int j = 0; j < i + 1; j++)
triangle[i][j] = i + j;
}
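When working with a non-rectangular array, it is safest to consult each row's own length rather than assume a fixed width. The following sketch wraps the triangle construction in a hypothetical TriangleDemo class and adds a helper that counts the elements actually allocated.

```java
// Hypothetical demo class built around the triangular array above.
class TriangleDemo {
    // Builds a triangular array: row i holds i + 1 elements.
    static int[][] triangle(int rows) {
        int[][] t = new int[rows][];
        for (int i = 0; i < t.length; i++) {
            t[i] = new int[i + 1];
            for (int j = 0; j < t[i].length; j++)
                t[i][j] = i + j;
        }
        return t;
    }

    // Counts elements by asking each row for its own length.
    static int countElements(int[][] rows) {
        int n = 0;
        for (int i = 0; i < rows.length; i++)
            n += rows[i].length;
        return n;
    }
}
```

A five-row triangle holds 1 + 2 + 3 + 4 + 5 = 15 elements, rather than the 25 a rectangular new int [5][5] would allocate.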
I said earlier that arrays are instances of special array classes in the Java language. If arrays have classes, where do they fit into the class hierarchy and how are they related? These are good questions; however, we need to talk more about the object-oriented aspects of Java before I can answer them. For now, take it on faith that arrays fit into the class hierarchy; details are in Chapter 5, Objects in Java.