| Thinking in Java source ref: work2.html |
Its a fairly simple program that has only a fixed quantity of
objects with known lifetimes.
In general, your programs will always be creating new objects based on some criteria
that will be known only at the time the program is running. You wont know until run
time the quantity or even the exact type of the objects you need. To solve the general
programming problem, you need to be able to create any number of objects, anytime,
anywhere. So you cant rely on creating a named reference to hold each one of your
objects:
MyObject myReference;
since youll never know how many of these youll actually need. .
Most languages provide some way to solve this rather essential problem. Java has
several ways to hold objects (or rather, references to objects). The built-in type is the
array, which has been discussed before. Also, the Java utilities library has a reasonably
complete set of container classes (also known as collection classes, but because the Java 2 libraries use
the name Collection to refer to a particular subset of the library, I shall also
use the more inclusive term container). Containers provide sophisticated ways
to hold and even manipulate your objects. .
Most of the necessary introduction to arrays is in the last section of Chapter 4, which
showed how you define and initialize an array. Holding objects is the focus of this
chapter, and an array is just one way to hold objects. But there are a number of other
ways to hold objects, so what makes an array special? .
There are three issues that distinguish arrays from other types
of containers: efficiency, type, and the ability to hold primitives. The array is the most
efficient way that Java provides to store and randomly access a sequence of object
references. The array is a simple linear sequence, which makes element access fast, but
you pay for this speed; when you create an array object, its size is fixed and cannot be
changed for the lifetime of that array object. You might suggest creating an array of a
particular size and then, if you run out of space, creating a new one and moving all the
references from the old one to the new one. This is the behavior of the ArrayList class, which will be studied later in this chapter.
However, because of the overhead of this flexibility, an ArrayList is measurably
less efficient than an array. .
In C++, the vector container class does know the
type of objects it holds, but it has a different drawback when compared with arrays in
Java: The C++ vectors operator[] doesnt do bounds checking, so
you can run past the end.[51] In Java, you get
bounds checking regardless of whether youre using an array or a container;
youll get a RuntimeException if you exceed the bounds.
This type of exception indicates a programmer error, and thus you dont need to check
for it in your code. As an aside, the reason the C++ vector doesnt check
bounds with every access is speed; in Java, you have the constant performance overhead of
bounds checking all the time for both arrays and containers. .
The other generic container classes that will be studied in this chapter, List, Set, and Map,
all deal with objects as if they had no specific type. That is, they treat them as type Object, the root class of all classes in Java. This works fine
from one standpoint: You need to build only one container, and any Java object will go
into that container. (Except for primitives, which can be placed in containers as
constants using the Java primitive wrapper classes, or as changeable values by wrapping in
your own class.) This is the second place where an array is superior to the generic
containers: When you create an array, you create it to hold a specific type (which is
related to the third factoran array can hold primitives, whereas a container
cannot). This means that you get compile-time type checking to prevent you from inserting
the wrong type or mistaking the type that youre extracting. Of course, Java will
prevent you from sending an inappropriate message to an object at either compile time or
run time. So its not riskier one way or the other, its just nicer if the
compiler points it out to you, faster at run time, and theres less likelihood that
the end user will get surprised by an exception. .
For efficiency and type checking, its always worth trying to use an array.
However, when youre solving a more general problem, arrays can be too restrictive.
After looking at arrays, the rest of this chapter will be devoted to the container classes
provided by Java. .
Regardless of what type of array
youre working with, the array identifier is actually a reference to a true object
thats created on the heap. This is the object that holds the references to the other
objects, and it can be created either implicitly, as part of the array initialization
syntax, or explicitly with a new expression. Part of the array object (in fact, the
only field or method you can access) is the read-only length member that tells you
how many elements can be stored in that array object. The [] syntax is the only other access that you
have to the array object. .
The following example shows the various ways that an array can be initialized, and how
the array references can be assigned to different array objects. It also shows that arrays
of objects and arrays of primitives are almost identical in their use. The only difference
is that arrays of objects hold references, but arrays of primitives hold the primitive
values directly. .
//: c11:ArraySize.java
// Initialization & re-assignment of arrays.
import com.bruceeckel.simpletest.*;
class Weeble {} // A small mythical creature
public class ArraySize {
private static Test monitor = new Test();
public static void main(String[] args) {
// Arrays of objects:
Weeble[] a; // Local uninitialized variable
Weeble[] b = new Weeble[5]; // Null references
Weeble[] c = new Weeble[4];
for(int i = 0; i < c.length; i++)
if(c[i] == null) // Can test for null reference
c[i] = new Weeble();
// Aggregate initialization:
Weeble[] d = {
new Weeble(), new Weeble(), new Weeble()
};
// Dynamic aggregate initialization:
a = new Weeble[] {
new Weeble(), new Weeble()
};
System.out.println("a.length=" + a.length);
System.out.println("b.length = " + b.length);
// The references inside the array are
// automatically initialized to null:
for(int i = 0; i < b.length; i++)
System.out.println("b[" + i + "]=" + b[i]);
System.out.println("c.length = " + c.length);
System.out.println("d.length = " + d.length);
a = d;
System.out.println("a.length = " + a.length);
// Arrays of primitives:
int[] e; // Null reference
int[] f = new int[5];
int[] g = new int[4];
for(int i = 0; i < g.length; i++)
g[i] = i*i;
int[] h = { 11, 47, 93 };
// Compile error: variable e not initialized:
//!System.out.println("e.length=" + e.length);
System.out.println("f.length = " + f.length);
// The primitives inside the array are
// automatically initialized to zero:
for(int i = 0; i < f.length; i++)
System.out.println("f[" + i + "]=" + f[i]);
System.out.println("g.length = " + g.length);
System.out.println("h.length = " + h.length);
e = h;
System.out.println("e.length = " + e.length);
e = new int[] { 1, 2 };
System.out.println("e.length = " + e.length);
monitor.expect(new String[] {
"a.length=2",
"b.length = 5",
"b[0]=null",
"b[1]=null",
"b[2]=null",
"b[3]=null",
"b[4]=null",
"c.length = 4",
"d.length = 3",
"a.length = 3",
"f.length = 5",
"f[0]=0",
"f[1]=0",
"f[2]=0",
"f[3]=0",
"f[4]=0",
"g.length = 4",
"h.length = 3",
"e.length = 3",
"e.length = 2"
});
}
} ///:~
The array a is an uninitialized local variable, and the compiler prevents you
from doing anything with this reference until youve properly initialized it. The
array b is initialized to point to an array of Weeble references, but no
actual Weeble objects are ever placed in that array. However, you can still ask
what the size of the array is, since b is pointing to a legitimate object. This
brings up a slight drawback: You cant find out how many elements are actually in
the array, since length tells you only how many elements can be placed in
the array; that is, the size of the array object, not the number of elements it actually
holds. However, when an array object is created, its references are automatically
initialized to null, so you can see whether a particular array slot has an object
in it by checking to see whether its null. Similarly, an array of primitives
is automatically initialized to zero for numeric types, (char)0 for char,
and false for boolean. .
Array c shows the creation of the array object followed by the assignment of Weeble
objects to all the slots in the array. Array d shows the aggregate
initialization syntax that causes the array object to be created (implicitly with new
on the heap, just like for array c) and initialized with Weeble
objects, all in one statement. .
The next array initialization could be
thought of as a dynamic aggregate initialization. The aggregate initialization
used by d must be used at the point of ds definition, but with the
second syntax you can create and initialize an array object anywhere. For example, suppose
hide( ) is a method that takes an array of Weeble objects. You could
call it by saying:
hide(d);
but you can also dynamically create the array you want to pass as the argument:
hide(new Weeble[] { new Weeble(), new Weeble() });
In many situations this syntax provides a more convenient way to write code. .
The expression:
a = d;
shows how you can take a reference thats attached to one array object and assign
it to another array object, just as you can do with any other type of object reference.
Now both a and d are pointing to the same array object on the heap. .
The second part of ArraySize.java shows that primitive arrays work just like
object arrays except that primitive arrays hold the primitive values directly. .
Container classes can hold only
references to Objects. An array, however, can be created to hold primitives
directly, as well as references to Objects. It is possible to use the
wrapper classes, such as Integer, Double, etc., to place
primitive values inside a container, but the wrapper classes for primitives can be awkward
to use. In addition, its much more efficient to create and access an array of
primitives than a container of wrapped primitives. .
Of course, if youre using a primitive type and you need the flexibility of a
container that automatically expands when more space is needed, the array wont work,
and youre forced to use a container of wrapped primitives. You might think that
there should be a specialized type of ArrayList for each of the primitive data
types, but Java doesnt provide this for you.[52]
.
Suppose youre writing a method and you dont just want to return just one
thing, but a whole bunch of things. Languages like C and C++ make this difficult because
you cant just return an array, only a pointer to an array. This introduces problems
because it becomes messy to control the lifetime of the array, which easily leads to
memory leaks. .
Java takes a similar approach, but you
just return an array. Unlike C++, with Java you never worry about
responsibility for that arrayit will be around as long as you need it, and the
garbage collector will clean it up when youre done. .
As an example, consider returning an array of String:
//: c11:IceCream.java
// Returning arrays from methods.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class IceCream {
private static Test monitor = new Test();
private static Random rand = new Random();
public static final String[] flavors = {
"Chocolate", "Strawberry", "Vanilla Fudge Swirl",
"Mint Chip", "Mocha Almond Fudge", "Rum Raisin",
"Praline Cream", "Mud Pie"
};
public static String[] flavorSet(int n) {
String[] results = new String[n];
boolean[] picked = new boolean[flavors.length];
for(int i = 0; i < n; i++) {
int t;
do
t = rand.nextInt(flavors.length);
while(picked[t]);
results[i] = flavors[t];
picked[t] = true;
}
return results;
}
public static void main(String[] args) {
for(int i = 0; i < 20; i++) {
System.out.println(
"flavorSet(" + i + ") = ");
String[] fl = flavorSet(flavors.length);
for(int j = 0; j < fl.length; j++)
System.out.println("\t" + fl[j]);
monitor.expect(new Object[] {
"%% flavorSet\\(\\d+\\) = ",
new TestExpression("%% \\t(Chocolate|Strawberry|"
+ "Vanilla Fudge Swirl|Mint Chip|Mocha Almond "
+ "Fudge|Rum Raisin|Praline Cream|Mud Pie)", 8)
});
}
}
} ///:~
The method flavorSet( ) creates an array of String called results.
The size of this array is n, determined by the argument that you pass into the
method. Then it proceeds to choose flavors randomly from the array flavors and
place them into results, which it finally returns. Returning an array is just like
returning any other objectits a reference. Its not important that the
array was created within flavorSet( ), or that the array was created anyplace
else, for that matter. The garbage collector takes care of cleaning up the array when
youre done with it, and the array will persist for as long as you need it. .
As an aside, notice that when flavorSet( ) chooses flavors randomly, it
ensures that a particular choice hasnt already been selected. This is performed in a
do loop that keeps making random choices until it finds one not already in the picked
array. (Of course, a String comparison also could have been performed to see if the
random choice was already in the results array.) If its successful, it adds
the entry and finds the next one (i gets incremented). .
main( ) prints out 20 full sets of flavors, so you can see that flavorSet( )
chooses the flavors in a random order each time. Its easiest to see this if you
redirect the output into a file. And while youre looking at the file, remember that
you just want the ice cream, you dont need it. .
In java.util, youll find the Arrays class,
which holds a set of static methods that perform utility functions for arrays.
There are four basic methods: equals( ), to compare two arrays for equality; fill( ),
to fill an array with a value; sort( ), to sort the array; and binarySearch( ),
to find an element in a sorted array. All of these methods are overloaded for all the
primitive types and Objects. In addition, theres a single asList( )
method that takes any array and turns it into a List container, which youll
learn about later in this chapter. .
Although useful, the Arrays class stops short of being fully functional. For
example, it would be nice to be able to easily print the elements of an array without
having to code a for loop by hand every time. And as youll see, the fill( )
method only takes a single value and places it in the array, so if you wanted, for
example, to fill an array with randomly generated numbers, fill( ) is no help.
.
Thus it makes sense to supplement the Arrays class with some additional
utilities, which will be placed in the package com.bruceeckel.util for
convenience. These will print an array of any type and fill an array with values or
objects that are created by an object called a generator that you can define. .
Because code needs to be created for each primitive type as well
as Object, theres a lot of nearly duplicated code.[53] For example, a generator interface is required for
each type because the return type of next( ) must be different in each case: .
//: com:bruceeckel:util:Generator.java
package com.bruceeckel.util;
public interface Generator { Object next(); } ///:~
//: com:bruceeckel:util:BooleanGenerator.java
package com.bruceeckel.util;
public interface BooleanGenerator { boolean next(); } ///:~
//: com:bruceeckel:util:ByteGenerator.java
package com.bruceeckel.util;
public interface ByteGenerator { byte next(); } ///:~
//: com:bruceeckel:util:CharGenerator.java
package com.bruceeckel.util;
public interface CharGenerator { char next(); } ///:~
//: com:bruceeckel:util:ShortGenerator.java
package com.bruceeckel.util;
public interface ShortGenerator { short next(); } ///:~
//: com:bruceeckel:util:IntGenerator.java
package com.bruceeckel.util;
public interface IntGenerator { int next(); } ///:~
//: com:bruceeckel:util:LongGenerator.java
package com.bruceeckel.util;
public interface LongGenerator { long next(); } ///:~
//: com:bruceeckel:util:FloatGenerator.java
package com.bruceeckel.util;
public interface FloatGenerator { float next(); } ///:~
//: com:bruceeckel:util:DoubleGenerator.java
package com.bruceeckel.util;
public interface DoubleGenerator { double next(); } ///:~
Arrays2 contains a variety of toString( )
methods, overloaded for each type. These methods allow you to easily print an array. The toString( )
code introduces the use of StringBuffer instead of String objects. This is a
nod to efficiency; when youre assembling a string in a method that might be called a
lot, its wiser to use the more efficient StringBuffer rather than the more
convenient String operations. Here, the StringBuffer is created with an
initial value, and Strings are appended. Finally, the result is converted to
a String as the return value: .
//: com:bruceeckel:util:Arrays2.java
// A supplement to java.util.Arrays, to provide additional
// useful functionality when working with arrays. Allows
// any array to be converted to a String, and to be filled
// via a user-defined "generator" object.
package com.bruceeckel.util;
import java.util.*;
public class Arrays2 {
public static String toString(boolean[] a) {
StringBuffer result = new StringBuffer("[");
for(int i = 0; i < a.length; i++) {
result.append(a[i]);
if(i < a.length - 1)
result.append(", ");
}
result.append("]");
return result.toString();
}
public static String toString(byte[] a) {
StringBuffer result = new StringBuffer("[");
for(int i = 0; i < a.length; i++) {
result.append(a[i]);
if(i < a.length - 1)
result.append(", ");
}
result.append("]");
return result.toString();
}
public static String toString(char[] a) {
StringBuffer result = new StringBuffer("[");
for(int i = 0; i < a.length; i++) {
result.append(a[i]);
if(i < a.length - 1)
result.append(", ");
}
result.append("]");
return result.toString();
}
public static String toString(short[] a) {
StringBuffer result = new StringBuffer("[");
for(int i = 0; i < a.length; i++) {
result.append(a[i]);
if(i < a.length - 1)
result.append(", ");
}
result.append("]");
return result.toString();
}
public static String toString(int[] a) {
StringBuffer result = new StringBuffer("[");
for(int i = 0; i < a.length; i++) {
result.append(a[i]);
if(i < a.length - 1)
result.append(", ");
}
result.append("]");
return result.toString();
}
public static String toString(long[] a) {
StringBuffer result = new StringBuffer("[");
for(int i = 0; i < a.length; i++) {
result.append(a[i]);
if(i < a.length - 1)
result.append(", ");
}
result.append("]");
return result.toString();
}
public static String toString(float[] a) {
StringBuffer result = new StringBuffer("[");
for(int i = 0; i < a.length; i++) {
result.append(a[i]);
if(i < a.length - 1)
result.append(", ");
}
result.append("]");
return result.toString();
}
public static String toString(double[] a) {
StringBuffer result = new StringBuffer("[");
for(int i = 0; i < a.length; i++) {
result.append(a[i]);
if(i < a.length - 1)
result.append(", ");
}
result.append("]");
return result.toString();
}
// Fill an array using a generator:
public static void fill(Object[] a, Generator gen) {
fill(a, 0, a.length, gen);
}
public static void
fill(Object[] a, int from, int to, Generator gen) {
for(int i = from; i < to; i++)
a[i] = gen.next();
}
public static void
fill(boolean[] a, BooleanGenerator gen) {
fill(a, 0, a.length, gen);
}
public static void
fill(boolean[] a, int from, int to,BooleanGenerator gen){
for(int i = from; i < to; i++)
a[i] = gen.next();
}
public static void fill(byte[] a, ByteGenerator gen) {
fill(a, 0, a.length, gen);
}
public static void
fill(byte[] a, int from, int to, ByteGenerator gen) {
for(int i = from; i < to; i++)
a[i] = gen.next();
}
public static void fill(char[] a, CharGenerator gen) {
fill(a, 0, a.length, gen);
}
public static void
fill(char[] a, int from, int to, CharGenerator gen) {
for(int i = from; i < to; i++)
a[i] = gen.next();
}
public static void fill(short[] a, ShortGenerator gen) {
fill(a, 0, a.length, gen);
}
public static void
fill(short[] a, int from, int to, ShortGenerator gen) {
for(int i = from; i < to; i++)
a[i] = gen.next();
}
public static void fill(int[] a, IntGenerator gen) {
fill(a, 0, a.length, gen);
}
public static void
fill(int[] a, int from, int to, IntGenerator gen) {
for(int i = from; i < to; i++)
a[i] = gen.next();
}
public static void fill(long[] a, LongGenerator gen) {
fill(a, 0, a.length, gen);
}
public static void
fill(long[] a, int from, int to, LongGenerator gen) {
for(int i = from; i < to; i++)
a[i] = gen.next();
}
public static void fill(float[] a, FloatGenerator gen) {
fill(a, 0, a.length, gen);
}
public static void
fill(float[] a, int from, int to, FloatGenerator gen) {
for(int i = from; i < to; i++)
a[i] = gen.next();
}
public static void fill(double[] a, DoubleGenerator gen){
fill(a, 0, a.length, gen);
}
public static void
fill(double[] a, int from, int to, DoubleGenerator gen) {
for(int i = from; i < to; i++)
a[i] = gen.next();
}
private static Random r = new Random();
public static class
RandBooleanGenerator implements BooleanGenerator {
public boolean next() { return r.nextBoolean(); }
}
public static class
RandByteGenerator implements ByteGenerator {
public byte next() { return (byte)r.nextInt(); }
}
private static String ssource =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
private static char[] src = ssource.toCharArray();
public static class
RandCharGenerator implements CharGenerator {
public char next() {
return src[r.nextInt(src.length)];
}
}
public static class
RandStringGenerator implements Generator {
private int len;
private RandCharGenerator cg = new RandCharGenerator();
public RandStringGenerator(int length) {
len = length;
}
public Object next() {
char[] buf = new char[len];
for(int i = 0; i < len; i++)
buf[i] = cg.next();
return new String(buf);
}
}
public static class
RandShortGenerator implements ShortGenerator {
public short next() { return (short)r.nextInt(); }
}
public static class
RandIntGenerator implements IntGenerator {
private int mod = 10000;
public RandIntGenerator() {}
public RandIntGenerator(int modulo) { mod = modulo; }
public int next() { return r.nextInt(mod); }
}
public static class
RandLongGenerator implements LongGenerator {
public long next() { return r.nextLong(); }
}
public static class
RandFloatGenerator implements FloatGenerator {
public float next() { return r.nextFloat(); }
}
public static class
RandDoubleGenerator implements DoubleGenerator {
public double next() {return r.nextDouble();}
}
} ///:~
To fill an array of elements using a generator, the fill( ) method takes a
reference to an appropriate generator interface, which has a next( )
method that will somehow produce an object of the right type (depending on how the
interface is implemented). The fill( ) method simply calls next( )
until the desired range has been filled. Now you can create any generator by implementing
the appropriate interface and use your generator with fill( ). .
Random data generators are useful for testing, so a set of inner classes is created to
implement all the primitive generator interfaces, as well as a String generator to
represent Object. You can see that RandStringGenerator uses RandCharGenerator
to fill an array of characters, which is then turned into a String. The size of the
array is determined by the constructor argument. .
To generate numbers that arent too large, RandIntGenerator defaults to a
modulus of 10,000, but the overloaded constructor allows you to choose a smaller value. .
Heres a program to test the library and demonstrate how it is used:
//: c11:TestArrays2.java
// Test and demonstrate Arrays2 utilities.
import com.bruceeckel.util.*;
public class TestArrays2 {
public static void main(String[] args) {
int size = 6;
// Or get the size from the command line:
if(args.length != 0) {
size = Integer.parseInt(args[0]);
if(size < 3) {
System.out.println("arg must be >= 3");
System.exit(1);
}
}
boolean[] a1 = new boolean[size];
byte[] a2 = new byte[size];
char[] a3 = new char[size];
short[] a4 = new short[size];
int[] a5 = new int[size];
long[] a6 = new long[size];
float[] a7 = new float[size];
double[] a8 = new double[size];
Arrays2.fill(a1, new Arrays2.RandBooleanGenerator());
System.out.println("a1 = " + Arrays2.toString(a1));
Arrays2.fill(a2, new Arrays2.RandByteGenerator());
System.out.println("a2 = " + Arrays2.toString(a2));
Arrays2.fill(a3, new Arrays2.RandCharGenerator());
System.out.println("a3 = " + Arrays2.toString(a3));
Arrays2.fill(a4, new Arrays2.RandShortGenerator());
System.out.println("a4 = " + Arrays2.toString(a4));
Arrays2.fill(a5, new Arrays2.RandIntGenerator());
System.out.println("a5 = " + Arrays2.toString(a5));
Arrays2.fill(a6, new Arrays2.RandLongGenerator());
System.out.println("a6 = " + Arrays2.toString(a6));
Arrays2.fill(a7, new Arrays2.RandFloatGenerator());
System.out.println("a7 = " + Arrays2.toString(a7));
Arrays2.fill(a8, new Arrays2.RandDoubleGenerator());
System.out.println("a8 = " + Arrays2.toString(a8));
}
} ///:~
The size parameter has a default value, but you can also set it from the command
line. .
The Java standard library Arrays also has a fill( ) method, but that
is rather trivial; it only duplicates a single value into each location, or in the case of
objects, copies the same reference into each location. Using Arrays2.toString( ),
the Arrays.fill( ) methods can be easily demonstrated:
//: c11:FillingArrays.java
// Using Arrays.fill()
import com.bruceeckel.simpletest.*;
import com.bruceeckel.util.*;
import java.util.*;
public class FillingArrays {
private static Test monitor = new Test();
public static void main(String[] args) {
int size = 6;
// Or get the size from the command line:
if(args.length != 0)
size = Integer.parseInt(args[0]);
boolean[] a1 = new boolean[size];
byte[] a2 = new byte[size];
char[] a3 = new char[size];
short[] a4 = new short[size];
int[] a5 = new int[size];
long[] a6 = new long[size];
float[] a7 = new float[size];
double[] a8 = new double[size];
String[] a9 = new String[size];
Arrays.fill(a1, true);
System.out.println("a1 = " + Arrays2.toString(a1));
Arrays.fill(a2, (byte)11);
System.out.println("a2 = " + Arrays2.toString(a2));
Arrays.fill(a3, 'x');
System.out.println("a3 = " + Arrays2.toString(a3));
Arrays.fill(a4, (short)17);
System.out.println("a4 = " + Arrays2.toString(a4));
Arrays.fill(a5, 19);
System.out.println("a5 = " + Arrays2.toString(a5));
Arrays.fill(a6, 23);
System.out.println("a6 = " + Arrays2.toString(a6));
Arrays.fill(a7, 29);
System.out.println("a7 = " + Arrays2.toString(a7));
Arrays.fill(a8, 47);
System.out.println("a8 = " + Arrays2.toString(a8));
Arrays.fill(a9, "Hello");
System.out.println("a9 = " + Arrays.asList(a9));
// Manipulating ranges:
Arrays.fill(a9, 3, 5, "World");
System.out.println("a9 = " + Arrays.asList(a9));
monitor.expect(new String[] {
"a1 = [true, true, true, true, true, true]",
"a2 = [11, 11, 11, 11, 11, 11]",
"a3 = [x, x, x, x, x, x]",
"a4 = [17, 17, 17, 17, 17, 17]",
"a5 = [19, 19, 19, 19, 19, 19]",
"a6 = [23, 23, 23, 23, 23, 23]",
"a7 = [29.0, 29.0, 29.0, 29.0, 29.0, 29.0]",
"a8 = [47.0, 47.0, 47.0, 47.0, 47.0, 47.0]",
"a9 = [Hello, Hello, Hello, Hello, Hello, Hello]",
"a9 = [Hello, Hello, Hello, World, World, Hello]"
});
}
} ///:~
You can either fill the entire array or, as the last two statements show, a range of
elements. But since you can only provide a single value to use for filling using Arrays.fill( ),
the Arrays2.fill( ) methods produce much more interesting results. .
The Java standard library provides a static
method, System.arraycopy( ), which can make much
faster copies of an array than if you use a for loop to perform the copy by hand. System.arraycopy( )
is overloaded to handle all types. Heres an example that manipulates arrays of int:
//: c11:CopyingArrays.java
// Using System.arraycopy()
import com.bruceeckel.simpletest.*;
import com.bruceeckel.util.*;
import java.util.*;
public class CopyingArrays {
private static Test monitor = new Test();
public static void main(String[] args) {
int[] i = new int[7];
int[] j = new int[10];
Arrays.fill(i, 47);
Arrays.fill(j, 99);
System.out.println("i = " + Arrays2.toString(i));
System.out.println("j = " + Arrays2.toString(j));
System.arraycopy(i, 0, j, 0, i.length);
System.out.println("j = " + Arrays2.toString(j));
int[] k = new int[5];
Arrays.fill(k, 103);
System.arraycopy(i, 0, k, 0, k.length);
System.out.println("k = " + Arrays2.toString(k));
Arrays.fill(k, 103);
System.arraycopy(k, 0, i, 0, k.length);
System.out.println("i = " + Arrays2.toString(i));
// Objects:
Integer[] u = new Integer[10];
Integer[] v = new Integer[5];
Arrays.fill(u, new Integer(47));
Arrays.fill(v, new Integer(99));
System.out.println("u = " + Arrays.asList(u));
System.out.println("v = " + Arrays.asList(v));
System.arraycopy(v, 0, u, u.length/2, v.length);
System.out.println("u = " + Arrays.asList(u));
monitor.expect(new String[] {
"i = [47, 47, 47, 47, 47, 47, 47]",
"j = [99, 99, 99, 99, 99, 99, 99, 99, 99, 99]",
"j = [47, 47, 47, 47, 47, 47, 47, 99, 99, 99]",
"k = [47, 47, 47, 47, 47]",
"i = [103, 103, 103, 103, 103, 47, 47]",
"u = [47, 47, 47, 47, 47, 47, 47, 47, 47, 47]",
"v = [99, 99, 99, 99, 99]",
"u = [47, 47, 47, 47, 47, 99, 99, 99, 99, 99]"
});
}
} ///:~
The arguments to arraycopy( ) are the source array, the offset into the
source array from whence to start copying, the destination array, the offset into the
destination array where the copying begins, and the number of elements to copy. Naturally,
any violation of the array boundaries will cause an exception. .
The example shows that both primitive arrays and object arrays can be copied. However,
if you copy arrays of objects, then only the references get copiedtheres no
duplication of the objects themselves. This is called a shallow copy (see Appendix
A). .
Arrays provides the overloaded
method equals( ) to compare entire arrays for equality. Again, these are
overloaded for all the primitives and for Object. To be equal, the arrays must have
the same number of elements, and each element must be equivalent to each corresponding
element in the other array, using the equals( ) for each element. (For
primitives, that primitives wrapper class equals( ) is used; for
example, Integer.equals( ) for int.) For example:
//: c11:ComparingArrays.java
// Using Arrays.equals()
import com.bruceeckel.simpletest.*;
import java.util.*;
public class ComparingArrays {
private static Test monitor = new Test();
public static void main(String[] args) {
int[] a1 = new int[10];
int[] a2 = new int[10];
Arrays.fill(a1, 47);
Arrays.fill(a2, 47);
System.out.println(Arrays.equals(a1, a2));
a2[3] = 11;
System.out.println(Arrays.equals(a1, a2));
String[] s1 = new String[5];
Arrays.fill(s1, "Hi");
String[] s2 = {"Hi", "Hi", "Hi", "Hi", "Hi"};
System.out.println(Arrays.equals(s1, s2));
monitor.expect(new String[] {
"true",
"false",
"true"
});
}
} ///:~
Originally, a1 and a2 are exactly equal, so the output is
true, but then one of the elements is changed, which makes the result
false. In the last case, all the elements of s1 point to the same
object, but s2 has five unique objects. However, array equality is based on
contents (via Object.equals( )) , so the result is true. .
One of the missing features in the Java 1.0 and 1.1 libraries
was algorithmic operationseven simple sorting. This was a rather confusing situation
to someone expecting an adequate standard library. Fortunately, Java 2 remedied the
situation, at least for the sorting problem. .
A problem with writing generic sorting code is that sorting must
perform comparisons based on the actual type of the object. Of course, one approach is to
write a different sorting method for every different type, but you should be able to
recognize that this does not produce code that is easily reused for new types. .
A primary goal of programming design is to separate things that change from
things that stay the same, and here, the code that stays the same is the general
sort algorithm, but the thing that changes from one use to the next is the way objects are
compared. So instead of placing the comparison code into many different sort routines, the
technique of the callback is used. With a callback, the part
of the code that varies from case to case is separated, and the part of the code
thats always the same will call back to the code that changes. .
Java has two ways to provide comparison functionality. The first is with the
natural comparison method that is imparted to a class by implementing the java.lang.Comparable
interface. This is a very simple interface with a single method, compareTo( ).
This method takes another Object as an argument and produces a negative value if
the current object is less than the argument, zero if the argument is equal, and a
positive value if the current object is greater than the argument . .
Heres a class that implements Comparable and
demonstrates the comparability by using the Java standard library method Arrays.sort( ):
//: c11:CompType.java
// Implementing Comparable in a class.
import com.bruceeckel.util.*;
import java.util.*;
public class CompType implements Comparable {
int i;
int j;
public CompType(int n1, int n2) {
i = n1;
j = n2;
}
public String toString() {
return "[i = " + i + ", j = " + j + "]";
}
public int compareTo(Object rv) {
int rvi = ((CompType)rv).i;
return (i < rvi ? -1 : (i == rvi ? 0 : 1));
}
private static Random r = new Random();
public static Generator generator() {
return new Generator() {
public Object next() {
return new CompType(r.nextInt(100),r.nextInt(100));
}
};
}
public static void main(String[] args) {
CompType[] a = new CompType[10];
Arrays2.fill(a, generator());
System.out.println(
"before sorting, a = " + Arrays.asList(a));
Arrays.sort(a);
System.out.println(
"after sorting, a = " + Arrays.asList(a));
}
} ///:~
When you define the comparison method, you are responsible for deciding what it means
to compare one of your objects to another. Here, only the i values are used in the
comparison, and the j values are ignored. .
The static randInt( ) method produces positive values between zero and 100,
and the generator( ) method produces an object that implements the Generator
interface by creating an anonymous inner class (see Chapter 8). This builds CompType
objects by initializing them with random values. In main( ), the generator is
used to fill an array of CompType, which is then sorted. If Comparable
hadnt been implemented, then youd get a ClassCastException at run time
when you tried to call sort( ). This is because sort( ) casts its
argument to Comparable. .
Now suppose someone hands you a class that doesnt implement Comparable, or
hands you this class that does implement Comparable, but you decide you
dont like the way it works and would rather have a different comparison method for
the type. The solution is in contrast to hard-wiring the comparison code into each
different object. Instead, the strategy design pattern[54] is used. With a strategy, the part of the code
that varies is encapsulated inside its own class (the strategy object). You hand a
strategy object to the code thats always the same, which uses the strategy to
fulfill its algorithm. That way, you can make different objects to express different ways
of comparison and feed them to the same sorting code. Here, you create a strategy by
defining a separate class that implements an interface called Comparator.
This has two methods, compare( ) and equals( ). However, you
dont have to implement equals( ) except for special performance needs,
because anytime you create a class, it is implicitly inherited from Object, which
has an equals( ). So you can just use the default Object equals( )
and satisfy the contract imposed by the interface. .
The Collections class (which well look at more later) contains a single Comparator
that reverses the natural sorting order. This can be applied easily to the CompType:
.
//: c11:Reverse.java
// The Collecions.reverseOrder() Comparator
import com.bruceeckel.util.*;
import java.util.*;
public class Reverse {
public static void main(String[] args) {
CompType[] a = new CompType[10];
Arrays2.fill(a, CompType.generator());
System.out.println(
"before sorting, a = " + Arrays.asList(a));
Arrays.sort(a, Collections.reverseOrder());
System.out.println(
"after sorting, a = " + Arrays.asList(a));
}
} ///:~
The call to Collections.reverseOrder( ) produces the
reference to the Comparator. .
As a second example, the following Comparator compares CompType objects
based on their j values rather than their i values:
//: c11:ComparatorTest.java
// Implementing a Comparator for a class.
import com.bruceeckel.util.*;
import java.util.*;
class CompTypeComparator implements Comparator {
public int compare(Object o1, Object o2) {
int j1 = ((CompType)o1).j;
int j2 = ((CompType)o2).j;
return (j1 < j2 ? -1 : (j1 == j2 ? 0 : 1));
}
}
public class ComparatorTest {
public static void main(String[] args) {
CompType[] a = new CompType[10];
Arrays2.fill(a, CompType.generator());
System.out.println(
"before sorting, a = " + Arrays.asList(a));
Arrays.sort(a, new CompTypeComparator());
System.out.println(
"after sorting, a = " + Arrays.asList(a));
}
} ///:~
The compare( ) method must return a negative integer, zero, or positive
integer if the first argument is less than, equal to, or greater than the second,
respectively. .
With the built-in sorting methods, you can sort any array of primitives, or any array
of objects that either implements Comparable or has an associated Comparator.
This fills a big hole in the Java libraries; believe it or not, there was no support in
Java 1.0 or 1.1 for sorting Strings! Heres an example that generates random String
objects and sorts them:
//: c11:StringSorting.java
// Sorting an array of Strings.
import com.bruceeckel.util.*;
import java.util.*;
public class StringSorting {
public static void main(String[] args) {
String[] sa = new String[30];
Arrays2.fill(sa, new Arrays2.RandStringGenerator(5));
System.out.println(
"Before sorting: " + Arrays.asList(sa));
Arrays.sort(sa);
System.out.println(
"After sorting: " + Arrays.asList(sa));
}
} ///:~
One thing youll notice about the output in the String sorting algorithm is
that its lexicographic,
so it puts all the words starting with uppercase letters first, followed by all the words
starting with lowercase letters. (Telephone books are typically sorted this way.) You may
also want to group the words together regardless of case, and you can do this by defining
a Comparator class, thereby overriding the default String Comparable behavior.
For reuse, this will be added to the util package: .
//: com:bruceeckel:util:AlphabeticComparator.java
// Keeping upper and lowercase letters together.
package com.bruceeckel.util;
import java.util.*;
public class AlphabeticComparator implements Comparator {
public int compare(Object o1, Object o2) {
String s1 = (String)o1;
String s2 = (String)o2;
return s1.toLowerCase().compareTo(s2.toLowerCase());
}
} ///:~
By casting to String at the beginning, youll get an exception if you
attempt to use this with the wrong type. Each String is converted to lowercase
before the comparison. Strings built-in compareTo( ) method
provides the desired functionality. .
Heres a test using AlphabeticComparator:
//: c11:AlphabeticSorting.java
// Keeping upper and lowercase letters together.
import com.bruceeckel.util.*;
import java.util.*;
public class AlphabeticSorting {
public static void main(String[] args) {
String[] sa = new String[30];
Arrays2.fill(sa, new Arrays2.RandStringGenerator(5));
System.out.println(
"Before sorting: " + Arrays.asList(sa));
Arrays.sort(sa, new AlphabeticComparator());
System.out.println(
"After sorting: " + Arrays.asList(sa));
}
} ///:~
The sorting algorithm thats used in the Java standard library is designed to be
optimal for the particular type youre sortinga Quicksort for primitives, and a
stable merge sort for objects. So you shouldnt need to spend any time worrying about
performance unless your profiler points you to the sorting process as a bottleneck. .
Once an array is sorted, you can perform a fast search for a
particular item by using Arrays.binarySearch( ).
However, its very important that you do not try to use binarySearch( )
on an unsorted array; the results will be unpredictable. The following example uses a RandIntGenerator
to fill an array, and then uses the same generator to produce values to search for: .
//: c11:ArraySearching.java
// Using Arrays.binarySearch().
import com.bruceeckel.util.*;
import java.util.*;
public class ArraySearching {
public static void main(String[] args) {
int[] a = new int[100];
Arrays2.RandIntGenerator gen =
new Arrays2.RandIntGenerator(1000);
Arrays2.fill(a, gen);
Arrays.sort(a);
System.out.println(
"Sorted array: " + Arrays2.toString(a));
while(true) {
int r = gen.next();
int location = Arrays.binarySearch(a, r);
if(location >= 0) {
System.out.println("Location of " + r +
" is " + location + ", a[" +
location + "] = " + a[location]);
break; // Out of while loop
}
}
}
} ///:~
In the while loop, random values are generated as search items until one of them
is found. .
Arrays.binarySearch( ) produces a value greater than or equal to zero if
the search item is found. Otherwise, it produces a negative value representing the place
that the element should be inserted if you are maintaining the sorted array by hand. The
value produced is
-(insertion point) - 1
The insertion point is the index of the first element greater than the key, or a.size( ),
if all elements in the array are less than the specified key. .
If the array contains duplicate elements, there is no guarantee which one will be
found. The algorithm is thus not really designed to support duplicate elements, but rather
to tolerate them. If you need a sorted list of nonduplicated elements, use a TreeSet
(to maintain sorted order) or LinkedHashSet (to maintain insertion order), which
will be introduced later in this chapter. These classes take care of all the details for
you automatically. Only in cases of performance bottlenecks should you replace one of
these classes with a hand-maintained array. .
If you have sorted an object array using a Comparator (primitive arrays do not
allow sorting with a Comparator), you must include that same Comparator when
you perform a binarySearch( ) (using the overloaded version of the method
thats provided). For example, the AlphabeticSorting.java program can be
modified to perform a search:
//: c11:AlphabeticSearch.java
// Searching with a Comparator.
import com.bruceeckel.simpletest.*;
import com.bruceeckel.util.*;
import java.util.*;
public class AlphabeticSearch {
private static Test monitor = new Test();
public static void main(String[] args) {
String[] sa = new String[30];
Arrays2.fill(sa, new Arrays2.RandStringGenerator(5));
AlphabeticComparator comp = new AlphabeticComparator();
Arrays.sort(sa, comp);
int index = Arrays.binarySearch(sa, sa[10], comp);
System.out.println("Index = " + index);
monitor.expect(new String[] {
"Index = 10"
});
}
} ///:~
The Comparator must be passed to the overloaded binarySearch( ) as
the third argument. In this example, success is guaranteed because the search item is
selected from the array itself. .
To summarize what youve seen so far, your first and most efficient choice to hold
a group of objects should be an array, and youre forced into this choice if you want
to hold a group of primitives. In the remainder of this chapter well look at the
more general case, when you dont know at the time youre writing the program
how many objects youre going to need, or if you need a more sophisticated way to
store your objects. Java provides a library of container
classes to solve this problem, the basic types of which are List,
Set, and Map. You can
solve a surprising number of problems by using these tools. .
Among their other characteristicsSet, for example, holds only one object
of each value, and Map is an associative array that
lets you associate any object with any other objectthe Java container classes will
automatically resize themselves. So, unlike arrays, you can put in any number of objects
and you dont need to worry about how big to make the container while youre
writing the program. .
To me, container classes are one of the most powerful tools for raw development because
they significantly increase your programming muscle. The Java 2 containers represent a
thorough redesign[55]
of the rather poor showings in Java 1.0 and 1.1. Some of the redesign makes things tighter
and more sensible. It also fills out the functionality of the containers library,
providing the behavior of linked lists, queues, and deques (double-ended queues,
pronounced decks). .
The design of a
containers library is difficult (true of most library design problems). In C++, the
container classes covered the bases with many different classes. This was better than what
was available prior to the C++ container classes (nothing), but it didnt translate
well into Java. At the other extreme, Ive seen a containers library that consists of
a single class, container, which acts like both a linear sequence and an
associative array at the same time. The Java 2 container library strikes a balance: the
full functionality that you expect from a mature container library, but easier to learn
and use than the C++ container classes and other similar container libraries. The result
can seem a bit odd in places. Unlike some of the decisions made in the early Java
libraries, these oddities were not accidents, but carefully considered decisions based on
trade-offs in complexity. It might take you a little while to get comfortable with some
aspects of the library, but I think youll find yourself rapidly acquiring and using
these new tools. .
The Java 2 container library takes the
issue of holding your objects and divides it into two distinct concepts:
We will first look at the general features of containers, then go into details, and
finally learn why there are different versions of some containers and how to choose
between them. .
Unlike arrays, the containers print nicely without any help. Heres an example
that also introduces you to the basic types of containers:
//: c11:PrintingContainers.java
// Containers print themselves automatically.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class PrintingContainers {
private static Test monitor = new Test();
static Collection fill(Collection c) {
c.add("dog");
c.add("dog");
c.add("cat");
return c;
}
static Map fill(Map m) {
m.put("dog", "Bosco");
m.put("dog", "Spot");
m.put("cat", "Rags");
return m;
}
public static void main(String[] args) {
System.out.println(fill(new ArrayList()));
System.out.println(fill(new HashSet()));
System.out.println(fill(new HashMap()));
monitor.expect(new String[] {
"[dog, dog, cat]",
"[dog, cat]",
"{dog=Spot, cat=Rags}"
});
}
} ///:~
As mentioned before, there are two basic categories in the Java container library. The
distinction is based on the number of items that are held in each location of the
container. The Collection category only holds one item in each location (the name
is a bit misleading, because entire container libraries are often called
collections). It includes the List, which holds a group of items in a
specified sequence, and the Set, which only allows the addition of one item of each
type. The ArrayList is a type of List, and HashSet is a type of Set.
To add items to any Collection, theres an add( ) method. .
The Map holds key-value pairs, rather like a mini database. The preceding
example uses one flavor of Map, the HashMap. If you have a Map
that associates states with their capitals and you want to know the capital of Ohio, you
look it upalmost as if you were indexing into an array. (Maps are also called associative arrays.) To add elements
to a Map, theres a put( ) method that takes a key and a value as
arguments. The example only shows adding elements and does not look up the elements after
theyre added. That will be shown later. .
The overloaded fill( ) methods fill Collections and Maps,
respectively. If you look at the output, you can see that the default printing behavior
(provided via the containers various toString( ) methods) produces quite
readable results, so no additional printing support is necessary as it was with arrays. A Collection
is printed surrounded by square brackets, with each element separated by a comma. A Map
is surrounded by curly braces, with each key and value associated with an equal sign (keys
on the left, values on the right). .
You can also immediately see the basic behavior of the different containers. The List
holds the objects exactly as they are entered, without any reordering or editing. The Set,
however, only accepts one of each object, and it uses its own internal ordering method (in
general, you are only concerned with whether or not something is a member of the Set,
not the order in which it appearsfor that youd use a List). And the Map
also only accepts one of each type of item, based on the key, and it also has its own
internal ordering and does not care about the order in which you enter the items. If
maintaining the insertion sequence is important, you can use a LinkedHashSet or LinkedHashMap. .
Although the problem of printing the containers is taken care of, filling containers
suffers from the same deficiency as java.util.Arrays. Just like Arrays,
there is a companion class called Collections containing static utility
methods, including one called fill( ). This fill( )
also just duplicates a single object reference throughout the container, and also only
works for List objects and not Sets or Maps:
//: c11:FillingLists.java
// The Collections.fill() method.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class FillingLists {
private static Test monitor = new Test();
public static void main(String[] args) {
List list = new ArrayList();
for(int i = 0; i < 10; i++)
list.add("");
Collections.fill(list, "Hello");
System.out.println(list);
monitor.expect(new String[] {
"[Hello, Hello, Hello, Hello, Hello, " +
"Hello, Hello, Hello, Hello, Hello]"
});
}
} ///:~
This method is made even less useful by the fact that it can only replace elements that
are already in the List and will not add new elements. .
To be able to create interesting examples, here is a complementary Collections2
library (part of com.bruceeckel.util for convenience) with a fill( )
method that uses a generator to add elements and allows you to specify the number of
elements you want to add( ). The Generator
interface defined previously will work for Collections, but the Map
requires its own generator interface since a pair of objects (one key and one
value) must be produced by each call to next( ). Here is the Pair
class:
//: com:bruceeckel:util:Pair.java
package com.bruceeckel.util;
public class Pair {
public Object key, value;
public Pair(Object k, Object v) {
key = k;
value = v;
}
} ///:~
Next, the generator interface that produces the Pair:
//: com:bruceeckel:util:MapGenerator.java
package com.bruceeckel.util;
public interface MapGenerator { Pair next(); } ///:~
With these, a set of utilities for working with the container classes can be developed:
//: com:bruceeckel:util:Collections2.java
// To fill any type of container using a generator object.
package com.bruceeckel.util;
import java.util.*;
public class Collections2 {
// Fill an array using a generator:
public static void
fill(Collection c, Generator gen, int count) {
for(int i = 0; i < count; i++)
c.add(gen.next());
}
public static void
fill(Map m, MapGenerator gen, int count) {
for(int i = 0; i < count; i++) {
Pair p = gen.next();
m.put(p.key, p.value);
}
}
public static class
RandStringPairGenerator implements MapGenerator {
private Arrays2.RandStringGenerator gen;
public RandStringPairGenerator(int len) {
gen = new Arrays2.RandStringGenerator(len);
}
public Pair next() {
return new Pair(gen.next(), gen.next());
}
}
// Default object so you don't have to create your own:
public static RandStringPairGenerator rsp =
new RandStringPairGenerator(10);
public static class
StringPairGenerator implements MapGenerator {
private int index = -1;
private String[][] d;
public StringPairGenerator(String[][] data) {
d = data;
}
public Pair next() {
// Force the index to wrap:
index = (index + 1) % d.length;
return new Pair(d[index][0], d[index][1]);
}
public StringPairGenerator reset() {
index = -1;
return this;
}
}
// Use a predefined dataset:
public static StringPairGenerator geography =
new StringPairGenerator(CountryCapitals.pairs);
// Produce a sequence from a 2D array:
public static class StringGenerator implements Generator{
private String[][] d;
private int position;
private int index = -1;
public StringGenerator(String[][] data, int pos) {
d = data;
position = pos;
}
public Object next() {
// Force the index to wrap:
index = (index + 1) % d.length;
return d[index][position];
}
public StringGenerator reset() {
index = -1;
return this;
}
}
// Use a predefined dataset:
public static StringGenerator countries =
new StringGenerator(CountryCapitals.pairs, 0);
public static StringGenerator capitals =
new StringGenerator(CountryCapitals.pairs, 1);
} ///:~
Both versions of fill( ) take an argument that determines the number of
items to add to the container. In addition, there are two generators for the map: RandStringPairGenerator,
which creates any number of pairs of gibberish Strings with length determined by
the constructor argument; and StringPairGenerator, which produces pairs of Strings
given a two-dimensional array of String. The StringGenerator also takes a
two-dimensional array of String but generates single items rather than Pairs.
The static rsp, geography, countries, and capitals objects
provide prebuilt generators, the last three using all the countries of the world
and their capitals. Note that if you try to create more pairs than are available, the
generators will loop around to the beginning, and if you are putting the pairs into a Map,
the duplicates will just be ignored. .
Here is the predefined dataset, which consists of country names and their capitals:
//: com:bruceeckel:util:CountryCapitals.java
package com.bruceeckel.util;
public class CountryCapitals {
public static final String[][] pairs = {
// Africa
{"ALGERIA","Algiers"}, {"ANGOLA","Luanda"},
{"BENIN","Porto-Novo"}, {"BOTSWANA","Gaberone"},
{"BURKINA FASO","Ouagadougou"},
{"BURUNDI","Bujumbura"},
{"CAMEROON","Yaounde"}, {"CAPE VERDE","Praia"},
{"CENTRAL AFRICAN REPUBLIC","Bangui"},
{"CHAD","N'djamena"}, {"COMOROS","Moroni"},
{"CONGO","Brazzaville"}, {"DJIBOUTI","Dijibouti"},
{"EGYPT","Cairo"}, {"EQUATORIAL GUINEA","Malabo"},
{"ERITREA","Asmara"}, {"ETHIOPIA","Addis Ababa"},
{"GABON","Libreville"}, {"THE GAMBIA","Banjul"},
{"GHANA","Accra"}, {"GUINEA","Conakry"},
{"GUINEA","-"}, {"BISSAU","Bissau"},
{"COTE D'IVOIR (IVORY COAST)","Yamoussoukro"},
{"KENYA","Nairobi"}, {"LESOTHO","Maseru"},
{"LIBERIA","Monrovia"}, {"LIBYA","Tripoli"},
{"MADAGASCAR","Antananarivo"}, {"MALAWI","Lilongwe"},
{"MALI","Bamako"}, {"MAURITANIA","Nouakchott"},
{"MAURITIUS","Port Louis"}, {"MOROCCO","Rabat"},
{"MOZAMBIQUE","Maputo"}, {"NAMIBIA","Windhoek"},
{"NIGER","Niamey"}, {"NIGERIA","Abuja"},
{"RWANDA","Kigali"},
{"SAO TOME E PRINCIPE","Sao Tome"},
{"SENEGAL","Dakar"}, {"SEYCHELLES","Victoria"},
{"SIERRA LEONE","Freetown"}, {"SOMALIA","Mogadishu"},
{"SOUTH AFRICA","Pretoria/Cape Town"},
{"SUDAN","Khartoum"},
{"SWAZILAND","Mbabane"}, {"TANZANIA","Dodoma"},
{"TOGO","Lome"}, {"TUNISIA","Tunis"},
{"UGANDA","Kampala"},
{"DEMOCRATIC REPUBLIC OF THE CONGO (ZAIRE)",
"Kinshasa"},
{"ZAMBIA","Lusaka"}, {"ZIMBABWE","Harare"},
// Asia
{"AFGHANISTAN","Kabul"}, {"BAHRAIN","Manama"},
{"BANGLADESH","Dhaka"}, {"BHUTAN","Thimphu"},
{"BRUNEI","Bandar Seri Begawan"},
{"CAMBODIA","Phnom Penh"},
{"CHINA","Beijing"}, {"CYPRUS","Nicosia"},
{"INDIA","New Delhi"}, {"INDONESIA","Jakarta"},
{"IRAN","Tehran"}, {"IRAQ","Baghdad"},
{"ISRAEL","Tel Aviv"}, {"JAPAN","Tokyo"},
{"JORDAN","Amman"}, {"KUWAIT","Kuwait City"},
{"LAOS","Vientiane"}, {"LEBANON","Beirut"},
{"MALAYSIA","Kuala Lumpur"}, {"THE MALDIVES","Male"},
{"MONGOLIA","Ulan Bator"},
{"MYANMAR (BURMA)","Rangoon"},
{"NEPAL","Katmandu"}, {"NORTH KOREA","P'yongyang"},
{"OMAN","Muscat"}, {"PAKISTAN","Islamabad"},
{"PHILIPPINES","Manila"}, {"QATAR","Doha"},
{"SAUDI ARABIA","Riyadh"}, {"SINGAPORE","Singapore"},
{"SOUTH KOREA","Seoul"}, {"SRI LANKA","Colombo"},
{"SYRIA","Damascus"},
{"TAIWAN (REPUBLIC OF CHINA)","Taipei"},
{"THAILAND","Bangkok"}, {"TURKEY","Ankara"},
{"UNITED ARAB EMIRATES","Abu Dhabi"},
{"VIETNAM","Hanoi"}, {"YEMEN","Sana'a"},
// Australia and Oceania
{"AUSTRALIA","Canberra"}, {"FIJI","Suva"},
{"KIRIBATI","Bairiki"},
{"MARSHALL ISLANDS","Dalap-Uliga-Darrit"},
{"MICRONESIA","Palikir"}, {"NAURU","Yaren"},
{"NEW ZEALAND","Wellington"}, {"PALAU","Koror"},
{"PAPUA NEW GUINEA","Port Moresby"},
{"SOLOMON ISLANDS","Honaira"}, {"TONGA","Nuku'alofa"},
{"TUVALU","Fongafale"}, {"VANUATU","< Port-Vila"},
{"WESTERN SAMOA","Apia"},
// Eastern Europe and former USSR
{"ARMENIA","Yerevan"}, {"AZERBAIJAN","Baku"},
{"BELARUS (BYELORUSSIA)","Minsk"},
{"GEORGIA","Tbilisi"},
{"KAZAKSTAN","Almaty"}, {"KYRGYZSTAN","Alma-Ata"},
{"MOLDOVA","Chisinau"}, {"RUSSIA","Moscow"},
{"TAJIKISTAN","Dushanbe"}, {"TURKMENISTAN","Ashkabad"},
{"UKRAINE","Kyiv"}, {"UZBEKISTAN","Tashkent"},
// Europe
{"ALBANIA","Tirana"}, {"ANDORRA","Andorra la Vella"},
{"AUSTRIA","Vienna"}, {"BELGIUM","Brussels"},
{"BOSNIA","-"}, {"HERZEGOVINA","Sarajevo"},
{"CROATIA","Zagreb"}, {"CZECH REPUBLIC","Prague"},
{"DENMARK","Copenhagen"}, {"ESTONIA","Tallinn"},
{"FINLAND","Helsinki"}, {"FRANCE","Paris"},
{"GERMANY","Berlin"}, {"GREECE","Athens"},
{"HUNGARY","Budapest"}, {"ICELAND","Reykjavik"},
{"IRELAND","Dublin"}, {"ITALY","Rome"},
{"LATVIA","Riga"}, {"LIECHTENSTEIN","Vaduz"},
{"LITHUANIA","Vilnius"}, {"LUXEMBOURG","Luxembourg"},
{"MACEDONIA","Skopje"}, {"MALTA","Valletta"},
{"MONACO","Monaco"}, {"MONTENEGRO","Podgorica"},
{"THE NETHERLANDS","Amsterdam"}, {"NORWAY","Oslo"},
{"POLAND","Warsaw"}, {"PORTUGAL","Lisbon"},
{"ROMANIA","Bucharest"}, {"SAN MARINO","San Marino"},
{"SERBIA","Belgrade"}, {"SLOVAKIA","Bratislava"},
{"SLOVENIA","Ljujiana"}, {"SPAIN","Madrid"},
{"SWEDEN","Stockholm"}, {"SWITZERLAND","Berne"},
{"UNITED KINGDOM","London"}, {"VATICAN CITY","---"},
// North and Central America
{"ANTIGUA AND BARBUDA","Saint John's"},
{"BAHAMAS","Nassau"},
{"BARBADOS","Bridgetown"}, {"BELIZE","Belmopan"},
{"CANADA","Ottawa"}, {"COSTA RICA","San Jose"},
{"CUBA","Havana"}, {"DOMINICA","Roseau"},
{"DOMINICAN REPUBLIC","Santo Domingo"},
{"EL SALVADOR","San Salvador"},
{"GRENADA","Saint George's"},
{"GUATEMALA","Guatemala City"},
{"HAITI","Port-au-Prince"},
{"HONDURAS","Tegucigalpa"}, {"JAMAICA","Kingston"},
{"MEXICO","Mexico City"}, {"NICARAGUA","Managua"},
{"PANAMA","Panama City"}, {"ST. KITTS","-"},
{"NEVIS","Basseterre"}, {"ST. LUCIA","Castries"},
{"ST. VINCENT AND THE GRENADINES","Kingstown"},
{"UNITED STATES OF AMERICA","Washington, D.C."},
// South America
{"ARGENTINA","Buenos Aires"},
{"BOLIVIA","Sucre (legal)/La Paz(administrative)"},
{"BRAZIL","Brasilia"}, {"CHILE","Santiago"},
{"COLOMBIA","Bogota"}, {"ECUADOR","Quito"},
{"GUYANA","Georgetown"}, {"PARAGUAY","Asuncion"},
{"PERU","Lima"}, {"SURINAME","Paramaribo"},
{"TRINIDAD AND TOBAGO","Port of Spain"},
{"URUGUAY","Montevideo"}, {"VENEZUELA","Caracas"},
};
} ///:~
This is simply a two-dimensional array of String data.[56] Heres a simple test using the fill( )
methods and generators:
//: c11:FillTest.java
import com.bruceeckel.util.*;
import java.util.*;
public class FillTest {
private static Generator sg =
new Arrays2.RandStringGenerator(7);
public static void main(String[] args) {
List list = new ArrayList();
Collections2.fill(list, sg, 25);
System.out.println(list + "\n");
List list2 = new ArrayList();
Collections2.fill(list2, Collections2.capitals, 25);
System.out.println(list2 + "\n");
Set set = new HashSet();
Collections2.fill(set, sg, 25);
System.out.println(set + "\n");
Map m = new HashMap();
Collections2.fill(m, Collections2.rsp, 25);
System.out.println(m + "\n");
Map m2 = new HashMap();
Collections2.fill(m2, Collections2.geography, 25);
System.out.println(m2);
}
} ///:~
With these tools you can easily test the various containers by filling them with
interesting data. .
The disadvantage to using the Java containers is that you lose type
information when you put an object into a container. This happens because the programmer
of that container class had no idea what specific type you wanted to put in the container,
and making the container hold only your type would prevent it from being a general-purpose
tool. So instead, the container holds references to Object, which is the root of
all the classes, so it holds any type. (Of course, this doesnt include primitive
types, since they arent real objects, and thus, are not inherited from anything.)
This is a great solution, except: .
On the up side, Java wont let you misuse the objects that you put into a
container. If you throw a dog into a container of cats and then try to treat everything in
the container as a cat, youll get a RuntimeException when you pull the dog
reference out of the cat container and try to cast it to a cat. .
Heres an example using the basic workhorse container, ArrayList. For
starters, you can think of ArrayList as an array that automatically expands
itself. Using an ArrayList is straightforward: create one, put objects in
using add( ), and later get
them out with get( ) using an
indexjust like you would with an array, but without the square brackets.[57] ArrayList also has a method size( ) to let you know how many
elements have been added so you dont inadvertently run off the end and cause an
exception. .
First, Cat and Dog classes are created:
//: c11:Cat.java
package c11;
public class Cat {
private int catNumber;
public Cat(int i) { catNumber = i; }
public void id() {
System.out.println("Cat #" + catNumber);
}
} ///:~
//: c11:Dog.java
package c11;
public class Dog {
private int dogNumber;
public Dog(int i) { dogNumber = i; }
public void id() {
System.out.println("Dog #" + dogNumber);
}
} ///:~
Cats and Dogs are placed into the container, then pulled out:
//: c11:CatsAndDogs.java
// Simple container example.
// {ThrowsException}
package c11;
import java.util.*;
public class CatsAndDogs {
public static void main(String[] args) {
List cats = new ArrayList();
for(int i = 0; i < 7; i++)
cats.add(new Cat(i));
// Not a problem to add a dog to cats:
cats.add(new Dog(7));
for(int i = 0; i < cats.size(); i++)
((Cat)cats.get(i)).id();
// Dog is detected only at run time
}
} ///:~
The classes Cat and Dog are distinct; they have nothing in common except
that they are Objects. (If you dont explicitly say what class youre
inheriting from, you automatically inherit from Object.) Since ArrayList
holds Objects, you can not only put Cat objects into this container using
the ArrayList method add( ), but you can also add Dog objects
without complaint at either compile time or run time. When you go to fetch out what you
think are Cat objects using the ArrayList method get( ), you get
back a reference to an object that you must cast to a Cat. Then you need to
surround the entire expression with parentheses to force the evaluation of the cast before
calling the id( ) method for Cat; otherwise, youll get a syntax
error. Then, at run time, when you try to cast the Dog object to a Cat,
youll get an exception. .
This is more than just an annoyance. Its something that can create
difficult-to-find bugs. If one part (or several parts) of a program inserts objects into a
container, and you discover only in a separate part of the program through an exception
that a bad object was placed in the container, then you must find out where the bad insert
occurred. Most of the time this isnt a problem, but you should be aware of the
possibility. .
It turns out that in some cases things seem to work correctly without casting back to
your original type. One case is quite special: The String class has some extra help
from the compiler to make it work smoothly. Whenever the compiler expects a String
object and it hasnt got one, it will automatically call the toString( ) method thats defined in Object
and can be overridden by any Java class. This method produces the desired String
object, which is then used wherever it is wanted. .
Thus, all you need to do to make objects of your class print is to override the toString( )
method, as shown in the following example:
//: c11:Mouse.java
// Overriding toString().
public class Mouse {
private int mouseNumber;
public Mouse(int i) { mouseNumber = i; }
// Override Object.toString():
public String toString() {
return "This is Mouse #" + mouseNumber;
}
public int getNumber() { return mouseNumber; }
} ///:~
//: c11:MouseTrap.java
public class MouseTrap {
static void caughtYa(Object m) {
Mouse mouse = (Mouse)m; // Cast from Object
System.out.println("Mouse: " + mouse.getNumber());
}
} ///:~
//: c11:WorksAnyway.java
// In special cases, things just seem to work correctly.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class WorksAnyway {
private static Test monitor = new Test();
public static void main(String[] args) {
List mice = new ArrayList();
for(int i = 0; i < 3; i++)
mice.add(new Mouse(i));
for(int i = 0; i < mice.size(); i++) {
// No cast necessary, automatic
// call to Object.toString():
System.out.println("Free mouse: " + mice.get(i));
MouseTrap.caughtYa(mice.get(i));
}
monitor.expect(new String[] {
"Free mouse: This is Mouse #0",
"Mouse: 0",
"Free mouse: This is Mouse #1",
"Mouse: 1",
"Free mouse: This is Mouse #2",
"Mouse: 2"
});
}
} ///:~
You can see toString( ) overridden in Mouse. In the second for
loop in main( ) you find the statement:
System.out.println("Free mouse: " + mice.get(i));
After the + sign the compiler expects to see a String object. get( ) produces an Object,
so to get the desired String, the compiler implicitly calls toString( ).
Unfortunately, you can work this kind of magic only with String; it isnt
available for any other type. .
A second approach to hiding the cast has been placed inside MouseTrap. The caughtYa( )
method accepts not a Mouse, but an Object, which it then casts to a Mouse.
This is quite presumptuous, of course, since by accepting an Object, anything could
be passed to the method. However, if the cast is incorrectif you passed the wrong
typeyoull get an exception at run time. This is not as good as compile-time
checking, but its still robust. Note that in the use of this method:
MouseTrap.caughtYa(mice.get(i));
no cast is necessary. .
You might not want to give up on this
issue just yet. A more ironclad solution is to create a new class using the ArrayList,
such that it will accept only your type and produce only your type:
//: c11:MouseList.java
// A type-conscious List.
import java.util.*;
public class MouseList {
private List list = new ArrayList();
public void add(Mouse m) { list.add(m); }
public Mouse get(int index) {
return (Mouse)list.get(index);
}
public int size() { return list.size(); }
} ///:~
Heres a test for the new container:
//: c11:MouseListTest.java
import com.bruceeckel.simpletest.*;
public class MouseListTest {
private static Test monitor = new Test();
public static void main(String[] args) {
MouseList mice = new MouseList();
for(int i = 0; i < 3; i++)
mice.add(new Mouse(i));
for(int i = 0; i < mice.size(); i++)
MouseTrap.caughtYa(mice.get(i));
monitor.expect(new String[] {
"Mouse: 0",
"Mouse: 1",
"Mouse: 2"
});
}
} ///:~
This is similar to the previous example, except that the new MouseList class has
a private member of type ArrayList and methods just like ArrayList.
However, it doesnt accept and produce generic Objects, only Mouse
objects. .
Note that if MouseList had instead been inherited from ArrayList,
the add(Mouse) method would simply overload the existing add(Object), and
there would still be no restriction on what type of objects could be added, and you
wouldnt get the desired results. Using composition, the MouseList simply uses
the ArrayList, performing some activities before passing the responsibility for the
rest of the operation on to the ArrayList. .
Because a MouseList will accept only a Mouse, if you say:
mice.add(new Pigeon());
you will get an error message at compile time. This approach, while more tedious
from a coding standpoint, will tell you immediately if youre using a type
improperly. .
Note that no cast is necessary when using get( ); its always a Mouse.
.
This kind of problem isnt isolated. There are numerous cases in which you need to
create new types based on other types, and in which it is useful to have specific type
information at compile time. This is the concept of a parameterized type. In C++, this is directly supported by the
language using templates. It is
likely that Java JDK 1.5 will provide generics, the Java version of parameterized
types. .
In any container class, you must have a way to put things in and a way to get things
out. After all, thats the primary job of a containerto hold things. In the ArrayList, add( ) is the way that you insert
objects, and get( ) is one way to get things out. ArrayList is
quite flexible; you can select anything at any time, and select multiple elements at once
using different indexes. .
If you want to start thinking at a
higher level, theres a drawback: You need to know the exact type of the container in
order to use it. This might not seem bad at first, but what if you start out using an ArrayList,
and later on you discover that because of the features you need in the container you
actually need to use a Set instead? Or suppose youd like to write a piece of
generic code that doesnt know or care what type of container its working with,
so that it could be used on different types of containers without rewriting that code? .
The concept of an iterator (yet another design pattern)
can be used to achieve this abstraction. An iterator is an object whose job is to move
through a sequence of objects and select each object in that sequence without the client
programmer knowing or caring about the underlying structure of that sequence. In addition,
an iterator is usually whats called a light-weight object: one
thats cheap to create. For that reason, youll often find seemingly strange
constraints for iterators; for example, some iterators can move in only one direction. .
The Java Iterator is an example of an iterator with these kinds of constraints.
Theres not much you can do with one except: .
Thats all. Its a simple implementation of an iterator, but still powerful
(and theres a more sophisticated ListIterator for Lists). To see how
it works, lets revisit the CatsAndDogs.java program from earlier in this
chapter. In the original version, the method get( ) was used to select each
element, but in the following modified version, an Iterator is used: .
//: c11:CatsAndDogs2.java
// Simple container with Iterator.
package c11;
import com.bruceeckel.simpletest.*;
import java.util.*;
public class CatsAndDogs2 {
private static Test monitor = new Test();
public static void main(String[] args) {
List cats = new ArrayList();
for(int i = 0; i < 7; i++)
cats.add(new Cat(i));
Iterator e = cats.iterator();
while(e.hasNext())
((Cat)e.next()).id();
}
} ///:~
You can see that the last few lines now use an Iterator to step through the
sequence instead of a for loop. With the Iterator, you dont need to
worry about the number of elements in the container. Thats taken care of for you by hasNext( ) and next( ). .
As another example, consider the creation of a general-purpose printing method:
//: c11:Printer.java
// Using an Iterator.
import java.util.*;
public class Printer {
static void printAll(Iterator e) {
while(e.hasNext())
System.out.println(e.next());
}
} ///:~
Look closely at printAll( ). Note that theres no information about
the type of sequence. All you have is an Iterator, and thats all you need to
know about the sequence: that you can get the next object, and that you can know when
youre at the end. This idea of taking a container of objects and passing through it
to perform an operation on each one is powerful and will be seen throughout this book. .
The example is
even more generic, since it implicitly uses the Object.toString( ) method. The
println( ) method is overloaded for all the primitive types as well as Object;
in each case, a String is automatically produced by calling the appropriate toString( )
method. .
Although its unnecessary, you can be more explicit using a cast, which has the
effect of calling toString( ):
System.out.println((String)e.next());
In general, however, youll want to do something more than call Object
methods, so youll run up against the type-casting issue again. You must assume
youve gotten an Iterator to a sequence of the particular type youre
interested in, and cast the resulting objects to that type (getting a run-time exception
if youre wrong). .
We can test it by printing Hamsters:
//: c11:Hamster.java
public class Hamster {
private int hamsterNumber;
public Hamster(int hamsterNumber) {
this.hamsterNumber = hamsterNumber;
}
public String toString() {
return "This is Hamster #" + hamsterNumber;
}
} ///:~
//: c11:HamsterMaze.java
// Using an Iterator.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class HamsterMaze {
private static Test monitor = new Test();
public static void main(String[] args) {
List list = new ArrayList();
for(int i = 0; i < 3; i++)
list.add(new Hamster(i));
Printer.printAll(list.iterator());
monitor.expect(new String[] {
"This is Hamster #0",
"This is Hamster #1",
"This is Hamster #2"
});
}
} ///:~
You could write printAll( ) to accept a Collection object instead of
an Iterator, but the latter provides better decoupling. .
Because (like every other class) the Java standard containers are inherited from Object, they contain a toString( ) method.
This has been overridden so that they can produce a String representation of
themselves, including the objects they hold. Inside ArrayList, for example, the toString( )
steps through the elements of the ArrayList and calls toString( ) for
each one. Suppose youd like to print the address of your class. It seems to make
sense to simply refer to this (in particular, C++ programmers are prone to this
approach):
//: c11:InfiniteRecursion.java
// Accidental recursion.
// {RunByHand}
import java.util.*;
public class InfiniteRecursion {
public String toString() {
return " InfiniteRecursion address: " + this + "\n";
}
public static void main(String[] args) {
List v = new ArrayList();
for(int i = 0; i < 10; i++)
v.add(new InfiniteRecursion());
System.out.println(v);
}
} ///:~
If you simply create an InfiniteRecursion object and then print it, youll
get an endless sequence of exceptions. This is also true if you place the InfiniteRecursion
objects in an ArrayList and print that ArrayList as shown here. Whats
happening is automatic type conversion for Strings. When you say:
"InfiniteRecursion address: " + this
The compiler sees a String followed by a + and something
thats not a String, so it tries to convert this to a String. It
does this conversion by calling toString( ), which produces a recursive call. .
If you really do want to print the address of the object in
this case, the solution is to call the Object toString( ) method, which
does just that. So instead of saying this, youd say super.toString( ).
.
Collections and Maps may be implemented in different ways according to
your programming needs. Its helpful to look at a diagram of the Java containers (as
of JDK 1.4):

This diagram can be a bit overwhelming at first, but youll see that there are
really only three container componentsMap, List, and Setand
only two or three implementations of each one. The containers that you will generally use
most of the time have heavy black lines around them. When you see this, the containers are
not so daunting. .
The dotted boxes represent interfaces, the dashed boxes represent abstract
classes, and the solid boxes are regular (concrete) classes. The dotted-line arrows
indicate that a particular class is implementing an interface (or in the case of an
abstract class, partially implementing that interface). The solid arrows
show that a class can produce objects of the class the arrow is pointing to. For example,
any Collection can produce an Iterator and a List can produce a ListIterator
(as well as an ordinary Iterator, since List is inherited from Collection).
.
The interfaces that are concerned with holding objects are Collection, List,
Set, and Map. Ideally, youll write most of your code to talk to
these interfaces, and the only place where youll specify the precise type
youre using is at the point of creation. So you can create a List like this: .
List x = new LinkedList();
Of course, you can also decide to make x a LinkedList (instead of a
generic List) and carry the precise type information around with x.
The beauty (and the intent) of using the interface is that if you decide you want
to change your implementation, all you need to do is change it at the point of creation,
like this:
List x = new ArrayList();
The rest of your code can remain untouched (some of this genericity can also be
achieved with iterators). .
In the class hierarchy, you can see a number of classes whose names begin with Abstract,
and these can seem a bit confusing at first. They are simply tools that partially
implement a particular interface. If you were making your own Set, for example, you
wouldnt start with the Set interface and implement all the methods; instead,
youd inherit from AbstractSet and do the minimal
necessary work to make your new class. However, the containers library contains enough
functionality to satisfy your needs virtually all the time. So for our purposes, you can
ignore any class that begins with Abstract. .
Therefore, when you look at the diagram, youre really concerned with only those interfaces
at the top of the diagram and the concrete classes (those with solid boxes around them).
Youll typically make an object of a concrete class, upcast it to the corresponding interface,
and then use the interface throughout the rest of your code. In addition, you do
not need to consider the legacy elements when writing new code. Therefore, the diagram can
be greatly simplified to look like this:

Now it only includes the interfaces and classes that you will encounter on a regular
basis, and also the elements that we will focus on in this chapter. Note that the WeakHashMap
and the JDK 1.4 IdentityHashMap are not included on this diagram, because they
are special-purpose tools that you will rarely use. .
Heres a simple example that fills a Collection (represented here with an ArrayList)
with String objects and then prints each element in the Collection:
//: c11:SimpleCollection.java
// A simple example using Java 2 Collections.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class SimpleCollection {
private static Test monitor = new Test();
public static void main(String[] args) {
// Upcast because we just want to
// work with Collection features
Collection c = new ArrayList();
for(int i = 0; i < 10; i++)
c.add(Integer.toString(i));
Iterator it = c.iterator();
while(it.hasNext())
System.out.println(it.next());
monitor.expect(new String[] {
"0",
"1",
"2",
"3",
"4",
"5",
"6",
"7",
"8",
"9"
});
}
} ///:~
The first line in main( ) creates an ArrayList
object and then upcasts it to a Collection. Since this example uses only the Collection
methods, any object of a class inherited from Collection would work, but ArrayList
is the typical workhorse Collection. .
The add( ) method, as its name suggests, puts a new element in the Collection.
However, the documentation carefully states that add( ) ensures that
this Container contains the specified element. This is to allow for the meaning of Set,
which adds the element only if it isnt already there. With an ArrayList, or
any sort of List, add( ) always means put it in, because Lists
dont care if there are duplicates. .
All Collections can produce an Iterator via their
iterator( ) method. Here, an Iterator is created and used to traverse
the Collection, printing each element. .
The following table shows everything you can do with a Collection (not including
the methods that automatically come through with Object), and thus, everything you
can do with a Set or a List. (List also has additional
functionality.) Maps are not inherited from Collection and will be treated
separately.
| boolean
add(Object) |
Ensures that
the container holds the argument. Returns false if it doesnt add the argument. (This
is an optional method, described later in this chapter.) |
| boolean |
Adds all the
elements in the argument. Returns true if any elements were added.
(Optional.) |
| void
clear( ) |
Removes all
the elements in the container. (Optional.) |
| boolean |
true
if the container holds the argument. |
| boolean
containsAll(Collection) |
true
if the container holds all the elements in the argument. |
| boolean
isEmpty( ) |
true
if the container has no elements. |
| Iterator
iterator( ) |
Returns an Iterator
that you can use to move through the elements in the container. |
| boolean |
If the
argument is in the container, one instance of that element is removed. Returns true
if a removal occurred. (Optional.) |
| boolean
removeAll(Collection) |
Removes all
the elements that are contained in the argument. Returns true if any removals
occurred. (Optional.) |
| boolean
retainAll(Collection) |
Retains only
elements that are contained in the argument (an intersection from set theory).
Returns true if any changes occurred. (Optional.) |
| int
size( ) |
Returns the
number of elements in the container. |
| Object[]
toArray( ) |
Returns an
array containing all the elements in the container. |
| Object[] |
Returns an
array containing all the elements in the container, whose type is that of the array a
rather than plain Object (you must cast the array to the right type). |
Notice that theres no get( ) method for
random-access element selection. Thats because Collection also includes Set,
which maintains its own internal ordering (and thus makes random-access lookup
meaningless). Thus, if you want to examine the elements of a Collection, you must
use an iterator.
The following example demonstrates all of these methods. Again, these work with
anything that implements Collection, but an ArrayList is used as a kind of
least-common denominator: .
//: c11:Collection1.java
// Things you can do with all Collections.
import com.bruceeckel.simpletest.*;
import java.util.*;
import com.bruceeckel.util.*;
public class Collection1 {
private static Test monitor = new Test();
public static void main(String[] args) {
Collection c = new ArrayList();
Collections2.fill(c, Collections2.countries, 5);
c.add("ten");
c.add("eleven");
System.out.println(c);
// Make an array from the List:
Object[] array = c.toArray();
// Make a String array from the List:
String[] str = (String[])c.toArray(new String[1]);
// Find max and min elements; this means
// different things depending on the way
// the Comparable interface is implemented:
System.out.println("Collections.max(c) = " +
Collections.max(c));
System.out.println("Collections.min(c) = " +
Collections.min(c));
// Add a Collection to another Collection
Collection c2 = new ArrayList();
Collections2.fill(c2, Collections2.countries, 5);
c.addAll(c2);
System.out.println(c);
c.remove(CountryCapitals.pairs[0][0]);
System.out.println(c);
c.remove(CountryCapitals.pairs[1][0]);
System.out.println(c);
// Remove all components that are
// in the argument collection:
c.removeAll(c2);
System.out.println(c);
c.addAll(c2);
System.out.println(c);
// Is an element in this Collection?
String val = CountryCapitals.pairs[3][0];
System.out.println("c.contains(" + val + ") = "
+ c.contains(val));
// Is a Collection in this Collection?
System.out.println(
"c.containsAll(c2) = " + c.containsAll(c2));
Collection c3 = ((List)c).subList(3, 5);
// Keep all the elements that are in both
// c2 and c3 (an intersection of sets):
c2.retainAll(c3);
System.out.println(c);
// Throw away all the elements
// in c2 that also appear in c3:
c2.removeAll(c3);
System.out.println("c.isEmpty() = " + c.isEmpty());
c = new ArrayList();
Collections2.fill(c, Collections2.countries, 5);
System.out.println(c);
c.clear(); // Remove all elements
System.out.println("after c.clear():");
System.out.println(c);
monitor.expect(new String[] {
"[ALGERIA, ANGOLA, BENIN, BOTSWANA, BURKINA FASO, " +
"ten, eleven]",
"Collections.max(c) = ten",
"Collections.min(c) = ALGERIA",
"[ALGERIA, ANGOLA, BENIN, BOTSWANA, BURKINA FASO, " +
"ten, eleven, BURUNDI, CAMEROON, CAPE VERDE, " +
"CENTRAL AFRICAN REPUBLIC, CHAD]",
"[ANGOLA, BENIN, BOTSWANA, BURKINA FASO, ten, " +
"eleven, BURUNDI, CAMEROON, CAPE VERDE, " +
"CENTRAL AFRICAN REPUBLIC, CHAD]",
"[BENIN, BOTSWANA, BURKINA FASO, ten, eleven, " +
"BURUNDI, CAMEROON, CAPE VERDE, " +
"CENTRAL AFRICAN REPUBLIC, CHAD]",
"[BENIN, BOTSWANA, BURKINA FASO, ten, eleven]",
"[BENIN, BOTSWANA, BURKINA FASO, ten, eleven, " +
"BURUNDI, CAMEROON, CAPE VERDE, " +
"CENTRAL AFRICAN REPUBLIC, CHAD]",
"c.contains(BOTSWANA) = true",
"c.containsAll(c2) = true",
"[BENIN, BOTSWANA, BURKINA FASO, ten, eleven, " +
"BURUNDI, CAMEROON, CAPE VERDE, " +
"CENTRAL AFRICAN REPUBLIC, CHAD]",
"c.isEmpty() = false",
"[COMOROS, CONGO, DJIBOUTI, EGYPT, " +
"EQUATORIAL GUINEA]",
"after c.clear():",
"[]"
});
}
} ///:~
ArrayLists are created containing different sets of data and upcast to Collection
objects, so its clear that nothing other than the Collection interface is
being used. main( ) uses simple exercises to show all of the methods in Collection.
.
The following sections describe the various implementations of List, Set,
and Map and indicate in each case (with an asterisk) which one should be your
default choice. Youll notice that the legacy classes Vector, Stack,
and Hashtable are not included, because in all cases there are preferred
classes within the Java 2 Containers. .
The basic List is quite simple to use, as youve seen so
far with ArrayList. Although most of the time youll just use add( )
to insert objects, get( ) to get them out one at a time, and iterator( )
to get an Iterator for the sequence, theres also a set of other methods that
can be useful. .
In addition, there are actually two types of List: the basic ArrayList,
which excels at randomly accessing elements, and the much more powerful LinkedList,
which is not designed for fast random access, but has a much more general set of methods.
| List
(interface) |
Order is the
most important feature of a List; it promises to maintain elements in a particular
sequence. List adds a number of methods to Collection that allow insertion
and removal of elements in the middle of a List. (This is recommended only for a LinkedList.)
A List will produce a ListIterator, and by using this you can traverse the List
in both directions, as well as insert and remove elements in the middle of the List. |
| ArrayList* |
A List
implemented with an array. Allows rapid random access to elements, but is slow when
inserting and removing elements from the middle of a list. ListIterator should be
used only for back-and-forth traversal of an ArrayList, but not for inserting and
removing elements, which is expensive compared to LinkedList. |
| LinkedList |
Provides
optimal sequential access, with inexpensive insertions and deletions from the middle of
the List. Relatively slow for random access. (Use ArrayList instead.) Also
has addFirst( ), addLast( ), getFirst( ), getLast( ),
removeFirst( ), and removeLast( ) (which are not defined in any
interfaces or base classes) to allow it to be used as a stack, a queue, and a deque. |
The methods in the following example each cover a different group
of activities: things that every list can do (basicTest( )), moving around
with an Iterator (iterMotion( )) versus changing things with an Iterator
(iterManipulation( )), seeing the effects of List manipulation (testVisual( )),
and operations available only to LinkedLists. .
//: c11:List1.java
// Things you can do with Lists.
import java.util.*;
import com.bruceeckel.util.*;
public class List1 {
public static List fill(List a) {
Collections2.countries.reset();
Collections2.fill(a, Collections2.countries, 10);
return a;
}
private static boolean b;
private static Object o;
private static int i;
private static Iterator it;
private static ListIterator lit;
public static void basicTest(List a) {
a.add(1, "x"); // Add at location 1
a.add("x"); // Add at end
// Add a collection:
a.addAll(fill(new ArrayList()));
// Add a collection starting at location 3:
a.addAll(3, fill(new ArrayList()));
b = a.contains("1"); // Is it in there?
// Is the entire collection in there?
b = a.containsAll(fill(new ArrayList()));
// Lists allow random access, which is cheap
// for ArrayList, expensive for LinkedList:
o = a.get(1); // Get object at location 1
i = a.indexOf("1"); // Tell index of object
b = a.isEmpty(); // Any elements inside?
it = a.iterator(); // Ordinary Iterator
lit = a.listIterator(); // ListIterator
lit = a.listIterator(3); // Start at loc 3
i = a.lastIndexOf("1"); // Last match
a.remove(1); // Remove location 1
a.remove("3"); // Remove this object
a.set(1, "y"); // Set location 1 to "y"
// Keep everything that's in the argument
// (the intersection of the two sets):
a.retainAll(fill(new ArrayList()));
// Remove everything that's in the argument:
a.removeAll(fill(new ArrayList()));
i = a.size(); // How big is it?
a.clear(); // Remove all elements
}
public static void iterMotion(List a) {
ListIterator it = a.listIterator();
b = it.hasNext();
b = it.hasPrevious();
o = it.next();
i = it.nextIndex();
o = it.previous();
i = it.previousIndex();
}
public static void iterManipulation(List a) {
ListIterator it = a.listIterator();
it.add("47");
// Must move to an element after add():
it.next();
// Remove the element that was just produced:
it.remove();
// Must move to an element after remove():
it.next();
// Change the element that was just produced:
it.set("47");
}
public static void testVisual(List a) {
System.out.println(a);
List b = new ArrayList();
fill(b);
System.out.print("b = ");
System.out.println(b);
a.addAll(b);
a.addAll(fill(new ArrayList()));
System.out.println(a);
// Insert, remove, and replace elements
// using a ListIterator:
ListIterator x = a.listIterator(a.size()/2);
x.add("one");
System.out.println(a);
System.out.println(x.next());
x.remove();
System.out.println(x.next());
x.set("47");
System.out.println(a);
// Traverse the list backwards:
x = a.listIterator(a.size());
while(x.hasPrevious())
System.out.print(x.previous() + " ");
System.out.println();
System.out.println("testVisual finished");
}
// There are some things that only LinkedLists can do:
public static void testLinkedList() {
LinkedList ll = new LinkedList();
fill(ll);
System.out.println(ll);
// Treat it like a stack, pushing:
ll.addFirst("one");
ll.addFirst("two");
System.out.println(ll);
// Like "peeking" at the top of a stack:
System.out.println(ll.getFirst());
// Like popping a stack:
System.out.println(ll.removeFirst());
System.out.println(ll.removeFirst());
// Treat it like a queue, pulling elements
// off the tail end:
System.out.println(ll.removeLast());
// With the above operations, it's a dequeue!
System.out.println(ll);
}
public static void main(String[] args) {
// Make and fill a new list each time:
basicTest(fill(new LinkedList()));
basicTest(fill(new ArrayList()));
iterMotion(fill(new LinkedList()));
iterMotion(fill(new ArrayList()));
iterManipulation(fill(new LinkedList()));
iterManipulation(fill(new ArrayList()));
testVisual(fill(new LinkedList()));
testLinkedList();
}
} ///:~
In basicTest( ) and iterMotion( ) the calls are made in order
to show the proper syntax, and although the return value is captured, it is not used. In
some cases, the return value isnt captured at all. You should look up the full usage
of each of these methods in the JDK documentation from java.sun.com before you use
them. .
Remember that a container is only a storage cabinet to hold objects. If that cabinet
solves all of your needs, it doesnt really matter how it is implemented (a basic
concept with most types of objects). If youre working in a programming environment
that has built-in overhead due to other factors, then the cost difference between an ArrayList
and a LinkedList might not matter. You might need only one type of sequence. You
can even imagine the perfect container abstraction, which can automatically
change its underlying implementation according to the way it is used. .
A stack is sometimes referred to as a last-in, first-out (LIFO) container.
That is, whatever you push on the stack last is the first item you can
pop out. Like all of the other containers in Java, what you push and pop are Objects, so you must cast what you
pop, unless youre just using Object behavior. .
The LinkedList has methods that directly implement stack functionality, so you
can also just use a LinkedList rather than making a stack class. However, a stack
class can sometimes tell the story better: .
//: c11:StackL.java
// Making a stack from a LinkedList.
import com.bruceeckel.simpletest.*;
import java.util.*;
import com.bruceeckel.util.*;
public class StackL {
private static Test monitor = new Test();
private LinkedList list = new LinkedList();
public void push(Object v) { list.addFirst(v); }
public Object top() { return list.getFirst(); }
public Object pop() { return list.removeFirst(); }
public static void main(String[] args) {
StackL stack = new StackL();
for(int i = 0; i < 10; i++)
stack.push(Collections2.countries.next());
System.out.println(stack.top());
System.out.println(stack.top());
System.out.println(stack.pop());
System.out.println(stack.pop());
System.out.println(stack.pop());
monitor.expect(new String[] {
"CHAD",
"CHAD",
"CHAD",
"CENTRAL AFRICAN REPUBLIC",
"CAPE VERDE"
});
}
} ///:~
If you want only stack behavior, inheritance is inappropriate here because it would
produce a class with all the rest of the LinkedList methods (youll see later
that this very mistake was made by the Java 1.0 library designers with Stack). .
A queue is a first-in, first-out
(FIFO) container. That is, you put things in at one end and pull them out at the other. So
the order in which you put them in will be the same order that they come out. LinkedList has methods to support
queue behavior, so these can be used in a Queue class: .
//: c11:Queue.java
// Making a queue from a LinkedList.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class Queue {
private static Test monitor = new Test();
private LinkedList list = new LinkedList();
public void put(Object v) { list.addFirst(v); }
public Object get() { return list.removeLast(); }
public boolean isEmpty() { return list.isEmpty(); }
public static void main(String[] args) {
Queue queue = new Queue();
for(int i = 0; i < 10; i++)
queue.put(Integer.toString(i));
while(!queue.isEmpty())
System.out.println(queue.get());
monitor.expect(new String[] {
"0",
"1",
"2",
"3",
"4",
"5",
"6",
"7",
"8",
"9"
});
}
} ///:~
You can also easily create a deque (double-ended queue) from a LinkedList.
This is like a queue, but you can add and remove elements from either end. .
Set has exactly the same interface as Collection,
so there isnt any extra functionality like there is with the two different Lists.
Instead, the Set is exactly a Collectionit just has different
behavior. (This is the ideal use of inheritance and polymorphism: to express different
behavior.) A Set refuses to hold more than one instance of each object value (what
constitutes the value of an object is more complex, as you shall see).
| Each element
that you add to the Set must be unique; otherwise, the Set doesnt add
the duplicate element. Objects added to a Set must define equals( )
to establish object uniqueness. Set has exactly the same interface as Collection.
The Set interface does not guarantee that it will maintain its elements in any
particular order. |
|
| HashSet* |
For Sets
where fast lookup time is important. Objects must also define hashCode( ). |
| TreeSet |
An ordered Set
backed by a tree. This way, you can extract an ordered sequence from a Set. |
| LinkedHashSet |
Has the
lookup speed of a HashSet, but maintains the order in which you add the elements
(the insertion order), internally using a linked list. Thus, when you iterate through the Set,
the results appear in insertion order. |
The following example does not show everything you can do
with a Set, since the interface is the same as Collection, and so was
exercised in the previous example. Instead, this demonstrates the behavior that makes a Set
unique: .
//: c11:Set1.java
// Things you can do with Sets.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class Set1 {
private static Test monitor = new Test();
static void fill(Set s) {
s.addAll(Arrays.asList(
"one two three four five six seven".split(" ")));
}
public static void test(Set s) {
// Strip qualifiers from class name:
System.out.println(
s.getClass().getName().replaceAll("\\w+\\.", ""));
fill(s); fill(s); fill(s);
System.out.println(s); // No duplicates!
// Add another set to this one:
s.addAll(s);
s.add("one");
s.add("one");
s.add("one");
System.out.println(s);
// Look something up:
System.out.println("s.contains(\"one\"): " +
s.contains("one"));
}
public static void main(String[] args) {
test(new HashSet());
test(new TreeSet());
test(new LinkedHashSet());
monitor.expect(new String[] {
"HashSet",
"[one, two, five, four, three, seven, six]",
"[one, two, five, four, three, seven, six]",
"s.contains(\"one\"): true",
"TreeSet",
"[five, four, one, seven, six, three, two]",
"[five, four, one, seven, six, three, two]",
"s.contains(\"one\"): true",
"LinkedHashSet",
"[one, two, three, four, five, six, seven]",
"[one, two, three, four, five, six, seven]",
"s.contains(\"one\"): true"
});
}
} ///:~
Duplicate values are added to the Set, but when it is printed, youll see
that the Set has accepted only one instance of each value. .
When you run this program, youll notice that the order maintained by the HashSet
is different from TreeSet and LinkedHashSet, since each has a different way
of storing elements so they can be located later. (TreeSet keeps elements sorted
into a red-black tree data structure, whereas HashSet uses a hashing function,
which is designed specifically for rapid lookups. LinkedHashSet uses hashing
internally for lookup speed, but appears to maintain elements in insertion order
using a linked list.) When creating your own types, be aware that a Set needs a way
to maintain a storage order, which means that you must implement the Comparable interface
and define the compareTo( ) method. Heres an example: .
//: c11:Set2.java
// Putting your own type in a Set.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class Set2 {
private static Test monitor = new Test();
public static Set fill(Set a, int size) {
for(int i = 0; i < size; i++)
a.add(new MyType(i));
return a;
}
public static void test(Set a) {
fill(a, 10);
fill(a, 10); // Try to add duplicates
fill(a, 10);
a.addAll(fill(new TreeSet(), 10));
System.out.println(a);
}
public static void main(String[] args) {
test(new HashSet());
test(new TreeSet());
test(new LinkedHashSet());
monitor.expect(new String[] {
"[2 , 4 , 9 , 8 , 6 , 1 , 3 , 7 , 5 , 0 ]",
"[9 , 8 , 7 , 6 , 5 , 4 , 3 , 2 , 1 , 0 ]",
"[0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 ]"
});
}
} ///:~
The form for the definitions for equals( ) and hashCode( ) will be described
later in this chapter. You must define an equals( ) in both cases, but the hashCode( )
is absolutely necessary only if the class will be placed in a HashSet (which is
likely, since that should generally be your first choice as a Set implementation).
However, as a programming style, you should always override hashCode( ) when
you override equals( ). This process will be fully detailed later in this
chapter. .
In the compareTo( ), note that I did not use the simple and
obvious form return i-i2. Although this is a common programming error, it
would only work properly if i and i2 were unsigned ints
(if Java had an unsigned keyword, which it does not). It breaks for
Javas signed int, which is not big enough to represent the difference of two
signed ints. If i is a large positive integer and j is a large
negative integer, i-j will overflow and return a negative value, which will not
work. .
If you have a SortedSet (of which TreeSet is the only one available), the
elements are guaranteed to be in sorted order, which allows additional functionality to be
provided with these methods in the SortedSet interface: .
Comparator comparator( ): Produces the Comparator
used for this Set, or null for natural ordering.
Object first( ): Produces the lowest element.
Object last( ): Produces the highest element.
SortedSet subSet(fromElement, toElement): Produces a view of
this Set with elements from fromElement, inclusive, to toElement,
exclusive.
SortedSet headSet(toElement): Produces a view of this Set
with elements less than toElement.
SortedSet tailSet(fromElement): Produces a view of this Set
with elements greater than or equal to fromElement.
Heres a simple demonstration:
//: c11:SortedSetDemo.java
// What you can do with a TreeSet.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class SortedSetDemo {
private static Test monitor = new Test();
public static void main(String[] args) {
SortedSet sortedSet = new TreeSet(Arrays.asList(
"one two three four five six seven eight".split(" ")));
System.out.println(sortedSet);
Object
low = sortedSet.first(),
high = sortedSet.last();
System.out.println(low);
System.out.println(high);
Iterator it = sortedSet.iterator();
for(int i = 0; i <= 6; i++) {
if(i == 3) low = it.next();
if(i == 6) high = it.next();
else it.next();
}
System.out.println(low);
System.out.println(high);
System.out.println(sortedSet.subSet(low, high));
System.out.println(sortedSet.headSet(high));
System.out.println(sortedSet.tailSet(low));
monitor.expect(new String[] {
"[eight, five, four, one, seven, six, three, two]",
"eight",
"two",
"one",
"two",
"[one, seven, six, three]",
"[eight, five, four, one, seven, six, three]",
"[one, seven, six, three, two]"
});
}
} ///:~
Note that SortedSet means sorted according to the comparison function of
the object, not insertion order.
An ArrayList
allows you to select from a sequence of objects using a number, so in a sense it
associates numbers to objects. But what if youd like to select from a sequence of
objects using some other criterion? A stack is an example. Its selection criterion is
the last thing pushed on the stack. A powerful twist on this idea of
selecting from a sequence is termed a map, a dictionary,
or an associative array (you saw a
simple example of this in AssociativeArray.java in the previous chapter).
Conceptually, it seems like an ArrayList, but instead of looking up objects using a
number, you look them up using another object! This is a key technique in programming. .
The concept shows up in Java as the Map interface. The put(Object key, Object
value) method adds a value (the thing you want) and associates it with a key (the
thing you look it up with). get(Object key) produces the value given the
corresponding key. You can also test a Map to see if it contains a key or a value
with containsKey( ) and containsValue( ). .
The standard Java library contains different types of Maps: HashMap, TreeMap,
LinkedHashMap, WeakHashMap, and IdentityHashMap.
The all have the same basic Map interface, but they differ in behaviors including
efficiency, order in which the pairs are held and presented, how long the objects are held
by the map, and how key equality is determined. .
A big issue with maps is performance. If you look at what must be done for a get( ),
it seems pretty slow to search through (for example) an ArrayList for the key. This
is where HashMap speeds things up. Instead of a slow search for the key, it uses a
special value called a hash code. The hash code is a way to take some
information in the object in question and turn it into a relatively unique int
for that object. All Java objects can produce a hash code, and hashCode( )
is a method in the root class Object. A HashMap takes
the hashCode( ) of the object and uses it to quickly hunt for the key. This
results in a dramatic performance improvement.[58]
.
| Map
(interface) |
Maintains
key-value associations (pairs) so you can look up a value using a key. |
| HashMap* |
Implementation
based on a hash table. (Use this instead of Hashtable.) Provides constant-time
performance for inserting and locating pairs. Performance can be adjusted via constructors
that allow you to set the capacity and load factor of the hash table. |
| LinkedHashMap |
Like a HashMap,
but when you iterate through it, you get the pairs in insertion order, or in
least-recently-used (LRU) order. Only slightly slower than a HashMap, except when
iterating, where it is faster due to the linked list used to maintain the internal
ordering. |
| TreeMap |
Implementation
based on a red-black tree. When you view the keys or the pairs, they will be in sorted
order (determined by Comparable or Comparator, discussed later). The point
of a TreeMap is that you get the results in sorted order. TreeMap is the
only Map with the subMap( ) method, which allows you to return a
portion of the tree. |
| WeakHashMap |
A map of weak
keys that allow objects referred to by the map to be released; designed to solve
certain types of problems. If no references outside the map are held to a particular key,
it may be garbage collected. |
| IdentityHashMap |
A hash map
that uses == instead of equals( ) to compare keys. Only for solving
special types of problems; not for general use. |
Hashing is the most commonly used way to store elements in a map.
Sometimes youll need to know the details of how hashing works, so well look at
that a little later.
The following example uses the Collections2.fill( ) method and the test
data sets that were previously defined: .
//: c11:Map1.java
// Things you can do with Maps.
import java.util.*;
import com.bruceeckel.util.*;
public class Map1 {
private static Collections2.StringPairGenerator geo =
Collections2.geography;
private static Collections2.RandStringPairGenerator
rsp = Collections2.rsp;
// Producing a Set of the keys:
public static void printKeys(Map map) {
System.out.print("Size = " + map.size() + ", ");
System.out.print("Keys: ");
System.out.println(map.keySet());
}
public static void test(Map map) {
// Strip qualifiers from class name:
System.out.println(
map.getClass().getName().replaceAll("\\w+\\.", ""));
Collections2.fill(map, geo, 25);
// Map has 'Set' behavior for keys:
Collections2.fill(map, geo.reset(), 25);
printKeys(map);
// Producing a Collection of the values:
System.out.print("Values: ");
System.out.println(map.values());
System.out.println(map);
String key = CountryCapitals.pairs[4][0];
String value = CountryCapitals.pairs[4][1];
System.out.println("map.containsKey(\"" + key +
"\"): " + map.containsKey(key));
System.out.println("map.get(\"" + key + "\"): "
+ map.get(key));
System.out.println("map.containsValue(\""
+ value + "\"): " + map.containsValue(value));
Map map2 = new TreeMap();
Collections2.fill(map2, rsp, 25);
map.putAll(map2);
printKeys(map);
key = map.keySet().iterator().next().toString();
System.out.println("First key in map: " + key);
map.remove(key);
printKeys(map);
map.clear();
System.out.println("map.isEmpty(): " + map.isEmpty());
Collections2.fill(map, geo.reset(), 25);
// Operations on the Set change the Map:
map.keySet().removeAll(map.keySet());
System.out.println("map.isEmpty(): " + map.isEmpty());
}
public static void main(String[] args) {
test(new HashMap());
test(new TreeMap());
test(new LinkedHashMap());
test(new IdentityHashMap());
test(new WeakHashMap());
}
} ///:~
The printKeys( ) and printValues( ) methods are not only useful
utilities, they also demonstrate how to produce Collection views of a Map.
The keySet( ) method produces a Set backed by the keys in the Map.
Similar treatment is given to values( ), which produces a Collection
containing all the values in the Map. (Note that keys must be unique, but values
may contain duplicates.) Since these Collections are backed by the Map, any
changes in a Collection will be reflected in the associated Map. .
The rest of the program provides simple examples of each Map operation and tests
each type of Map. .
As an example of the use of a HashMap, consider a program to check the
randomness of Javas Random
class. Ideally, it would produce a perfect distribution of random numbers, but to test
this you need to generate a bunch of random numbers and count the ones that fall in the
various ranges. A HashMap is perfect for this, since it associates objects with
objects (in this case, the value object contains the number produced by Math.random( )
along with the number of times that number appears):
//: c11:Statistics.java
// Simple demonstration of HashMap.
import java.util.*;
class Counter {
int i = 1;
public String toString() { return Integer.toString(i); }
}
public class Statistics {
private static Random rand = new Random();
public static void main(String[] args) {
Map hm = new HashMap();
for(int i = 0; i < 10000; i++) {
// Produce a number between 0 and 20:
Integer r = new Integer(rand.nextInt(20));
if(hm.containsKey(r))
((Counter)hm.get(r)).i++;
else
hm.put(r, new Counter());
}
System.out.println(hm);
}
} ///:~
In main( ), each time a random number is generated it is wrapped inside an Integer
object so that reference can be used with the HashMap. (You cant use a
primitive with a containeronly an object reference.) The containsKey( )
method checks to see if this key is already in the container (that is, has the number been
found already?). If so, the get( ) method produces the
associated value for the key, which in this case is a Counter object. The value i
inside the counter is incremented to indicate that one more of this particular random
number has been found. .
If the key has not been found yet, the method put( )
will place a new key-value pair into the HashMap. Since Counter
automatically initializes its variable i to one when its created, it
indicates the first occurrence of this particular random number. .
To display the HashMap, it is simply printed. The HashMap toString( )
method moves through all the key-value pairs and calls the toString( ) for
each one. The Integer.toString( ) is predefined, and you can see the toString( )
for Counter. The output from one run (with some line breaks added) is:
{15=529, 4=488, 19=518, 8=487, 11=501, 16=487, 18=507, 3=524, 7=474, 12=485, 17=493, 2=490, 13=540, 9=453, 6=512, 1=466, 14=522, 10=471, 5=522, 0=531}
You might wonder at the necessity of the class Counter, which seems like it
doesnt even have the functionality of the wrapper class Integer. Why not use int
or Integer? Well, you cant use an int because all of the containers
can hold only Object references. After youve seen containers, the wrapper
classes might begin to make a little more sense to you, since you cant put any of
the primitive types in containers. However, the only thing you can do with the Java
wrappers is initialize them to a particular value and read that value. That is,
theres no way to change a value once a wrapper object has been created. This makes
the Integer wrapper immediately useless to solve the
problem, so were forced to create a new class that does satisfy the need. .
If you have a SortedMap (of which TreeMap is the only one available), the
keys are guaranteed to be in sorted order, which allows additional functionality to be
provided with these methods in the SortedMap interface: .
Comparator comparator( ): Produces the comparator used
for this Map, or null for natural ordering.
Object firstKey( ): Produces the lowest key.
Object lastKey( ): Produces the highest key.
SortedMap subMap(fromKey, toKey): Produces a view of this Map
with keys from fromKey, inclusive, to toKey, exclusive.
SortedMap headMap(toKey): Produces a view of this Map
with keys less than toKey.
SortedMap tailMap(fromKey): Produces a view of this Map
with keys greater than or equal to fromKey. .
Heres an example thats similar to SortedSetDemo.java and shows this
additional behavior of TreeMaps:
//: c11:SimplePairGenerator.java
import com.bruceeckel.util.*;
//import java.util.*;
public class SimplePairGenerator implements MapGenerator {
public Pair[] items = {
new Pair("one", "A"), new Pair("two", "B"),
new Pair("three", "C"), new Pair("four", "D"),
new Pair("five", "E"), new Pair("six", "F"),
new Pair("seven", "G"), new Pair("eight", "H"),
new Pair("nine", "I"), new Pair("ten", "J")
};
private int index = -1;
public Pair next() {
index = (index + 1) % items.length;
return items[index];
}
public static SimplePairGenerator gen =
new SimplePairGenerator();
} ///:~
//: c11:SortedMapDemo.java
// What you can do with a TreeMap.
import com.bruceeckel.simpletest.*;
import com.bruceeckel.util.*;
import java.util.*;
public class SortedMapDemo {
private static Test monitor = new Test();
public static void main(String[] args) {
TreeMap sortedMap = new TreeMap();
Collections2.fill(
sortedMap, SimplePairGenerator.gen, 10);
System.out.println(sortedMap);
Object
low = sortedMap.firstKey(),
high = sortedMap.lastKey();
System.out.println(low);
System.out.println(high);
Iterator it = sortedMap.keySet().iterator();
for(int i = 0; i <= 6; i++) {
if(i == 3) low = it.next();
if(i == 6) high = it.next();
else it.next();
}
System.out.println(low);
System.out.println(high);
System.out.println(sortedMap.subMap(low, high));
System.out.println(sortedMap.headMap(high));
System.out.println(sortedMap.tailMap(low));
monitor.expect(new String[] {
"{eight=H, five=E, four=D, nine=I, one=A, seven=G," +
" six=F, ten=J, three=C, two=B}",
"eight",
"two",
"nine",
"ten",
"{nine=I, one=A, seven=G, six=F}",
"{eight=H, five=E, four=D, nine=I, " +
"one=A, seven=G, six=F}",
"{nine=I, one=A, seven=G, six=F, " +
"ten=J, three=C, two=B}"
});
}
} ///:~
Here, the pairs are stored by key-sorted order. Because there is a sense of order in
the TreeMap, the concept of location makes sense, so you can have first
and last elements and submaps. .
The LinkedHashMap hashes
everything for speed, but also produces the pairs in insertion order during a traversal (println( )
iterates through the map, so you see the results of traversal). In addition, a LinkedHashMap
can be configured in the constructor to use a least-recently-used (LRU) algorithm
based on accesses, so elements that havent been accessed (and thus are candidates
for removal) appear at the front of the list. This allows easy creation of programs that
do periodic cleanup in order to save space. Heres a simple example showing both
features: .
//: c11:LinkedHashMapDemo.java
// What you can do with a LinkedHashMap.
import com.bruceeckel.simpletest.*;
import com.bruceeckel.util.*;
import java.util.*;
public class LinkedHashMapDemo {
private static Test monitor = new Test();
public static void main(String[] args) {
LinkedHashMap linkedMap = new LinkedHashMap();
Collections2.fill(
linkedMap, SimplePairGenerator.gen, 10);
System.out.println(linkedMap);
// Least-recently used order:
linkedMap = new LinkedHashMap(16, 0.75f, true);
Collections2.fill(
linkedMap, SimplePairGenerator.gen, 10);
System.out.println(linkedMap);
for(int i = 0; i < 7; i++) // Cause accesses:
linkedMap.get(SimplePairGenerator.gen.items[i].key);
System.out.println(linkedMap);
linkedMap.get(SimplePairGenerator.gen.items[0].key);
System.out.println(linkedMap);
monitor.expect(new String[] {
"{one=A, two=B, three=C, four=D, five=E, " +
"six=F, seven=G, eight=H, nine=I, ten=J}",
"{one=A, two=B, three=C, four=D, five=E, " +
"six=F, seven=G, eight=H, nine=I, ten=J}",
"{eight=H, nine=I, ten=J, one=A, two=B, " +
"three=C, four=D, five=E, six=F, seven=G}",
"{eight=H, nine=I, ten=J, two=B, three=C, " +
"four=D, five=E, six=F, seven=G, one=A}"
});
}
} ///:~
You can see from the output that the pairs are indeed traversed in insertion order,
even for the LRU version. However, after the first seven items (only) are accessed, the
last three items move to the front of the list. Then, when one is
accessed again, it moves to the back of the list. .
In Statistics.java, a standard
library class (Integer) was used as a key for the HashMap. It worked because
it has all the necessary wiring to make it behave correctly as a key. But a common pitfall
occurs with HashMaps when you create your own classes to be used as keys. For
example, consider a weather predicting system that matches Groundhog objects to Prediction
objects. It seems fairly straightforwardyou create the two classes, and use Groundhog
as the key and Prediction as the value: .
//: c11:Groundhog.java
// Looks plausible, but doesn't work as a HashMap key.
public class Groundhog {
protected int number;
public Groundhog(int n) { number = n; }
public String toString() {
return "Groundhog #" + number;
}
} ///:~
//: c11:Prediction.java
// Predicting the weather with groundhogs.
public class Prediction {
private boolean shadow = Math.random() > 0.5;
public String toString() {
if(shadow)
return "Six more weeks of Winter!";
else
return "Early Spring!";
}
} ///:~
//: c11:SpringDetector.java
// What will the weather be?
import com.bruceeckel.simpletest.*;
import java.util.*;
import java.lang.reflect.*;
public class SpringDetector {
private static Test monitor = new Test();
// Uses a Groundhog or class derived from Groundhog:
public static void
detectSpring(Class groundHogClass) throws Exception {
Constructor ghog = groundHogClass.getConstructor(
new Class[] {int.class});
Map map = new HashMap();
for(int i = 0; i < 10; i++)
map.put(ghog.newInstance(
new Object[]{ new Integer(i) }), new Prediction());
System.out.println("map = " + map + "\n");
Groundhog gh = (Groundhog)
ghog.newInstance(new Object[]{ new Integer(3) });
System.out.println("Looking up prediction for " + gh);
if(map.containsKey(gh))
System.out.println((Prediction)map.get(gh));
else
System.out.println("Key not found: " + gh);
}
public static void main(String[] args) throws Exception {
detectSpring(Groundhog.class);
monitor.expect(new String[] {
"%% map = \\{(Groundhog #\\d=" +
"(Early Spring!|Six more weeks of Winter!)" +
"(, )?){10}\\}",
"",
"Looking up prediction for Groundhog #3",
"Key not found: Groundhog #3"
});
}
} ///:~
Each Groundhog is given an identity number, so you can look up a Prediction
in the HashMap by saying, Give me the Prediction associated with Groundhog
#3. The Prediction class contains a boolean that is initialized using Math.random( )
and a toString( ) that interprets the result for you. The detectSpring( )
method is created using reflection to instantiate and use the Class Groundhog
or any derived class. This will come in handy when we inherit a new type of Groundhog
to solve the problem demonstrated here. A HashMap is filled with Groundhogs
and their associated Predictions. The HashMap is printed so that you can see
it has been filled. Then a Groundhog with an identity number of 3 is used as a key
to look up the prediction for Groundhog #3 (which you can see must be in the Map).
.
It seems simple enough, but it doesnt work. The problem is that Groundhog
is inherited from the common root class Object (which is what happens if you
dont specify a base class, thus all classes are ultimately inherited from Object).
It is Objects hashCode( ) method that is used to generate the
hash code for each object, and by default it just uses the address of its object. Thus,
the first instance of Groundhog(3) does not produce a hash code equal to the
hash code for the second instance of Groundhog(3) that we tried to use as a lookup.
.
You might think that all you need to do is write an appropriate override for hashCode( ). But it still wont work until
youve done one more thing: override the equals( )
that is also part of Object. equals( ) is used by the HashMap
when trying to determine if your key is equal to any of the keys in the table. .
A proper equals( ) must satisfy the following five
conditions: .
Again, the default Object.equals( ) simply compares object addresses, so
one Groundhog(3) is not equal to another Groundhog(3). Thus, to use your own
classes as keys in a HashMap, you must override both hashCode( ) and equals( ),
as shown in the following solution to the groundhog problem:
//: c11:Groundhog2.java
// A class that's used as a key in a HashMap
// must override hashCode() and equals().
public class Groundhog2 extends Groundhog {
public Groundhog2(int n) { super(n); }
public int hashCode() { return number; }
public boolean equals(Object o) {
return (o instanceof Groundhog2)
&& (number == ((Groundhog2)o).number);
}
} ///:~
//: c11:SpringDetector2.java
// A working key.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class SpringDetector2 {
private static Test monitor = new Test();
public static void main(String[] args) throws Exception {
SpringDetector.detectSpring(Groundhog2.class);
monitor.expect(new String[] {
"%% map = \\{(Groundhog #\\d=" +
"(Early Spring!|Six more weeks of Winter!)" +
"(, )?){10}\\}",
"",
"Looking up prediction for Groundhog #3",
"%% Early Spring!|Six more weeks of Winter!"
});
}
} ///:~
Groundhog2.hashCode( ) returns the groundhog number as a hash value. In
this example, the programmer is responsible for ensuring that no two groundhogs exist with
the same ID number. The hashCode( ) is not required to return a unique
identifier (something youll understand better later in this chapter), but the equals( )
method must be able to strictly determine whether two objects are equivalent. Here, equals( )
is based on the groundhog number, so if two Groundhog2 objects exist as keys in the
HashMap with the same groundhog number, it will fail. .
Even though it appears that the equals( ) method is only checking to see
whether the argument is an instance of Groundhog2 (using the instanceof
keyword, which was explained in Chapter 10), the instanceof actually quietly does a
second sanity check to see if the object is null, since instanceof produces false
if the left-hand argument is null. Assuming its the correct type and not null,
the comparison is based on the actual ghNumbers. You can see from the output that
the behavior is now correct. .
When creating your own class to use in a HashSet, you must pay attention to the
same issues as when it is used as a key in a HashMap. .
The preceding example is only a start toward solving the problem correctly. It shows
that if you do not override hashCode( ) and equals( ) for your key, the hashed data structure (HashSet,
HashMap, LinkedHashSet, or LinkedHashMap) will not be able to deal
with your key properly. However, to get a good solution for the problem you need to
understand whats going on inside the hashed data structure. .
First, consider the motivation behind hashing: you want to look up an object using
another object. But you can accomplish this with a TreeSet
or TreeMap, too. Its also possible to implement your own Map. To do
so, the Map.entrySet( ) method must be supplied to produce a set of Map.Entry
objects. MPair will be defined as the new type of Map.Entry.
In order for it to be placed in a TreeSet, it must implement equals( )
and be Comparable:
//: c11:MPair.java
// A new type of Map.Entry.
import java.util.*;
public class MPair implements Map.Entry, Comparable {
private Object key, value;
public MPair(Object k, Object v) {
key = k;
value = v;
}
public Object getKey() { return key; }
public Object getValue() { return value; }
public Object setValue(Object v) {
Object result = value;
value = v;
return result;
}
public boolean equals(Object o) {
return key.equals(((MPair)o).key);
}
public int compareTo(Object rv) {
return ((Comparable)key).compareTo(((MPair)rv).key);
}
} ///:~
Notice that the comparisons are only interested in the keys, so duplicate values are
perfectly acceptable. .
The following example implements a Map using a pair of ArrayLists: .
//: c11:SlowMap.java
// A Map implemented with ArrayLists.
import com.bruceeckel.simpletest.*;
import java.util.*;
import com.bruceeckel.util.*;
public class SlowMap extends AbstractMap {
private static Test monitor = new Test();
private List
keys = new ArrayList(),
values = new ArrayList();
public Object put(Object key, Object value) {
Object result = get(key);
if(!keys.contains(key)) {
keys.add(key);
values.add(value);
} else
values.set(keys.indexOf(key), value);
return result;
}
public Object get(Object key) {
if(!keys.contains(key))
return null;
return values.get(keys.indexOf(key));
}
public Set entrySet() {
Set entries = new HashSet();
Iterator
ki = keys.iterator(),
vi = values.iterator();
while(ki.hasNext())
entries.add(new MPair(ki.next(), vi.next()));
return entries;
}
public String toString() {
StringBuffer s = new StringBuffer("{");
Iterator
ki = keys.iterator(),
vi = values.iterator();
while(ki.hasNext()) {
s.append(ki.next() + "=" + vi.next());
if(ki.hasNext()) s.append(", ");
}
s.append("}");
return s.toString();
}
public static void main(String[] args) {
SlowMap m = new SlowMap();
Collections2.fill(m, Collections2.geography, 15);
System.out.println(m);
monitor.expect(new String[] {
"{ALGERIA=Algiers, ANGOLA=Luanda, BENIN=Porto-Novo,"+
" BOTSWANA=Gaberone, BURKINA FASO=Ouagadougou, " +
"BURUNDI=Bujumbura, CAMEROON=Yaounde, " +
"CAPE VERDE=Praia, CENTRAL AFRICAN REPUBLIC=Bangui,"+
" CHAD=N'djamena, COMOROS=Moroni, " +
"CONGO=Brazzaville, DJIBOUTI=Dijibouti, " +
"EGYPT=Cairo, EQUATORIAL GUINEA=Malabo}"
});
}
} ///:~
The put( ) method simply places the keys and values in corresponding ArrayLists. In main( ), a SlowMap is loaded and then printed to show that it works. .
This shows that its not that hard to produce a new type of Map. But as the
name suggests, a SlowMap isnt very fast, so you probably wouldnt use it
if you had an alternative available. The problem is in the lookup of the key; there is no
order, so a simple linear search is used, which is the slowest way to look something up. .
The whole point of hashing is speed: Hashing allows the lookup to happen quickly. Since
the bottleneck is in the speed of the key lookup, one of the solutions to the problem
could be to keep the keys sorted and then use Collections.binarySearch( ) to
perform the lookup (an exercise at the end of this chapter will walk you through this
process). .
Hashing goes further by saying that all you want to do is to store the key somewhere
so that it can be quickly found. As youve seen in this chapter, the fastest
structure in which to store a group of elements is an array, so that will be used for
representing the key information (note carefully that I said key information,
and not the key itself). Also seen in this chapter was the fact that an array, once
allocated, cannot be resized, so we have a problem: We want to be able to store any number
of values in the Map, but if the number of keys is fixed by the array size, how can
this be? .
The answer is that the array will not hold the keys. From the key object, a number will
be derived that will index into the array. This number is the hash
code, produced by the hashCode( ) method (in computer science parlance,
this is the hash function) defined in Object and
presumably overridden by your class. To solve the problem of the fixed-size array, more
than one key may produce the same index. That is, there may be collisions.
Because of this, it doesnt matter how big the array is; each key object will land
somewhere in that array. .
So the process of looking up a value starts by computing the hash code and using it to
index into the array. If you could guarantee that there were no collisions (which could be
possible if you have a fixed number of values) then youd have a perfect hashing function, but thats a special case. In
all other cases, collisions are handled by external chaining:
The array points not directly to a value, but instead to a list of values. These values
are searched in a linear fashion using the equals( ) method. Of course, this
aspect of the search is much slower, but if the hash function is good, there will only be
a few values in each slot. So instead of searching through the entire list, you quickly
jump to a slot where you have to compare a few entries to find the value. This is much
faster, which is why the HashMap is so quick. .
Knowing the basics of hashing, its possible to implement a simple hashed Map:
//: c11:SimpleHashMap.java
// A demonstration hashed Map.
import java.util.*;
import com.bruceeckel.util.*;
public class SimpleHashMap extends AbstractMap {
// Choose a prime number for the hash table
// size, to achieve a uniform distribution:
private static final int SZ = 997;
private LinkedList[] bucket = new LinkedList[SZ];
public Object put(Object key, Object value) {
Object result = null;
int index = key.hashCode() % SZ;
if(index < 0) index = -index;
if(bucket[index] == null)
bucket[index] = new LinkedList();
LinkedList pairs = bucket[index];
MPair pair = new MPair(key, value);
ListIterator it = pairs.listIterator();
boolean found = false;
while(it.hasNext()) {
Object iPair = it.next();
if(iPair.equals(pair)) {
result = ((MPair)iPair).getValue();
it.set(pair); // Replace old with new
found = true;
break;
}
}
if(!found)
bucket[index].add(pair);
return result;
}
public Object get(Object key) {
int index = key.hashCode() % SZ;
if(index < 0) index = -index;
if(bucket[index] == null) return null;
LinkedList pairs = bucket[index];
MPair match = new MPair(key, null);
ListIterator it = pairs.listIterator();
while(it.hasNext()) {
Object iPair = it.next();
if(iPair.equals(match))
return ((MPair)iPair).getValue();
}
return null;
}
public Set entrySet() {
Set entries = new HashSet();
for(int i = 0; i < bucket.length; i++) {
if(bucket[i] == null) continue;
Iterator it = bucket[i].iterator();
while(it.hasNext())
entries.add(it.next());
}
return entries;
}
public static void main(String[] args) {
SimpleHashMap m = new SimpleHashMap();
Collections2.fill(m, Collections2.geography, 25);
System.out.println(m);
}
} ///:~
Because the slots in a hash table are often referred to as buckets,
the array that represents the actual table is called bucket. To promote even
distribution, the number of buckets is typically a prime number.[59] Notice that it is an array of LinkedList, which
automatically provides for collisions; each new item is simply added to the end of the
list. .
The return value of put( ) is null or, if the key was already in the
list, the old value associated with that key. The return value is result, which is
initialized to null, but if a key is discovered in the list, then result is
assigned to that key. .
For both put( ) and get( ), the first thing that happens is
that the hashCode( ) is called for the key, and the result is forced to a
positive number. Then it is forced to fit into the bucket array using the modulus
operator and the size of the array. If that location is null, it means there are no
elements that hash to that location, so a new LinkedList is created to hold the
object that just did. However, the normal process is to look through the list to see if
there are duplicates, and if there are, the old value is put into result and the
new value replaces the old. The found flag keeps track of whether an old key-value
pair was found and, if not, the new pair is appended to the end of the list. .
In get( ), youll see very similar code as that contained in put( ),
but simpler. The index is calculated into the bucket array, and if a LinkedList
exists, it is searched for a match. .
entrySet( ) must find and traverse all the lists, adding them to the result
Set. Once this method has been created, the Map can be tested by filling it
with values and then printing them. .
To understand the issues, some terminology is necessary:
Capacity: The number of
buckets in the table.
Initial capacity: The number
of buckets when the table is created. HashMap and HashSet have constructors
that allow you to specify the initial capacity.
Size: The number of entries
currently in the table.
Load factor: size/capacity. A
load factor of 0 is an empty table, 0.5 is a half-full table, etc. A lightly loaded table
will have few collisions and so is optimal for insertions and lookups (but will slow down
the process of traversing with an iterator). HashMap and HashSet have
constructors that allow you to specify the load factor, which means that when this load
factor is reached, the container will automatically increase the capacity (the number of
buckets) by roughly doubling it and will redistribute the existing objects into the new
set of buckets (this is called rehashing). .
The default load factor used by HashMap is 0.75 (it doesnt rehash until
the table is ¾ full). This seems to be a good trade-off between time and space costs. A
higher load factor decreases the space required by the table but increases the lookup
cost, which is important because lookup is what you do most of the time (including both get( )
and put( )). .
If you know that youll be storing many entries in a HashMap, creating it
with an appropriately large initial capacity will prevent the overhead of automatic
rehashing.[60] .
Now that you understand whats involved in the function of the HashMap, the
issues involved in writing a hashCode( ) will make
more sense. .
First of all, you dont have control of the creation of the actual value
thats used to index into the array of buckets. That is dependent on the capacity of
the particular HashMap object, and that capacity changes depending on how full the
container is, and what the load factor is. The value produced by your hashCode( )
will be further processed in order to create the bucket index (in SimpleHashMap,
the calculation is just a modulo by the size of the bucket array). .
The most important factor in creating a hashCode( ) is that, regardless of
when hashCode( ) is called, it produces the same value for a particular object
every time it is called. If you end up with an object that produces one hashCode( )
value when it is put( ) into a HashMap and another during a get( ),
you wont be able to retrieve the objects. So if your hashCode( ) depends
on mutable data in the object, the user must be made aware that changing the data will
effectively produce a different key by generating a different hashCode( ). .
In addition, you will probably not want to generate a hashCode( ) that
is based on unique object informationin particular, the value of this makes a
bad hashCode( ) because then you cant generate a new identical key to
the one used to put( ) the original key-value pair. This was the problem that
occurred in SpringDetector.java, because the default implementation of hashCode( )
does use the object address. So youll want to use information in the
object that identifies the object in a meaningful way. .
One example can be seen in the String class. Strings have the special
characteristic that if a program has several String objects that contain identical
character sequences, then those String objects all map to the same memory (the
mechanism for this is described in Appendix A). So it makes sense that the hashCode( )
produced by two separate instances of new String(hello) should be
identical. You can see this in the following program:
//: c11:StringHashCode.java
import com.bruceeckel.simpletest.*;
public class StringHashCode {
private static Test monitor = new Test();
public static void main(String[] args) {
System.out.println("Hello".hashCode());
System.out.println("Hello".hashCode());
monitor.expect(new String[] {
"69609650",
"69609650"
});
}
} ///:~
The hashCode( ) for String is clearly based on the contents of the String.
.
So for a hashCode( ) to be effective, it must be fast and it must be
meaningful; that is, it must generate a value based on the contents of the object.
Remember that this value doesnt have to be uniqueyou should lean toward speed
rather than uniquenessbut between hashCode( ) and equals( ),
the identity of the object must be completely resolved. .
Because the hashCode( ) is further processed before the bucket index is
produced, the range of values is not important; it just needs to generate an int. .
Theres one other factor: A good hashCode( ) should result in an even
distribution of values. If the values tend to cluster, then the HashMap or HashSet
will be more heavily loaded in some areas and will not be as fast as it could be with
an evenly distributed hashing function. .
In Effective Java (Addison-Wesley 2001), Joshua Bloch gives a basic recipe for
generating a decent hashCode( ):
| Field
type |
Calculation |
| boolean |
c = (f ? 0 : 1) |
| byte, char, short, or
int |
c = (int)f |
| long |
c = (int)(f ^ (f
>>>32)) |
| float |
c =
Float.floatToIntBits(f); |
| double |
long l =
Double.doubleToLongBits(f); c = (int)(l ^ (l >>> 32)) |
| Object, where
equals( ) calls equals( ) for this field |
c = f.hashCode( ) |
| Array |
Apply above rules to each
element |
Heres an example that follows these guidelines:
//: c11:CountedString.java
// Creating a good hashCode().
import com.bruceeckel.simpletest.*;
import java.util.*;
public class CountedString {
private static Test monitor = new Test();
private static List created = new ArrayList();
private String s;
private int id = 0;
public CountedString(String str) {
s = str;
created.add(s);
Iterator it = created.iterator();
// Id is the total number of instances
// of this string in use by CountedString:
while(it.hasNext())
if(it.next().equals(s))
id++;
}
public String toString() {
return "String: " + s + " id: " + id +
" hashCode(): " + hashCode();
}
public int hashCode() {
// Very simple approach:
// return s.hashCode() * id;
// Using Joshua Bloch's recipe:
int result = 17;
result = 37*result + s.hashCode();
result = 37*result + id;
return result;
}
public boolean equals(Object o) {
return (o instanceof CountedString)
&& s.equals(((CountedString)o).s)
&& id == ((CountedString)o).id;
}
public static void main(String[] args) {
Map map = new HashMap();
CountedString[] cs = new CountedString[10];
for(int i = 0; i < cs.length; i++) {
cs[i] = new CountedString("hi");
map.put(cs[i], new Integer(i));
}
System.out.println(map);
for(int i = 0; i < cs.length; i++) {
System.out.println("Looking up " + cs[i]);
System.out.println(map.get(cs[i]));
}
monitor.expect(new String[] {
"{String: hi id: 4 hashCode(): 146450=3," +
" String: hi id: 10 hashCode(): 146456=9," +
" String: hi id: 6 hashCode(): 146452=5," +
" String: hi id: 1 hashCode(): 146447=0," +
" String: hi id: 9 hashCode(): 146455=8," +
" String: hi id: 8 hashCode(): 146454=7," +
" String: hi id: 3 hashCode(): 146449=2," +
" String: hi id: 5 hashCode(): 146451=4," +
" String: hi id: 7 hashCode(): 146453=6," +
" String: hi id: 2 hashCode(): 146448=1}",
"Looking up String: hi id: 1 hashCode(): 146447",
"0",
"Looking up String: hi id: 2 hashCode(): 146448",
"1",
"Looking up String: hi id: 3 hashCode(): 146449",
"2",
"Looking up String: hi id: 4 hashCode(): 146450",
"3",
"Looking up String: hi id: 5 hashCode(): 146451",
"4",
"Looking up String: hi id: 6 hashCode(): 146452",
"5",
"Looking up String: hi id: 7 hashCode(): 146453",
"6",
"Looking up String: hi id: 8 hashCode(): 146454",
"7",
"Looking up String: hi id: 9 hashCode(): 146455",
"8",
"Looking up String: hi id: 10 hashCode(): 146456",
"9"
});
}
} ///:~
CountedString includes a String and an id that represents the
number of CountedString objects that contain an identical String. The
counting is accomplished in the constructor by iterating through the static ArrayList
where all the Strings are stored. .
Both hashCode( ) and equals( ) produce results based on both
fields; if they were just based on the String alone or the id alone, there
would be duplicate matches for distinct values. .
In main( ), a bunch of CountedString objects are created, using the
same String to show that the duplicates create unique values because of the count id.
The HashMap is displayed so that you can see how it is stored internally (no
discernible orders), and then each key is looked up individually to demonstrate that the
lookup mechanism is working properly. .
Writing a proper hashCode( ) and equals( ) for a new class can
be tricky. You can find tools to help you do this in Apaches Jakarta
Commons project at jakarta.apache.org/commons, under lang (this
project also has many other potentially useful libraries, and appears to be the Java
communitys answer to the C++ communitys www.boost.org).
The java.lang.ref library contains a set of classes that allow greater
flexibility in garbage collection. These classes are especially useful when you have large
objects that may cause memory exhaustion. There are three classes inherited from the
abstract class Reference: SoftReference,
WeakReference, and PhantomReference.
Each of these provides a different level of indirection for the garbage collector if the
object in question is only reachable through one of these Reference objects.
.
If an object is reachable, it
means that somewhere in your program the object can be found. This could mean that you
have an ordinary reference on the stack that goes right to the object, but you might also
have a reference to an object that has a reference to the object in question; there could
be many intermediate links. If an object is reachable, the garbage collector cannot
release it because its still in use by your program. If an object isnt
reachable, theres no way for your program to use it, so its safe to garbage
collect that object. .
You use Reference objects when you want to continue to hold onto a reference to
that object; you want to be able to reach that object, but you also want to allow the
garbage collector to release that object. Thus, you have a way to go on using the object,
but if memory exhaustion is imminent, you allow that object to be released. .
You accomplish this by using a Reference object as an
intermediary between you and the ordinary reference, and there must be no ordinary
references to the object (ones that are not wrapped inside Reference objects). If
the garbage collector discovers that an object is reachable through an ordinary reference,
it will not release that object. .
In the order of SoftReference, WeakReference, and PhantomReference,
each one is weaker than the last and corresponds to a different level of
reachability. Soft references are for implementing memory-sensitive caches. Weak
references are for implementing canonicalizing mappingswhere instances
of objects can be simultaneously used in multiple places in a program, to save
storagethat do not prevent their keys (or values) from being reclaimed. Phantom
references are for scheduling premortem cleanup actions in a more flexible way than is
possible with the Java finalization mechanism. .
With SoftReferences and WeakReferences, you have a choice about whether
to place them on a ReferenceQueue (the device used for premortem cleanup actions),
but a PhantomReference can only be built on a ReferenceQueue. Heres a
simple demonstration: .
//: c11:References.java
// Demonstrates Reference objects
import java.lang.ref.*;
class VeryBig {
private static final int SZ = 10000;
private double[] d = new double[SZ];
private String ident;
public VeryBig(String id) { ident = id; }
public String toString() { return ident; }
public void finalize() {
System.out.println("Finalizing " + ident);
}
}
public class References {
private static ReferenceQueue rq = new ReferenceQueue();
public static void checkQueue() {
Object inq = rq.poll();
if(inq != null)
System.out.println("In queue: " +
(VeryBig)((Reference)inq).get());
}
public static void main(String[] args) {
int size = 10;
// Or, choose size via the command line:
if(args.length > 0)
size = Integer.parseInt(args[0]);
SoftReference[] sa = new SoftReference[size];
for(int i = 0; i < sa.length; i++) {
sa[i] = new SoftReference(
new VeryBig("Soft " + i), rq);
System.out.println("Just created: " +
(VeryBig)sa[i].get());
checkQueue();
}
WeakReference[] wa = new WeakReference[size];
for(int i = 0; i < wa.length; i++) {
wa[i] = new WeakReference(
new VeryBig("Weak " + i), rq);
System.out.println("Just created: " +
(VeryBig)wa[i].get());
checkQueue();
}
SoftReference s =
new SoftReference(new VeryBig("Soft"));
WeakReference w =
new WeakReference(new VeryBig("Weak"));
System.gc();
PhantomReference[] pa = new PhantomReference[size];
for(int i = 0; i < pa.length; i++) {
pa[i] = new PhantomReference(
new VeryBig("Phantom " + i), rq);
System.out.println("Just created: " +
(VeryBig)pa[i].get());
checkQueue();
}
}
} ///:~
When you run this program (youll want to pipe the output through a
more utility so that you can view the output in pages), youll see that
the objects are garbage collected, even though you still have access to them through the Reference
object (to get the actual object reference, you use get( )). Youll also
see that the ReferenceQueue always produces a Reference containing a null
object. To make use of this, you can inherit from the particular Reference class
youre interested in and add more useful methods to the new type of Reference.
.
The containers library has a special Map to hold weak references: the WeakHashMap. This class is designed to make the creation of
canonicalized mappings easier. In such a mapping, you are saving storage by making only
one instance of a particular value. When the program needs that value, it looks up the
existing object in the mapping and uses that (rather than creating one from scratch). The
mapping may make the values as part of its initialization, but its more likely that
the values are made on demand. .
Since this is a storage-saving technique, its very convenient that the WeakHashMap
allows the garbage collector to automatically clean up the keys and values. You dont
have to do anything special to the keys and values you want to place in the WeakHashMap;
these are automatically wrapped in WeakReferences by the map. The trigger to allow
cleanup is if the key is no longer in use, as demonstrated here: .
//: c11:CanonicalMapping.java
// Demonstrates WeakHashMap.
import java.util.*;
import java.lang.ref.*;
class Key {
private String ident;
public Key(String id) { ident = id; }
public String toString() { return ident; }
public int hashCode() { return ident.hashCode(); }
public boolean equals(Object r) {
return (r instanceof Key)
&& ident.equals(((Key)r).ident);
}
public void finalize() {
System.out.println("Finalizing Key "+ ident);
}
}
class Value {
private String ident;
public Value(String id) { ident = id; }
public String toString() { return ident; }
public void finalize() {
System.out.println("Finalizing Value " + ident);
}
}
public class CanonicalMapping {
public static void main(String[] args) {
int size = 1000;
// Or, choose size via the command line:
if(args.length > 0)
size = Integer.parseInt(args[0]);
Key[] keys = new Key[size];
WeakHashMap map = new WeakHashMap();
for(int i = 0; i < size; i++) {
Key k = new Key(Integer.toString(i));
Value v = new Value(Integer.toString(i));
if(i % 3 == 0)
keys[i] = k; // Save as "real" references
map.put(k, v);
}
System.gc();
}
} ///:~
The Key class must have a hashCode( ) and an equals( )
since it is being used as a key in a hashed data structure, as described previously in
this chapter. .
When you run the program, youll see that the garbage collector will skip every
third key, because an ordinary reference to that key has also been placed in the keys
array, and thus those objects cannot be garbage collected. .
We can now demonstrate the true power of the Iterator:
the ability to separate the operation of traversing a sequence from the underlying
structure of that sequence. The class PrintData (defined earlier in the chapter)
uses an Iterator to move through a sequence and call the toString( )
method for every object. In the following example, two different types of containers are
createdan ArrayList and a HashMapand
they are each filled with, respectively, Mouse and Hamster objects. (These
classes are defined earlier in this chapter.) Because an Iterator hides the
structure of the underlying container, Printer.printAll( ) doesnt know
or care what kind of container the Iterator comes from: .
//: c11:Iterators2.java
// Revisiting Iterators.
import com.bruceeckel.simpletest.*;
import java.util.*;
public class Iterators2 {
private static Test monitor = new Test();
public static void main(String[] args) {
List list = new ArrayList();
for(int i = 0; i < 5; i++)
list.add(new Mouse(i));
Map m = new HashMap();
for(int i = 0; i < 5; i++)
m.put(new Integer(i), new Hamster(i));
System.out.println("List");
Printer.printAll(list.iterator());
System.out.println("Map");
Printer.printAll(m.entrySet().iterator());
monitor.expect(new String[] {
"List",
"This is Mouse #0",
"This is Mouse #1",
"This is Mouse #2",
"This is Mouse #3",
"This is Mouse #4",
"Map",
"4=This is Hamster #4",
"3=This is Hamster #3",
"2=This is Hamster #2",
"1=This is Hamster #1",
"0=This is Hamster #0"
}, Test.IGNORE_ORDER);
}
} ///:~
For the HashMap, the entrySet( ) method produces a Set of Map.entry
objects, which contain both the key and the value for each entry, so you see both of them
printed. .
Note that PrintData.print( ) takes advantage of the fact that the objects
in these containers are of class Object so the call toString( ) by System.out.println( )
is automatic. Its more likely that in your problem, you must make the assumption
that your Iterator is walking through a container of some specific type. For
example, you might assume that everything in the container is a Shape with a draw( )
method. Then you must downcast from the Object that Iterator.next( )
returns to produce a Shape. .
By now you should understand that there are really only three container components: Map,
List, and Set, but more than one implementation of each interface. If you
need to use the functionality offered by a particular interface, how do you decide which
implementation to use? .
To understand the answer, you must be aware that each different implementation has its
own features, strengths, and weaknesses. For example, you can see in the diagram that the
feature of Hashtable, Vector, and Stack is that they are
legacy classes, so that old code doesnt break. On the other hand, its best if
you dont use those for new code. .
The distinction between the other containers often comes down to what they are
backed by; that is, the data structures that physically implement your desired
interface. This means that, for example, ArrayList and LinkedList implement
the List interface, so the basic operations are the same regardless of which one
you use. However, ArrayList is backed by an array, and LinkedList is
implemented in the usual way for a doubly linked list, as individual objects each
containing data along with references to the previous and next elements in the list.
Because of this, if you want to do many insertions and removals in the middle of a list, a
LinkedList is the appropriate choice. (LinkedList also has additional
functionality that is established in AbstractSequentialList.)
If not, an ArrayList is typically faster. .
As another example, a Set can be implemented as either a TreeSet, a HashSet,
or a LinkedHashSet. Each of these have different behaviors: HashSet is for
typical use and provides raw speed on lookup, LinkedHashSet keeps pairs in
insertion order, and TreeSet is backed by TreeMap and is designed to produce
a constantly sorted set. The idea is that you can choose the implementation based on the
behavior you need. Most of the time, the HashSet is all thats necessary and
should be your default choice of Set. .
The most convincing way to see the differences between the implementations of List
is with a performance test. The following code establishes an inner base class to use as a
test framework, then creates an array of anonymous inner classes, one for each different
test. Each of these inner classes is called by the test( ) method. This approach
allows you to easily add and remove new kinds of tests. .
//: c11:ListPerformance.java
// Demonstrates performance differences in Lists.
// {Args: 500}
import java.util.*;
import com.bruceeckel.util.*;
public class ListPerformance {
private static int reps = 10000;
private static int quantity = reps / 10;
private abstract static class Tester {
private String name;
Tester(String name) { this.name = name; }
abstract void test(List a);
}
private static Tester[] tests = {
new Tester("get") {
void test(List a) {
for(int i = 0; i < reps; i++) {
for(int j = 0; j < quantity; j++)
a.get(j);
}
}
},
new Tester("iteration") {
void test(List a) {
for(int i = 0; i < reps; i++) {
Iterator it = a.iterator();
while(it.hasNext())
it.next();
}
}
},
new Tester("insert") {
void test(List a) {
int half = a.size()/2;
String s = "test";
ListIterator it = a.listIterator(half);
for(int i = 0; i < reps * 10; i++)
it.add(s);
}
},
new Tester("remove") {
void test(List a) {
ListIterator it = a.listIterator(3);
while(it.hasNext()) {
it.next();
it.remove();
}
}
},
};
public static void test(List a) {
// Strip qualifiers from class name:
System.out.println("Testing " +
a.getClass().getName().replaceAll("\\w+\\.", ""));
for(int i = 0; i < tests.length; i++) {
Collections2.fill(a, Collections2.countries.reset(),
quantity);
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(a);
long t2 = System.currentTimeMillis();
System.out.println(": " + (t2 - t1));
}
}
public static void testArrayAsList(int reps) {
System.out.println("Testing array as List");
// Can only do first two tests on an array:
for(int i = 0; i < 2; i++) {
String[] sa = new String[quantity];
Arrays2.fill(sa, Collections2.countries.reset());
List a = Arrays.asList(sa);
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(a);
long t2 = System.currentTimeMillis();
System.out.println(": " + (t2 - t1));
}
}
public static void main(String[] args) {
// Choose a different number of
// repetitions via the command line:
if(args.length > 0)
reps = Integer.parseInt(args[0]);
System.out.println(reps + " repetitions");
testArrayAsList(reps);
test(new ArrayList());
test(new LinkedList());
test(new Vector());
}
} ///:~
To provide a base class for the specific tests, the inner class Tester is abstract.
It contains a String to be printed when the test starts and an abstract
method test( ) that does the work. All the different types of tests are
collected in one place, the array tests, which is initialized with different
anonymous inner classes that inherit from Tester. To add or remove tests, simply
add or remove an inner class definition from the array, and everything else happens
automatically. .
To compare array access to container access (primarily against ArrayList), a
special test is created for arrays by wrapping one as a List using Arrays.asList( ).
Note that only the first two tests can be performed in this case, because you cannot
insert or remove elements from an array. .
The List thats handed to test( ) is first filled with
elements, then each test in the tests array is timed. The results will vary from
machine to machine; they are intended to give only an order of magnitude comparison
between the performance of the different containers. Here is a summary of one run: .
| Type |
Get |
Iteration |
Insert |
Remove |
| array |
172 |
516 |
na |
na |
| ArrayList |
281 |
1375 |
328 |
30484 |
| LinkedList |
5828 |
1047 |
109 |
16 |
| Vector |
422 |
1890 |
360 |
30781 |
As expected, arrays are faster than any container for
random-access lookups and iteration. You can see that random accesses (get( ))
are cheap for ArrayLists and expensive for LinkedLists. (Oddly, iteration is
faster for a LinkedList than an ArrayList,
which is a bit counterintuitive.) On the other hand, insertions and removals from the
middle of a list are dramatically cheaper for a LinkedList than for an ArrayListespecially
removals. Vector is generally not as fast as ArrayList,
and it should be avoided; its only in the library for legacy code support (the only
reason it works in this program is because it was adapted to be a List in Java 2).
The best approach is probably to choose an ArrayList as your default and to
change to a LinkedList if you discover performance problems due to many insertions
and removals from the middle of the list. And of course, if you are working with a
fixed-sized group of elements, use an array. .
You can choose a TreeSet, a HashSet,
or a LinkedHashSet depending on the behavior you desire. The following test
program gives an indication of the performance trade-off between the implementations: .
//: c11:SetPerformance.java
// {Args: 500}
import java.util.*;
import com.bruceeckel.util.*;
public class SetPerformance {
private static int reps = 50000;
private abstract static class Tester {
String name;
Tester(String name) { this.name = name; }
abstract void test(Set s, int size);
}
private static Tester[] tests = {
new Tester("add") {
void test(Set s, int size) {
for(int i = 0; i < reps; i++) {
s.clear();
Collections2.fill(s,
Collections2.countries.reset(),size);
}
}
},
new Tester("contains") {
void test(Set s, int size) {
for(int i = 0; i < reps; i++)
for(int j = 0; j < size; j++)
s.contains(Integer.toString(j));
}
},
new Tester("iteration") {
void test(Set s, int size) {
for(int i = 0; i < reps * 10; i++) {
Iterator it = s.iterator();
while(it.hasNext())
it.next();
}
}
},
};
public static void test(Set s, int size) {
// Strip qualifiers from class name:
System.out.println("Testing " +
s.getClass().getName().replaceAll("\\w+\\.", "") +
" size " + size);
Collections2.fill(s,
Collections2.countries.reset(), size);
for(int i = 0; i < tests.length; i++) {
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(s, size);
long t2 = System.currentTimeMillis();
System.out.println(": " +
((double)(t2 - t1)/(double)size));
}
}
public static void main(String[] args) {
// Choose a different number of
// repetitions via the command line:
if(args.length > 0)
reps = Integer.parseInt(args[0]);
System.out.println(reps + " repetitions");
// Small:
test(new TreeSet(), 10);
test(new HashSet(), 10);
test(new LinkedHashSet(), 10);
// Medium:
test(new TreeSet(), 100);
test(new HashSet(), 100);
test(new LinkedHashSet(), 100);
// Large:
test(new TreeSet(), 1000);
test(new HashSet(), 1000);
test(new LinkedHashSet(), 1000);
}
} ///:~
The following table shows the results of one run. (Of course, this will be different
according to the computer and JVM you are using; you should run the test yourself as
well):
| Type |
Test size |
Add |
Contains |
Iteration |
| 10 |
25.0 |
23.4 |
39.1 |
|
| TreeSet |
100 |
17.2 |
27.5 |
45.9 |
| 1000 |
26.0 |
30.2 |
9.0 |
|
| 10 |
18.7 |
17.2 |
64.1 |
|
| HashSet |
100 |
17.2 |
19.1 |
65.2 |
| 1000 |
8.8 |
16.6 |
12.8 |
|
| 10 |
20.3 |
18.7 |
64.1 |
|
| LinkedHashSet
|
100 |
18.6 |
19.5 |
49.2 |
| 1000 |
10.0 |
16.3 |
10.0 |
The performance of HashSet is generally superior to TreeSet
for all operations (but in particular for addition and lookup, the two most important
operations). The only reason TreeSet exists is because it maintains its elements in
sorted order, so you use it only when you need a sorted Set. .
Note that LinkedHashSet is
slightly more expensive for insertions than HashSet; this is because of the extra
cost of maintaining the linked list along with the hashed container. However, traversal is
cheaper with LinkedHashSet because of the linked list. .
When choosing between implementations of Map, the size
of the Map is what most strongly affects performance, and the following test
program gives an indication of this trade-off: .
//: c11:MapPerformance.java
// Demonstrates performance differences in Maps.
// {Args: 500}
import java.util.*;
import com.bruceeckel.util.*;
public class MapPerformance {
private static int reps = 50000;
private abstract static class Tester {
String name;
Tester(String name) { this.name = name; }
abstract void test(Map m, int size);
}
private static Tester[] tests = {
new Tester("put") {
void test(Map m, int size) {
for(int i = 0; i < reps; i++) {
m.clear();
Collections2.fill(m,
Collections2.geography.reset(), size);
}
}
},
new Tester("get") {
void test(Map m, int size) {
for(int i = 0; i < reps; i++)
for(int j = 0; j < size; j++)
m.get(Integer.toString(j));
}
},
new Tester("iteration") {
void test(Map m, int size) {
for(int i = 0; i < reps * 10; i++) {
Iterator it = m.entrySet().iterator();
while(it.hasNext())
it.next();
}
}
},
};
public static void test(Map m, int size) {
// Strip qualifiers from class name:
System.out.println("Testing " +
m.getClass().getName().replaceAll("\\w+\\.", "") +
" size " + size);
Collections2.fill(m,
Collections2.geography.reset(), size);
for(int i = 0; i < tests.length; i++) {
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(m, size);
long t2 = System.currentTimeMillis();
System.out.println(": " +
((double)(t2 - t1)/(double)size));
}
}
public static void main(String[] args) {
// Choose a different number of
// repetitions via the command line:
if(args.length > 0)
reps = Integer.parseInt(args[0]);
System.out.println(reps + " repetitions");
// Small:
test(new TreeMap(), 10);
test(new HashMap(), 10);
test(new LinkedHashMap(), 10);
test(new IdentityHashMap(), 10);
test(new WeakHashMap(), 10);
test(new Hashtable(), 10);
// Medium:
test(new TreeMap(), 100);
test(new HashMap(), 100);
test(new LinkedHashMap(), 100);
test(new IdentityHashMap(), 100);
test(new WeakHashMap(), 100);
test(new Hashtable(), 100);
// Large:
test(new TreeMap(), 1000);
test(new HashMap(), 1000);
test(new LinkedHashMap(), 1000);
test(new IdentityHashMap(), 1000);
test(new WeakHashMap(), 1000);
test(new Hashtable(), 1000);
}
} ///:~
Because the size of the map is the issue, youll see that the timing tests divide
the time by the size to normalize each measurement. Here is one set of results. (Yours
will probably be different.)
| Type |
Test size |
Put |
Get |
Iteration |
|---|---|---|---|---|
| 10 |
26.6 |
20.3 |
43.7 |
|
| TreeMap |
100 |
34.1 |
27.2 |
45.8 |
| 1000 |
27.8 |
29.3 |
8.8 |
|
|
|
10 |
21.9 |
18.8 |
60.9 |
| HashMap |
100 |
21.9 |
18.6 |
63.3 |
| 1000 |
11.5 |
18.8 |
12.3 |
|
|
|
10 |
23.4 |
18.8 |
59.4 |
| LinkedHashMap |
100 |
24.2 |
19.5 |
47.8 |
| 1000 |
12.3 |
19.0 |
9.2 |
|
|
|
10 |
20.3 |
25.0 |
71.9 |
| IdentityHashMap |
100 |
19.7 |
25.9 |
56.7 |
| 1000 |
13.1 |
24.3 |
10.9 |
|
|
|
10 |
26.6 |
18.8 |
76.5 |
| WeakHashMap |
100 |
26.1 |
21.6 |
64.4 |
| 1000 |
14.7 |
19.2 |
12.4 |
|
|
|
10 |
18.8 |
18.7 |
65.7 |
| Hashtable |
100 |
19.4 |
20.9 |
55.3 |
| 1000 |
13.1 |
19.9 |
10.8 |
As you might expect, Hashtable
performance is roughly equivalent to HashMap. (You can also see that HashMap
is generally a bit faster; HashMap is intended to replace Hashtable.) The TreeMap is generally slower than the HashMap, so why
would you use it? As a way to create an ordered list. The behavior of a tree is such that
its always in order and doesnt have to be specially sorted. Once you fill a TreeMap,
you can call keySet( ) to get a Set view of the
keys, then toArray( ) to produce an array of those
keys. You can then use the static method Arrays.binarySearch( )
(discussed later) to rapidly find objects in your sorted array. Of course, you would
probably only do this if, for some reason, the behavior of a HashMap was
unacceptable, since HashMap is designed to rapidly find things. Also, you can
easily create a HashMap from a TreeMap with a single object creation. In the
end, when youre using a Map, your first choice should be HashMap,
and only if you need a constantly sorted Map will you need TreeMap. .
LinkedHashMap is slightly slower
than HashMap because it maintains the linked list in addition to the hashed data
structure. IdentityHashMap has different performance because it uses ==
rather than equals( ) for comparisons. .
Utilities to perform sorting and
searching for Lists have the same names and signatures as
those for sorting arrays of objects, but are static methods of Collections
instead of Arrays. Heres an example, modified from ArraySearching.java:
//: c11:ListSortSearch.java
// Sorting and searching Lists with 'Collections.'
import com.bruceeckel.util.*;
import java.util.*;
public class ListSortSearch {
public static void main(String[] args) {
List list = new ArrayList();
Collections2.fill(list, Collections2.capitals, 25);
System.out.println(list + "\n");
Collections.shuffle(list);
System.out.println("After shuffling: " + list);
Collections.sort(list);
System.out.println(list + "\n");
Object key = list.get(12);
int index = Collections.binarySearch(list, key);
System.out.println("Location of " + key +
" is " + index + ", list.get(" +
index + ") = " + list.get(index));
AlphabeticComparator comp = new AlphabeticComparator();
Collections.sort(list, comp);
System.out.println(list + "\n");
key = list.get(12);
index = Collections.binarySearch(list, key, comp);
System.out.println("Location of " + key +
" is " + index + ", list.get(" +
index + ") = " + list.get(index));
}
} ///:~
The use of these methods is identical to the ones in Arrays, but youre
using a List instead of an array. Just like searching and sorting with arrays, if
you sort using a Comparator, you must binarySearch( ) using the same Comparator.
.
This program also demonstrates the shuffle( )
method in Collections, which randomizes the order of a List. .
There are a number of other useful utilities in the Collections class:
|
min(Collection) |
Produces the
maximum or minimum element in the argument using the natural comparison method of the
objects in the Collection. |
| max(Collection,
Comparator) min(Collection, Comparator) |
Produces the
maximum or minimum element in the Collection using the Comparator. |
| indexOfSubList(List
source, List target) |
Produces
starting index of the first place where target appears inside source. |
| lastIndexOfSubList(List
source, List target) |
Produces
starting index of the last place where target appears inside source. |
| replaceAll(List
list, |
Replace all oldVal
with newVal. |
| reverse( ) |
Reverses all
the elements in place. |
| rotate(List
list, int distance) |
Moves all
elements forward by distance, taking the ones off the end and placing them at the
beginning. |
| copy(List
dest, List src) |
Copies
elements from src to dest. |
| swap(List
list, int i, int j) |
Swaps
elements at locations i and j in list. Probably faster than what
youd write by hand. |
| fill(List
list, Object o) |
Replaces all
the elements of list with o. |
| nCopies(int
n, Object o) |
Returns an
immutable List of size n whose references all point to o. |
| enumeration(Collection)
|
Produces an
old-style Enumeration for the argument. |
| list(Enumeration
e) |
Returns an ArrayList
generated using the Enumeration. For converting from legacy code. |
Note that min( ) and max( ) work with Collection
objects, not with Lists, so you dont need to worry about whether the Collection
should be sorted or not. (As mentioned earlier, you do need to sort( )
a List or an array before performing a binarySearch( ).) .
//: c11:Utilities.java
// Simple demonstrations of the Collections utilities.
import com.bruceeckel.simpletest.*;
import java.util.*;
import com.bruceeckel.util.*;
public class Utilities {
private static Test monitor = new Test();
public static void main(String[] args) {
List list = Arrays.asList(
"one Two three Four five six one".split(" "));
System.out.println(list);
System.out.println("max: " + Collections.max(list));
System.out.println("min: " + Collections.min(list));
AlphabeticComparator comp = new AlphabeticComparator();
System.out.println("max w/ comparator: " +
Collections.max(list, comp));
System.out.println("min w/ comparator: " +
Collections.min(list, comp));
List sublist =
Arrays.asList("Four five six".split(" "));
System.out.println("indexOfSubList: " +
Collections.indexOfSubList(list, sublist));
System.out.println("lastIndexOfSubList: " +
Collections.lastIndexOfSubList(list, sublist));
Collections.replaceAll(list, "one", "Yo");
System.out.println("replaceAll: " + list);
Collections.reverse(list);
System.out.println("reverse: " + list);
Collections.rotate(list, 3);
System.out.println("rotate: " + list);
List source =
Arrays.asList("in the matrix".split(" "));
Collections.copy(list, source);
System.out.println("copy: " + list);
Collections.swap(list, 0, list.size() - 1);
System.out.println("swap: " + list);
Collections.fill(list, "pop");
System.out.println("fill: " + list);
List dups = Collections.nCopies(3, "snap");
System.out.println("dups: " + dups);
// Getting an old-style Enumeration:
Enumeration e = Collections.enumeration(dups);
Vector v = new Vector();
while(e.hasMoreElements())
v.addElement(e.nextElement());
// Converting an old-style Vector
// to a List via an Enumeration:
ArrayList arrayList = Collections.list(v.elements());
System.out.println("arrayList: " + arrayList);
monitor.expect(new String[] {
"[one, Two, three, Four, five, six, one]",
"max: three",
"min: Four",
"max w/ comparator: Two",
"min w/ comparator: five",
"indexOfSubList: 3",
"lastIndexOfSubList: 3",
"replaceAll: [Yo, Two, three, Four, five, six, Yo]",
"reverse: [Yo, six, five, Four, three, Two, Yo]",
"rotate: [three, Two, Yo, Yo, six, five, Four]",
"copy: [in, the, matrix, Yo, six, five, Four]",
"swap: [Four, the, matrix, Yo, six, five, in]",
"fill: [pop, pop, pop, pop, pop, pop, pop]",
"dups: [snap, snap, snap]",
"arrayList: [snap, snap, snap]"
});
}
} ///:~
The output explains the behavior of each utility method. Note the difference in min( )
and max( ) with the AlphabeticComparator because of capitalization. .
Often it is convenient to create a read-only version of a Collection
or Map. The Collections class allows you to do this by passing the original
container into a method that hands back a read-only version. There are four variations on
this method, one each for Collection (if you cant treat a Collection as
a more specific type), List, Set, and Map. This example shows the
proper way to build read-only versions of each:
//: c11:ReadOnly.java
// Using the Collections.unmodifiable methods.
import java.util.*;
import com.bruceeckel.util.*;
public class ReadOnly {
private static Collections2.StringGenerator gen =
Collections2.countries;
public static void main(String[] args) {
Collection c = new ArrayList();
Collections2.fill(c, gen, 25); // Insert data
c = Collections.unmodifiableCollection(c);
System.out.println(c); // Reading is OK
//! c.add("one"); // Can't change it
List a = new ArrayList();
Collections2.fill(a, gen.reset(), 25);
a = Collections.unmodifiableList(a);
ListIterator lit = a.listIterator();
System.out.println(lit.next()); // Reading is OK
//! lit.add("one"); // Can't change it
Set s = new HashSet();
Collections2.fill(s, gen.reset(), 25);
s = Collections.unmodifiableSet(s);
System.out.println(s); // Reading is OK
//! s.add("one"); // Can't change it
Map m = new HashMap();
Collections2.fill(m, Collections2.geography, 25);
m = Collections.unmodifiableMap(m);
System.out.println(m); // Reading is OK
//! m.put("Ralph", "Howdy!");
}
} ///:~
Calling the unmodifiable method for a particular type does not cause
compile-time checking, but once the transformation has occurred, any calls to methods that
modify the contents of a particular container will produce an UnsupportedOperationException.
.
In each case, you must fill the container with meaningful data before you make
it read-only. Once it is loaded, the best approach is to replace the existing reference
with the reference that is produced by the unmodifiable call. That way, you
dont run the risk of accidentally trying to change the contents once youve
made it unmodifiable. On the other hand, this tool also allows you to keep a modifiable
container as private within a class and to return a read-only reference to that
container from a method call. So, you can change it from within the class, but everyone
else can only read it. .
The synchronized keyword is an important part of the
subject of multithreading, a more complicated topic that
will not be introduced until Chapter 13. Here, I shall note only that the Collections
class contains a way to automatically synchronize an entire container. The syntax is
similar to the unmodifiable methods:
//: c11:Synchronization.java
// Using the Collections.synchronized methods.
import java.util.*;
public class Synchronization {
public static void main(String[] args) {
Collection c =
Collections.synchronizedCollection(new ArrayList());
List list =
Collections.synchronizedList(new ArrayList());
Set s = Collections.synchronizedSet(new HashSet());
Map m = Collections.synchronizedMap(new HashMap());
}
} ///:~
In this case, you immediately pass the new container through the appropriate
synchronized method; that way, theres no chance of accidentally exposing
the unsynchronized version. .
The Java containers also have a
mechanism to prevent more than one process from modifying the contents of a container. The
problem occurs if youre iterating through a container, and some other process steps
in and inserts, removes, or changes an object in that container. Maybe youve already
passed that object, maybe its ahead of you, maybe the size of the container shrinks
after you call size( )there are many scenarios for disaster. The Java
containers library incorporates a fail-fast mechanism that looks for any changes to
the container other than the ones your process is personally responsible for. If it
detects that someone else is modifying the container, it immediately produces a ConcurrentModificationException.
This is the fail-fast aspectit doesnt try to detect a problem
later on using a more complex algorithm. .
Its quite easy to see the fail-fast mechanism in operationall you have to
do is create an iterator and then add something to the collection that the iterator is
pointing to, like this:
//: c11:FailFast.java
// Demonstrates the "fail fast" behavior.
// {ThrowsException}
import java.util.*;
public class FailFast {
public static void main(String[] args) {
Collection c = new ArrayList();
Iterator it = c.iterator();
c.add("An object");
// Causes an exception:
String s = (String)it.next();
}
} ///:~
The exception happens because something is placed in the container after the
iterator is acquired from the container. The possibility that two parts of the program
could be modifying the same container produces an uncertain state, so the exception
notifies you that you should change your codein this case, acquire the iterator after
you have added all the elements to the container. .
Note that you cannot benefit from this kind of monitoring when youre accessing
the elements of a List using get( ). .
Its
possible to turn an array into a List with the Arrays.asList( ) method:
//: c11:Unsupported.java
// Sometimes methods defined in the
// Collection interfaces don't work!
// {ThrowsException}
import java.util.*;
public class Unsupported {
static List a = Arrays.asList(
"one two three four five six seven eight".split(" "));
static List a2 = a.subList(3, 6);
public static void main(String[] args) {
System.out.println(a);
System.out.println(a2);
System.out.println("a.contains(" + a.get(0) + ") = " +
a.contains(a.get(0)));
System.out.println("a.containsAll(a2) = " +
a.containsAll(a2));
System.out.println("a.isEmpty() = " + a.isEmpty());
System.out.println("a.indexOf(" + a.get(5) + ") = " +
a.indexOf(a.get(5)));
// Traverse backwards:
ListIterator lit = a.listIterator(a.size());
while(lit.hasPrevious())
System.out.print(lit.previous() + " ");
System.out.println();
// Set the elements to different values:
for(int i = 0; i < a.size(); i++)
a.set(i, "47");
System.out.println(a);
// Compiles, but won't run:
lit.add("X"); // Unsupported operation
a.clear(); // Unsupported
a.add("eleven"); // Unsupported
a.addAll(a2); // Unsupported
a.retainAll(a2); // Unsupported
a.remove(a.get(0)); // Unsupported
a.removeAll(a2); // Unsupported
}
} ///:~
Youll discover that only a portion of the Collection and List interfaces
are actually implemented. The rest of the methods cause the unwelcome appearance of
something called an UnsupportedOperationException. The Collection interfaceas
well as some of the other interfaces in the Java containers librarycontain
optional methods, which might or might not be supported in the
concrete class that implements that interface. Calling an unsupported method
causes an UnsupportedOperationException to indicate a programming error. .
What?!? you say, incredulous. The whole point of interfaces
and base classes is that they promise these methods will do something meaningful! This
breaks that promise; it says that not only will calling some methods not perform a
meaningful behavior, but they will stop the program! Type safety was just thrown out the
window! .
Its not quite that bad. With a Collection, List, Set, or Map,
the compiler still restricts you to calling only the methods in that interface, so
its not like Smalltalk (in which you can call any method for any object, and find
out only when you run the program whether your call does anything). In addition, most
methods that take a Collection as an argument only read from that Collectionall
the read methods of Collection are not optional. .
This approach prevents an explosion of interfaces in the design. Other designs for
container libraries always seem to end up with a confusing plethora of interfaces to
describe each of the variations on the main theme, and are thus difficult to learn.
Its not even possible to capture all of the special cases in interfaces,
because someone can always invent a new interface. The unsupported
operation approach achieves an important goal of the Java containers library: The
containers are simple to learn and use; unsupported operations are a special case that can
be learned later. For this approach to work, however: .
In the preceding example, Arrays.asList( ) produces
a List that is backed by a fixed-size array. Therefore, it makes sense that the
only supported operations are the ones that dont change the size of the array. If,
on the other hand, a new interface were required to express this different kind of
behavior (called, perhaps, FixedSizeList), it would throw open the door
to complexity, and soon you wouldnt know where to start when trying to use the
library. .
Note that you can always pass the result of Arrays.asList( ) as a
constructor argument to a List or Set in order to create a regular container
that allows the use of all the methods. .
The documentation for a method that takes a Collection, List, Set,
or Map as an argument should specify which of the optional methods must be
implemented. For example, sorting requires the set( ) and Iterator.set( )
methods, but not add( ) and remove( ). .
Unfortunately, a lot of code was written using the Java 1.0/1.1 containers, and even
new code is sometimes written using these classes. So although you should never use the
old containers when writing new code, youll still need to be aware of them. However,
the old containers were quite limited, so theres not that much to say about them.
(Since they are in the past, I will try to refrain from overemphasizing some of the
hideous design decisions.) .
The only self-expanding sequence in Java 1.0/1.1 was the Vector,
so it saw a lot of use. Its flaws are too numerous to describe here (see the first edition
of this book, available as a free download from www.BruceEckel.com). Basically, you
can think of it as an ArrayList with long, awkward method names. In the Java 2
container library, Vector was adapted so that it could fit as a Collection and
a List, so in the following example, the Collections2.fill( ) method is
successfully used. This turns out to be a bit perverse, as it may confuse some people into
thinking that Vector has gotten better, when it is actually included only to
support pre-Java 2 code. .
The Java 1.0/1.1 version of the iterator chose to invent a new name,
enumeration, instead of using a term that everyone was already familiar with.
The Enumeration interface is smaller than Iterator,
with only two methods, and it uses longer method names: boolean hasMoreElements( )
produces true if this enumeration contains more elements, and Object
nextElement( ) returns the next element of this enumeration if there are any more
(otherwise it throws an exception). .
Enumeration is only an interface, not an implementation, and even new libraries
sometimes still use the old Enumeration, which is unfortunate but generally
harmless. Even though you should always use Iterator when you can in your own code,
you must be prepared for libraries that want to hand you an Enumeration. .
In addition, you can produce an Enumeration for any Collection
by using the Collections.enumeration( ) method, as seen in this example:
//: c11:Enumerations.java
// Java 1.0/1.1 Vector and Enumeration.
import java.util.*;
import com.bruceeckel.util.*;
public class Enumerations {
public static void main(String[] args) {
Vector v = new Vector();
Collections2.fill(v, Collections2.countries, 100);
Enumeration e = v.elements();
while(e.hasMoreElements())
System.out.println(e.nextElement());
// Produce an Enumeration from a Collection:
e = Collections.enumeration(new ArrayList());
}
} ///:~
The Java 1.0/1.1 Vector has only an addElement( ) method, but fill( )
uses the add( ) method that was pasted on while Vector was being turned
into a List. To produce an Enumeration, you call elements( ),
then you can use it to perform a forward iteration. .
The last line creates an ArrayList and uses enumeration( ) to adapt
an Enumeration from the ArrayList Iterator. Thus, if you have old
code that wants an Enumeration, you can still use the new containers. .
As youve seen in the performance comparison in this chapter, the basic Hashtable is very similar to the HashMap, even down to
the method names. Theres no reason to use Hashtable instead of HashMap
in new code. .
The concept of the stack was introduced earlier, with the LinkedList. Whats
rather odd about the Java 1.0/1.1 Stack is that instead of using a Vector as a building block, Stack is
inherited from Vector. So it has all of the characteristics and behaviors of
a Vector plus some extra Stack behaviors. Its difficult to know
whether the designers consciously thought that this was an especially useful way of doing
things, or whether it was just a naïve design; in any event it was clearly not reviewed
before it was rushed into distribution, so this bad design is still hanging around
(but you should never use it). .
Heres a simple demonstration of Stack that pushes each line from a String
array:
//: c11:Stacks.java
// Demonstration of Stack Class.
import com.bruceeckel.simpletest.*;
import java.util.*;
import c08.Month;
public class Stacks {
private static Test monitor = new Test();
public static void main(String[] args) {
Stack stack = new Stack();
for(int i = 0; i < Month.month.length; i++)
stack.push(Month.month[i] + " ");
System.out.println("stack = " + stack);
// Treating a stack as a Vector:
stack.addElement("The last line");
System.out.println("element 5 = " +
stack.elementAt(5));
System.out.println("popping elements:");
while(!stack.empty())
System.out.println(stack.pop());
monitor.expect(new String[] {
"stack = [January , February , March , April , May "+
", June , July , August , September , October , " +
"November , December ]",
"element 5 = June ",
"popping elements:",
"The last line",
"December ",
"November ",
"October ",
"September ",
"August ",
"July ",
"June ",
"May ",
"April ",
"March ",
"February ",
"January "
});
}
} ///:~
Each line in the months array is inserted into the Stack with push( ),
and later fetched from the top of the stack with a pop( ). To make a point, Vector
operations are also performed on the Stack object. This is possible because, by
virtue of inheritance, a Stack is a Vector. Thus, all operations that
can be performed on a Vector can also be performed on a Stack, such as elementAt( ).
.
As mentioned earlier, you should use a LinkedList when you want stack behavior. .
A BitSet is used if you want to efficiently store a lot
of on-off information. Its efficient only from the standpoint of size; if
youre looking for efficient access, it is slightly slower than using an array of
some native type. .
In addition, the minimum size of the BitSet is that of a long: 64 bits.
This implies that if youre storing anything smaller, like 8 bits, a BitSet
will be wasteful; youre better off creating your own class, or just an array, to
hold your flags if size is an issue. .
A normal container expands as you add more elements, and the BitSet does this as
well. The following example shows how the BitSet works:
//: c11:Bits.java
// Demonstration of BitSet.
import java.util.*;
public class Bits {
public static void printBitSet(BitSet b) {
System.out.println("bits: " + b);
String bbits = new String();
for(int j = 0; j < b.size() ; j++)
bbits += (b.get(j) ? "1" : "0");
System.out.println("bit pattern: " + bbits);
}
public static void main(String[] args) {
Random rand = new Random();
// Take the LSB of nextInt():
byte bt = (byte)rand.nextInt();
BitSet bb = new BitSet();
for(int i = 7; i >= 0; i--)
if(((1 << i) & bt) != 0)
bb.set(i);
else
bb.clear(i);
System.out.println("byte value: " + bt);
printBitSet(bb);
short st = (short)rand.nextInt();
BitSet bs = new BitSet();
for(int i = 15; i >= 0; i--)
if(((1 << i) & st) != 0)
bs.set(i);
else
bs.clear(i);
System.out.println("short value: " + st);
printBitSet(bs);
int it = rand.nextInt();
BitSet bi = new BitSet();
for(int i = 31; i >= 0; i--)
if(((1 << i) & it) != 0)
bi.set(i);
else
bi.clear(i);
System.out.println("int value: " + it);
printBitSet(bi);
// Test bitsets >= 64 bits:
BitSet b127 = new BitSet();
b127.set(127);
System.out.println("set bit 127: " + b127);
BitSet b255 = new BitSet(65);
b255.set(255);
System.out.println("set bit 255: " + b255);
BitSet b1023 = new BitSet(512);
b1023.set(1023);
b1023.set(1024);
System.out.println("set bit 1023: " + b1023);
}
} ///:~
The random number generator is used to create a random byte, short, and int,
and each one is transformed into a corresponding bit pattern in a BitSet. This
works fine because a BitSet is 64 bits, so none of these cause it to increase in
size. Then a BitSet of 512 bits is created. The constructor allocates storage for
twice that number of bits. However, you can still set bit 1024 or greater. .
To review the containers provided in the standard Java library: .
The containers are tools that you can use on a day-to-day basis to make your programs
simpler, more powerful, and more effective. .
Solutions to selected exercises can be found in the electronic document The Thinking
in Java Annotated Solution Guide, available for a small fee from www.BruceEckel.com.
[51] Its
possible, however, to ask how big the vector is, and the at( ) method does
perform bounds checking.
[52] This is one of
the places where C++ is distinctly superior to Java, since C++ supports parameterized
types with the template keyword.
[53] The C++
programmer will note how much the code could be collapsed with the use of default
arguments and templates. The Python programmer will note that this entire library would be
largely unnecessary in that language.
[54] Design
Patterns, Erich Gamma et al., Addison-Wesley 1995.
[55] By Joshua Bloch
at Sun.
[56] This data was
found on the Internet, then processed by creating a Python program (see www.Python.org).
[57] This is a place
where operator overloading would be nice.
[58] If these
speedups still dont meet your performance needs, you can further accelerate table
lookup by writing your own Map and customizing it to your particular types to avoid
delays due to casting to and from Objects. To reach even higher levels of
performance, speed enthusiasts can use Donald Knuths The Art of Computer
Programming, Volume 3: Sorting and Searching, Second Edition to replace overflow
bucket lists with arrays that have two additional benefits: they can be optimized for disk
storage characteristics and they can save most of the time of creating and garbage
collecting individual records.
[59] As it turns out,
a prime number is not actually the ideal size for hash buckets, and recent hashed
implementations in Java uses a power of two size (after extensive testing). Division or
remainder is the slowest operation on a modern processor. With a power-of-two hash table
length, masking can be used instead of division. Since get( ) is by far the
most common operation, the % is a large part of the cost, and the power-of-two approach
eliminates this (but may also affect some hashCode( ) methods).
[60] In a private
message, Joshua Bloch wrote: ... I believe that we erred by allowing implementation
details (such as hash table size and load factor) into our APIs. The client should perhaps
tell us the maximum expected size of a collection, and we should take it from there.
Clients can easily do more harm than good by choosing values for these parameters. As an
extreme example, consider Vectors capacityIncrement. No one should ever set
this, and we shouldnt have provided it. If you set it to any non-zero value, the
asymptotic cost of a sequence of appends goes from linear to quadratic. In other words, it
destroys your performance. Over time, were beginning to wise up about this sort of
thing. If you look at IdentityHashMap, youll see that it has no low-level
tuning parameters.
|
|
|