In the first two articles of this series on Java 5, I explored enums and annotations, the no-brainer enhancements to Java that everyone should use. Now I am going to tackle the more challenging addition to the Java language, generics. In the aftermath of all the changes to the core language that took place in Java 5, generics has emerged as the most problematic, holding significant expressive power but carrying with it a lot of baggage. This baggage has come in the form of many compiler warnings, unexpected compiler errors, and other surprises. Yet, used correctly, generics are very powerful and can actually produce much cleaner code than was possible previously.
Introduction
The simplest explanation of generics is that they are variables to types (which are bound at compile-time). Quite often in object-oriented programming, there arises a need to represent a "generic" type reference, for which the actual type can be just about anything. A classic example of this are objects in java.util.List: a list can contain any type of object, and so its accessor methods often are represented by an Object return value. This led to the frequent need to cast the return value from these types of methods, a process prone to the dreaded ClassCastException. Generics get rid of the need to do unsafe casting--at the bytecode level, casting is still taking place, but it is guaranteed to be safe by the compiler.
Generics make code clearer. Take for example the following method signature: public List getThings(). What type is contained in the returned List? Without documentation, or code to test it, it would be difficult to tell. With a method like public List<String> getThings(), however, it is clear that the returned structure contains String objects.
Generics enforce type safety, since calling an add("Hello World!") on a List<Integer> produces a compile error. In non-generic code, this would not produce an error.
History
The origin of generics in Java came from an experimental language that extended Java, GJ (Generic Java). Inspired by templates in the C++ programming language, but with significant differences, GJ was the testing ground for the concepts that ultimately became the generics implementation in Java 5. The process of adding generics to Java was done as a Java Community Process (JCP) as the JSR-014 proposal.
How Generics Work
The key improvement in Java's implementation of generics (vs. C++ templates) is that Java generics are much more type safe, i.e. Java can enforce type relationships whereas C++ could not. This is because generic types in Java are actually compiled into the .class file rather than text-substituted by a pre-compiler. This means that Collection<A> is a different type than Collection<B>(at least at compile-time), and this difference is enforced by the compiler. But in a strange (and some would say cruel) twist of fate, this generic type information is erased at compile-time--a process referred to as type erasure--when the .java file is compiled (though generic information does seem to be kept in the class file metadata area). Type erasure was done to maintain backward compatibility with older Java code, and is the key source of controversy in the generics implementation, as it is the source of many of the problems and corner cases that generics add to the language. Allowing generic types to be kept at runtime (so-called reified generics) is one of the proposals on the table for Java 7.
The Syntax of Generics
Java allows both classes and methods to be genericized. The appropriate declarations are done using angled brackets containing a comma-separated list of type parameters, like <A,B,C> (by convention, generic variables are written as a single capitalized letter, but any valid variable name can be used). Generic declarations occur after the class name in a class declaration but before the return type in a method declaration. Here is an example using both:
public class Structure<T> {
public T get() { ... }
public <G> G getDifferent() { ... }
}
Generics can be nested, like Map<String,Collection<Object>>.
Generics can also include wildcards using the ? symbol. Why is this needed, since generic types are already variables? The answer is because generics are not covariant (unlike arrays). That is to say, Collection<Object> c = new ArrayList<String>() will not compile because c.add(new Integer(1)) is perfectly legal but obviously an Integer is not a String. The problem is solved with wildcards, since Collection<?> can be the supertype of any Collection.
Generics can specify a range of types (bounded wildcards or type variables), either subtypes
List<? extends Node> or List<T extends Node>
or supertypes
List<? super Element>
(Note: super can only be used with bounded wildcards.)
Type Tokens
Because of type erasure, it is impossible to (accurately) get the Class of a generic type at runtime. T.class will not compile and calling getClass() on a generic type returns Class<? extends Object>. But sometimes the Class is needed to correctly cast (the cast() method was added to Class in Java 5, and Class itself was genericized) or newInstance() an object of a generic type. This is where passing in the Class as a "type token" comes in. For example,
public <T> T createObject(Class<T> clazz) {
return clazz.newInstance();
}
Here you are guaranteed that the return type will be of type T, whereas in J2SE 1.4 and earlier, this was not true. In order to illustrate this point let's look at the equivalent non-generic code:
public MyType createObject(Class clazz) {
return (MyType) clazz.newInstance();
}
The cast here is not guaranteed to succeed. If the caller invoked this method using createObject(String.class), a ClassCastException would be raised.
Super Type Tokens
You cannot create type tokens for generic classes because of type erasure, i.e. List<String>.class will not compile. There is a "backdoor" solution to this problem because, actually, not all generic information is erased at runtime. The Class method called getGenericSuperclass() allows the differentiation between, say, between List<String> and List<Integer> at runtime using the "super type token" concept. It works like this:
public abstract class TypeReference<T> {
private final Type type;
protected TypeReference() {
Type superclass = getClass().getGenericSuperclass();
this.type = ((ParameterizedType) superclass)
.getActualTypeArguments()[0];
}
// equals() and hashCode() base on Type
}
A "super type token" would be created using an empty subclass of TypeReference, like
new TypeReference<List<String>>() {}This would be differentiable from a TypeReference<List<Integer>> at runtime. For a more thorough implementation, see the TypeLiteral class from Google Guice.
Annoyances
There are many corner cases and *gotchas* in generics. Here are a few of my favorites.
- Generics can make your code very verbose. The classic example of this is the declaration and initialization of a
Map:
Map<String,Collection<Number>> map = = new HashMap<String,Collection<Number>>(); - Too many generic declarations and/or overuse of bounded types can make code difficult to read.
- The static
getClass()method on any Java object returns aClass<? extends $type>, notClass<$type>(where $type is the type on which the method is called). This means, for example, the codeClass<String> c = "Hello World!".getClass()does not compile becauseClass<? extends String>is not assignable toClass<String>. This tends to force you into using type tokens, often having to addClass<T>to method signatures where you would not have done so otherwise. - Wildcards and type variables don't mix. The generic type
Collection<? extends Number>is not the same asCollection<T extends Number>, so you must be careful in how use the two, especially in method signatures. - Arrays and generics don't mix. The general advice here is to use generic Collections rather than arrays.
- Sometimes, because of generics, you will have to do an unsafe cast from one generic type to another. It will happen.
- @SuppressWarnings is, if not your best friend, at least a close companion (but try to use it only if you can prove that what you are doing is safe).
Conclusion
Generics are a powerful and, by and large, welcome feature in Java 5. Using them does improve your code. But type erasure opens up a big hole in the Java type system that is created between the bytecode compiler and the runtime compiler. This could be solved by the reification of generics, but reification breaks backward compatibility with Java classes written against pre-Java 5 compilers. A suggested solution has been to introduce optionally reified generics (using something like List<@String> syntax), but I think that having two different types of generics may traumatize the Java type system (and Java programmers) even further. A better decision might be to simply break backward compatibility at some point and let programmers decide if they need reification enough to forgo compatibility.
Links
The Generics Tutorial, Gilad Bracha.
Effective Java Reloaded (JavaOne), Joshua Bloch.
Super Type Tokens, Neal Gafter
Limitations of Super Type Tokens, Neal Gafter
Java Generics FAQ, Angelika Langer.
Delicious
Digg
StumbleUpon
Propeller
Reddit
Magnoliacom
Newsvine
Furl
Facebook
Google
Yahoo
Technorati
Icerocket
Recent comments
4 weeks 1 day ago
5 weeks 3 days ago
6 weeks 4 days ago
6 weeks 5 days ago
7 weeks 4 hours ago
7 weeks 2 days ago
7 weeks 3 days ago
9 weeks 5 days ago
10 weeks 21 hours ago
10 weeks 23 hours ago