Share

From this page you can share Making the Most of Java 5: Generics to a social bookmarking site or email a link to the page.
Social WebE-mail
Enter multiple addresses on separate lines or separate them with commas.
Making the Most of Java 5: Generics
(Your Name) has forwarded a page to you from Ajaxonomy
(Your Name) thought you would like to see this page from the Ajaxonomy web site.

Making the Most of Java 5: Generics

Tagged:  

In the first two articles of this series on Java 5, I explored enums and annotations, the no-brainer enhancements to Java that everyone should use. Now I am going to tackle the more challenging addition to the Java language, generics. In the aftermath of all the changes to the core language that took place in Java 5, generics has emerged as the most problematic, holding significant expressive power but carrying with it a lot of baggage. This baggage has come in the form of many compiler warnings, unexpected compiler errors, and other surprises. Yet, used correctly, generics are very powerful and can actually produce much cleaner code than was possible previously.

Introduction

The simplest explanation of generics is that they are variables to types (which are bound at compile-time). Quite often in object-oriented programming, there arises a need to represent a "generic" type reference, for which the actual type can be just about anything. A classic example of this are objects in java.util.List: a list can contain any type of object, and so its accessor methods often are represented by an Object return value. This led to the frequent need to cast the return value from these types of methods, a process prone to the dreaded ClassCastException. Generics get rid of the need to do unsafe casting--at the bytecode level, casting is still taking place, but it is guaranteed to be safe by the compiler.

Generics make code clearer. Take for example the following method signature: public List getThings(). What type is contained in the returned List? Without documentation, or code to test it, it would be difficult to tell. With a method like public List<String> getThings(), however, it is clear that the returned structure contains String objects.

Generics enforce type safety, since calling an add("Hello World!") on a List<Integer> produces a compile error. In non-generic code, this would not produce an error.

History

The origin of generics in Java came from an experimental language that extended Java, GJ (Generic Java). Inspired by templates in the C++ programming language, but with significant differences, GJ was the testing ground for the concepts that ultimately became the generics implementation in Java 5. The process of adding generics to Java was done as a Java Community Process (JCP) as the JSR-014 proposal.

How Generics Work

The key improvement in Java's implementation of generics (vs. C++ templates) is that Java generics are much more type safe, i.e. Java can enforce type relationships whereas C++ could not. This is because generic types in Java are actually compiled into the .class file rather than text-substituted by a pre-compiler. This means that Collection<A> is a different type than Collection<B>(at least at compile-time), and this difference is enforced by the compiler. But in a strange (and some would say cruel) twist of fate, this generic type information is erased at compile-time--a process referred to as type erasure--when the .java file is compiled (though generic information does seem to be kept in the class file metadata area). Type erasure was done to maintain backward compatibility with older Java code, and is the key source of controversy in the generics implementation, as it is the source of many of the problems and corner cases that generics add to the language. Allowing generic types to be kept at runtime (so-called reified generics) is one of the proposals on the table for Java 7.

The Syntax of Generics

Java allows both classes and methods to be genericized. The appropriate declarations are done using angled brackets containing a comma-separated list of type parameters, like <A,B,C> (by convention, generic variables are written as a single capitalized letter, but any valid variable name can be used). Generic declarations occur after the class name in a class declaration but before the return type in a method declaration. Here is an example using both:

public class Structure<T> {
     public T get() { ... }
     public <G> G getDifferent() { ... }
}

Generics can be nested, like Map<String,Collection<Object>>.

Generics can also include wildcards using the ? symbol. Why is this needed, since generic types are already variables? The answer is because generics are not covariant (unlike arrays). That is to say, Collection<Object> c = new ArrayList<String>() will not compile because c.add(new Integer(1)) is perfectly legal but obviously an Integer is not a String. The problem is solved with wildcards, since Collection<?> can be the supertype of any Collection.

Generics can specify a range of types (bounded wildcards or type variables), either subtypes

     List<? extends Node> or List<T extends Node>

or supertypes

     List<? super Element>

(Note: super can only be used with bounded wildcards.)

Type Tokens

Because of type erasure, it is impossible to (accurately) get the Class of a generic type at runtime. T.class will not compile and calling getClass() on a generic type returns Class<? extends Object>. But sometimes the Class is needed to correctly cast (the cast() method was added to Class in Java 5, and Class itself was genericized) or newInstance() an object of a generic type. This is where passing in the Class as a "type token" comes in. For example,

public <T> T createObject(Class<T> clazz) {
     return clazz.newInstance();
}

Here you are guaranteed that the return type will be of type T, whereas in J2SE 1.4 and earlier, this was not true. In order to illustrate this point let's look at the equivalent non-generic code:

public MyType createObject(Class clazz) {
     return (MyType) clazz.newInstance();
}

The cast here is not guaranteed to succeed. If the caller invoked this method using createObject(String.class), a ClassCastException would be raised.

Super Type Tokens

You cannot create type tokens for generic classes because of type erasure, i.e. List<String>.class will not compile. There is a "backdoor" solution to this problem because, actually, not all generic information is erased at runtime. The Class method called getGenericSuperclass() allows the differentiation between, say, between List<String> and List<Integer> at runtime using the "super type token" concept. It works like this:

public abstract class TypeReference<T> {
     private final Type type;
     protected TypeReference() {
          Type superclass = getClass().getGenericSuperclass();
          this.type = ((ParameterizedType) superclass)
             .getActualTypeArguments()[0];
     }
     // equals() and hashCode() base on Type
}

A "super type token" would be created using an empty subclass of TypeReference, like

new TypeReference<List<String>>() {}

This would be differentiable from a TypeReference<List<Integer>> at runtime. For a more thorough implementation, see the TypeLiteral class from Google Guice.

Annoyances

There are many corner cases and *gotchas* in generics. Here are a few of my favorites.

  • Generics can make your code very verbose. The classic example of this is the declaration and initialization of a Map:
    Map<String,Collection<Number>> map =
        = new HashMap<String,Collection<Number>>();
  • Too many generic declarations and/or overuse of bounded types can make code difficult to read.
  • The static getClass() method on any Java object returns a Class<? extends $type>, not Class<$type> (where $type is the type on which the method is called). This means, for example, the code Class<String> c = "Hello World!".getClass() does not compile because Class<? extends String> is not assignable to Class<String>. This tends to force you into using type tokens, often having to add Class<T> to method signatures where you would not have done so otherwise.
  • Wildcards and type variables don't mix. The generic type Collection<? extends Number> is not the same as Collection<T extends Number>, so you must be careful in how use the two, especially in method signatures.
  • Arrays and generics don't mix. The general advice here is to use generic Collections rather than arrays.
  • Sometimes, because of generics, you will have to do an unsafe cast from one generic type to another. It will happen.
  • @SuppressWarnings is, if not your best friend, at least a close companion (but try to use it only if you can prove that what you are doing is safe).

Conclusion

Generics are a powerful and, by and large, welcome feature in Java 5. Using them does improve your code. But type erasure opens up a big hole in the Java type system that is created between the bytecode compiler and the runtime compiler. This could be solved by the reification of generics, but reification breaks backward compatibility with Java classes written against pre-Java 5 compilers. A suggested solution has been to introduce optionally reified generics (using something like List<@String> syntax), but I think that having two different types of generics may traumatize the Java type system (and Java programmers) even further. A better decision might be to simply break backward compatibility at some point and let programmers decide if they need reification enough to forgo compatibility.

Links

The Generics Tutorial, Gilad Bracha.
Effective Java Reloaded (JavaOne), Joshua Bloch.
Super Type Tokens, Neal Gafter
Limitations of Super Type Tokens, Neal Gafter
Java Generics FAQ, Angelika Langer.