Monday, January 18, 2010

Typesafe maps for arbitrary content

Data containers like Vector and Hashtable have been part of Java since the very beginning. The problem is that they are bags for arbitrary data, meaning the type of data they deal with is the universal superclass Object. That's no problem when you store values, but on retrieval you need to know the type you put in and cast the value back to it. Then generics were introduced, and all collection classes were refactored to be usable in a type-safe manner. You were now able to restrict a collection to one type (or, in the case of maps, two), and the number of casts dropped dramatically.

But sometimes you need a container for arbitrary data. Typical usage scenarios are context or configuration data. An example is OSGi's BundleContext.registerService() method, which uses an untyped Dictionary to provide config data. So both the party that provides the data and the party that consumes it need to know the type of each value. This is usually some kind of contract: parameter host has type String, port is an Integer. A neat API will at least provide constants for the keys, e.g.
public static final String PARAM_HOST = "host";
public static final String PARAM_PORT = "port";
So providing and consuming a value would look something like this:
Dictionary properties = new Properties();
properties.put(PARAM_HOST, "www.blogger.com");
...
String host = (String)properties.get(PARAM_HOST);
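To make the risk of such a contract concrete, here is a minimal sketch of a consumer that gets the type wrong. The class name UntypedContractDemo and the helper readPortAsString() are made up for illustration, and a plain Hashtable stands in for whatever Dictionary a real API would hand over.

```java
import java.util.Dictionary;
import java.util.Hashtable;

public class UntypedContractDemo {
    public static final String PARAM_HOST = "host";
    public static final String PARAM_PORT = "port";

    // Hypothetical consumer that wrongly assumes the port was stored as a String.
    static String readPortAsString(Dictionary<String, Object> properties) {
        return (String) properties.get(PARAM_PORT); // compiles fine, fails at runtime
    }

    public static void main(String[] args) {
        Dictionary<String, Object> properties = new Hashtable<String, Object>();
        properties.put(PARAM_HOST, "www.blogger.com");
        properties.put(PARAM_PORT, 80); // autoboxed to Integer

        String host = (String) properties.get(PARAM_HOST); // cast matches the contract
        System.out.println(host);

        try {
            readPortAsString(properties);
        } catch (ClassCastException e) {
            // The compiler could not help: the mismatch only shows up here.
            System.out.println("wrong cast detected only at runtime");
        }
    }
}
```

The compiler accepts both casts; only the agreement between provider and consumer keeps the second one from blowing up.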
But if we know that each key is associated with a certain type, why not let the key provide that type? With the help of generics we can define a typed key class:
public class TypedKey<T> {}
Now we are able to define typed keys:
public static final TypedKey<String> PARAM_HOST = new TypedKey<String>();
public static final TypedKey<Integer> PARAM_PORT = new TypedKey<Integer>();
What's left is a kind of container that deals with this kind of key. Since there's no need to create a full-blown map, we'll create some tiny class called TypedData. What would getValue() look like? Well, what we'd like to have is that the return type is specified by the type of the key:
public class TypedData {
    ...
    public <T> T getValue(TypedKey<T> key) { ... }
It's the same thing for setValue(). The type of the value to set is given by the key:
    public <T> void setValue(TypedKey<T> key, T value) { ... }
Here comes the implementation:
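import java.util.HashMap;
import java.util.Map;

public class TypedData {
    // Keys of any value type share one map, so the values are stored as plain Objects.
    private final Map<TypedKey<?>, Object> map = new HashMap<TypedKey<?>, Object>();

    public <T> void setValue(TypedKey<T> key, T value) {
        map.put(key, value);
    }

    @SuppressWarnings("unchecked")
    public <T> T getValue(TypedKey<T> key) {
        return (T) map.get(key);
    }
}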
Since we want to store multiple types, the map must accept any kind of TypedKey as key and Object as value. The implementation of setValue() is easy. In getValue() we have to cast the value (that's why we need to suppress the unchecked warning), but it's safe to do so, since the signature of setValue() guarantees that the type of a stored value always matches the one associated with its key. Now let's try it:
TypedData data = new TypedData();
data.setValue(PARAM_HOST, "www.blogger.com");
data.setValue(PARAM_PORT, 80);
...
String value = data.getValue(PARAM_HOST);
Feels good, eh? No cast necessary, just as if the map were fully typed. If we accidentally use the wrong type, the compiler complains.
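As a minimal sketch of that compile-time check, the following single file nests small copies of TypedKey and TypedData so that it can be compiled on its own. The wrapper class TypedDataDemo is made up for illustration; the two commented-out lines are the kind of mistake the compiler rejects.

```java
import java.util.HashMap;
import java.util.Map;

public class TypedDataDemo {
    // Copies of the classes from this post, nested only to keep the sketch in one file.
    static class TypedKey<T> {}

    static class TypedData {
        private final Map<TypedKey<?>, Object> map = new HashMap<TypedKey<?>, Object>();

        public <T> void setValue(TypedKey<T> key, T value) {
            map.put(key, value);
        }

        @SuppressWarnings("unchecked")
        public <T> T getValue(TypedKey<T> key) {
            return (T) map.get(key);
        }
    }

    static final TypedKey<String> PARAM_HOST = new TypedKey<String>();
    static final TypedKey<Integer> PARAM_PORT = new TypedKey<Integer>();

    public static void main(String[] args) {
        TypedData data = new TypedData();
        data.setValue(PARAM_HOST, "www.blogger.com");
        data.setValue(PARAM_PORT, 80);

        String host = data.getValue(PARAM_HOST);  // T is taken from the key: String
        Integer port = data.getValue(PARAM_PORT); // T is taken from the key: Integer
        System.out.println(host + ":" + port);

        // Each of these lines fails to compile if uncommented:
        // data.setValue(PARAM_PORT, "80");            // a String value for an Integer key
        // Integer wrong = data.getValue(PARAM_HOST);  // a String result assigned to an Integer
    }
}
```

In both rejected lines the key fixes T, so the mismatch is caught before the program ever runs.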



And what's even more useful is that your IDE can now deduce the correct type and give you content assist.



So, it looks good, it feels good. And it is type-safe! Sure about that? You'd better doubt it, because there is still a weakness we have to eliminate. Remember: we made a cast in getValue() which we considered to be safe, since the type is associated with the key. We rely on the fact that two keys with different types retrieve different values, in other words, that keys are unique. But what if they are not? A HashMap identifies its keys by their hashCode() and equals() methods, so let's make a naive implementation:
public class TypedKey<T> {

    @Override
    public boolean equals(Object obj) {
        return true;
    }

    @Override
    public int hashCode() {
        return 3;
    }
}
Ridiculous, I know. But so far nothing prevents us from inheriting from TypedKey and doing something funny like that. So let's give it a try:
TypedData data = new TypedData();
data.setValue(PARAM_HOST, "www.blogger.com");
data.setValue(PARAM_PORT, 80);
String value = data.getValue(PARAM_HOST);
What will now happen if we run that code? Tada:
Exception in thread "main" java.lang.ClassCastException:
java.lang.Integer cannot be cast to java.lang.String
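The failure can also be reproduced without touching TypedKey itself. The following sketch uses a subclass, CollidingKey, which is a made-up name and not part of the original code, to override equals() and hashCode() in the same naive way. TypedKey and TypedData are nested copies so that the file compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

public class CollidingKeyDemo {
    // TypedKey as first defined: equals() and hashCode() can still be overridden.
    static class TypedKey<T> {}

    // Hypothetical subclass that makes all keys look identical to a HashMap.
    static class CollidingKey<T> extends TypedKey<T> {
        @Override
        public boolean equals(Object obj) {
            return true;
        }

        @Override
        public int hashCode() {
            return 3;
        }
    }

    static class TypedData {
        private final Map<TypedKey<?>, Object> map = new HashMap<TypedKey<?>, Object>();

        public <T> void setValue(TypedKey<T> key, T value) {
            map.put(key, value);
        }

        @SuppressWarnings("unchecked")
        public <T> T getValue(TypedKey<T> key) {
            return (T) map.get(key); // unchecked: no type check happens here
        }
    }

    static final TypedKey<String> PARAM_HOST = new CollidingKey<String>();
    static final TypedKey<Integer> PARAM_PORT = new CollidingKey<Integer>();

    public static void main(String[] args) {
        TypedData data = new TypedData();
        data.setValue(PARAM_HOST, "www.blogger.com");
        data.setValue(PARAM_PORT, 80); // replaces the host entry: the keys are "equal"

        try {
            // The compiler-inserted cast to String at this call site is what fails.
            String value = data.getValue(PARAM_HOST);
            System.out.println(value);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: the stored value is an Integer");
        }
    }
}
```

Note where the exception surfaces: not inside getValue(), whose cast is erased, but at the assignment in the calling code.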
Ooops. So we have to make sure that every key is unique. The implementations of hashCode() and equals() in Object are based on object identity, which is exactly what we need. So let's make sure nobody can tweak that by making them final:
public class TypedKey<T> {

    @Override
    public final int hashCode() {
        return super.hashCode();
    }

    @Override
    public final boolean equals(Object obj) {
        return super.equals(obj);
    }
}
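To check that the final methods really close the hole, here is a small sketch in one file. NamedKey and its name field are hypothetical additions, used only to show that subclassing remains possible while the identity semantics stay fixed; TypedData is a nested copy of the class shown earlier.

```java
import java.util.HashMap;
import java.util.Map;

public class FinalKeyDemo {
    static class TypedKey<T> {
        @Override
        public final int hashCode() {
            return super.hashCode();
        }

        @Override
        public final boolean equals(Object obj) {
            return super.equals(obj);
        }
    }

    // Hypothetical subclass: it may add state, but it cannot redefine equality.
    static class NamedKey<T> extends TypedKey<T> {
        final String name;

        NamedKey(String name) {
            this.name = name;
        }

        // Does not compile if uncommented, because equals() is final in TypedKey:
        // @Override public boolean equals(Object obj) { return true; }
    }

    static class TypedData {
        private final Map<TypedKey<?>, Object> map = new HashMap<TypedKey<?>, Object>();

        public <T> void setValue(TypedKey<T> key, T value) {
            map.put(key, value);
        }

        @SuppressWarnings("unchecked")
        public <T> T getValue(TypedKey<T> key) {
            return (T) map.get(key);
        }
    }

    public static void main(String[] args) {
        TypedKey<String> host = new NamedKey<String>("host");
        TypedKey<Integer> port = new NamedKey<Integer>("port");

        TypedData data = new TypedData();
        data.setValue(host, "www.blogger.com");
        data.setValue(port, 80);

        // Two distinct key objects are never equal, so neither entry was overwritten.
        System.out.println(host.equals(port));
        String h = data.getValue(host);
        Integer p = data.getValue(port);
        System.out.println(h + ":" + p);
    }
}
```

Because equality is pinned to object identity, a value can only be read back through the very key object it was stored under, and that key carries the matching type.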
Now we are done :-)

That's it for today
Ralf


P.S. Some remarks on the implementation:
My first idea was to use an interface for the key definition and enums (which implement that interface) for key constants. I like enums as constants since they are lightweight and type-safe. Alas, you can't use generics with enums: an enum cannot declare type parameters, so its constants cannot each carry their own value type. Since the sole reason for the interface was that you cannot inherit from enums, I dropped the interface and used a class instead.
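A rough sketch of where the enum idea gets stuck, with all names (Key, StringKeys, EnumKeyLimitation) invented for illustration: an enum can implement a generic interface, but only with one fixed type argument for all of its constants.

```java
public class EnumKeyLimitation {
    // Hypothetical key interface, as in the abandoned first design.
    interface Key<T> {}

    // Legal, but every constant of this enum is a Key<String>.
    enum StringKeys implements Key<String> { HOST, USER }

    // Not legal Java: an enum cannot declare a type parameter,
    // so constants with different value types cannot live in one enum.
    // enum Keys<T> implements Key<T> { HOST, PORT }

    public static void main(String[] args) {
        Key<String> host = StringKeys.HOST;
        System.out.println(host);
    }
}
```

Mixing String and Integer keys would therefore need one enum per value type, which is why a plain generic class is the simpler choice here.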