2010-12-31

Dictionaries

Nearly every non-trivial application makes use of dictionaries, a.k.a. maps, associative arrays, hashes, lookup tables. What are some useful and convenient forms for these objects?

Types of Dictionaries

A look at the Java Collections and Concurrency APIs shows that there are quite a few kinds of dictionaries out there. Among them (note that these types are not mutually exclusive) are:

  • HashMap
  • SortedMap
  • TreeMap
  • LinkedHashMap
  • ConcurrentMap
  • NavigableMap
  • SkipListMap
  • IdentityHashMap
  • EnumMap
  • WeakHashMap

Literal Forms

If you've only coded in Java, C, Fortran, Pascal, Ada, or similar languages, you're missing out on the fun of writing dictionary literals. Here is a sampling of syntaxes:
{"CA":"Sacramento", "HI":"Honolulu", "OR": "Salem", "MA":"Boston"}
{CA:"Sacramento", HI:"Honolulu", OR: "Salem", MA:"Boston"}
{"CA" => "Sacramento", "HI" => "Honolulu", "OR" => "Salem", "MA" =>"Boston"}
{CA => "Sacramento", HI => "Honolulu", OR => "Salem", MA => "Boston"}
{:CA => "Sacramento", :HI => "Honolulu", :OR => "Salem", :MA => "Boston"}

The colon as a separator is probably the most popular; it appears in well-known data interchange formats like JSON and YAML. The => is called a hashrocket, because dictionaries are sometimes called hashes. (They shouldn't be called hashes in general; that term should only be used when you know that internally the dictionary uses a hash table for storage. Often you'll do fine with a search tree or a plain old array.)

Coping without Literals

Some languages do not allow literal forms for dictionaries. What can you do? In a dyanmically typed language, some kind of constructor taking a string with delimiters works pretty well:

def dictionary (s)
  d = Dictionary.new;
  pairs = s.split("|")
  for pair in pairs
    key, value = pair.split(",")
    d.put(key, value)
  end
  return d
end

states = dictionary("CA,Sacramento|HI,Honolulu|OR,Salem|MA,Boston");

Works for this one case, but in general the delimiters should be parameters...


Java programmers can combine static initializers with anonymous subclasses to approximate literals:

Map<String, String> states = new HashMap<String, String>() {
    {
        put("CA", "Sacramento");
        put("HI", "Honolulu");
        put("OR", "Salem");
        put("MA", "Boston");
    }
};

Testing Key Equality

TODO

Static Typing

TODO

Multimaps

TODO