Generic collections in C#


16 July 2012, by

Over the next few weeks I’ll be publishing a short series of posts introducing Language Integrated Query (LINQ). However, to use LINQ well it’s important to have a basic understanding of generics, so my first post covers that topic. For nostalgia value, I’ve included a little bit of history from the heady days of .NET 1.1, and some advice for anyone who’s coming back to .NET programming having not used it since those days.

What are generics?

Essentially, generics allow you to create not a single type, but a whole family of parametrised types: you write the type once, leave one or more of the types it works with as parameters, and fill those in when you use it. I’ve struggled to find a good general explanation of this principle – if you know of one please let me know in the comments below! – but this Wikipedia article on Generic Programming does cover the ground quite well, if in rather too much detail.

The best way to understand generics is to look at their use in the archetypical case – generic collections.

An ArrayList, and some beer

The simplest example of generics in action is a List. In the Bad Old Days of .NET 1.1, you might have had some code like this:

public ArrayList GetBeerList()
{
  ArrayList availableBeers = new ArrayList();

  foreach (Pub pub in KentishTown)
  {
    foreach (Beer beer in pub.Beers)
    {
      if (beer.IsAvailable)
      {
        availableBeers.Add(beer);
      }
    }
  }

  return availableBeers;
}

public void TasteABeer()
{
  ArrayList beers = GetBeerList();
  Beer beer = (Beer)beers[0];
  beer.Taste();
}

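This code works perfectly well, but note that the TasteABeer method has no way of knowing at compile time that beers contains Beer objects rather than anything else. If something else has found its way into the list, the cast fails at run time with an InvalidCastException. This is a bad thing. Back in the day I used to get round this by returning an array instead: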

return (Beer[])availableBeers.ToArray(typeof(Beer));

But that’s not really much better – the GetBeerList method still uses the evil ArrayList, and moreover you’ve ended up with a fixed-length array of beers that you can’t add entries to when you move on from Kentish Town to Camden.
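To make both problems concrete, here is a minimal sketch. The Beer class is my own stand-in (the real one isn’t shown in this post), and the string added to the list is there purely to provoke the failure.

```csharp
using System;
using System.Collections;

// Hypothetical stand-in for the Beer class used in this post.
public class Beer
{
    public string Name;
    public Beer(string name) { Name = name; }
}

public static class ArrayListDemo
{
    public static void Main()
    {
        ArrayList beers = new ArrayList();
        beers.Add(new Beer("Bitter"));
        beers.Add("not a beer");   // compiles: an ArrayList stores plain objects

        try
        {
            Beer second = (Beer)beers[1];   // the mistake only surfaces here, at run time
            Console.WriteLine(second.Name);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("beers[1] was not a Beer");
        }

        // The array workaround: properly typed, but fixed in length.
        ArrayList onlyBeers = new ArrayList();
        onlyBeers.Add(new Beer("Bitter"));
        Beer[] asArray = (Beer[])onlyBeers.ToArray(typeof(Beer));
        Console.WriteLine(asArray.Length);   // an array has no Add method
    }
}
```

Nothing stops the string going into the ArrayList; the error only appears when someone tries to taste it.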

Hopefully no-one actually uses ArrayLists any more. If you do, you really want to read this post carefully! Or, you’re responsible for working with legacy code, in which case I wish you the best of luck :-)

Enter the generic List

Fortunately, .NET has supported generics since version 2.0, so you can replace the ArrayList in the above example with a List<Beer>. This is much the same thing – it’s still a variable-length list that we can add things to or remove things from – but there’s a compile-time constraint that the only things you can put in the list are Beer objects. So you end up with the following similar but safer code:

public List<Beer> GetBeerList()
{
  var availableBeers = new List<Beer>();

  foreach (Pub pub in KentishTown)
  {
    foreach (Beer beer in pub.Beers)
    {
      if (beer.IsAvailable)
      {
        availableBeers.Add(beer);
      }
    }
  }

  return availableBeers;
}

public void TasteABeer()
{
  List<Beer> beers = GetBeerList();
  beers[0].Taste();
}

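So in the more general case, a List<T> is a list that can only contain objects of type T. You can imagine that every time you use a List<SomethingElse> in your code, the compiler copies and pastes the List class and edits it to take and return SomethingElse objects instead of Ts. (That’s only a mental model: in .NET it’s the runtime rather than the compiler that produces the specialised types, but the effect for you as a programmer is the same.)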
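Generics aren’t limited to the built-in collections, either. As a rough sketch of the same idea, here is a small generic class of my own. Crate<T> is a made-up name and not part of .NET; it’s written once and then used with two different type arguments.

```csharp
using System;
using System.Collections.Generic;

// Crate<T> is a hypothetical type for illustration, not part of .NET.
public class Crate<T>
{
    private readonly List<T> items = new List<T>();

    public int Count { get { return items.Count; } }

    public void Put(T item) { items.Add(item); }

    // Removes and returns the most recently added item.
    public T TakeLast()
    {
        if (items.Count == 0)
            throw new InvalidOperationException("The crate is empty.");
        T item = items[items.Count - 1];
        items.RemoveAt(items.Count - 1);
        return item;
    }
}

public static class CrateDemo
{
    public static void Main()
    {
        var names = new Crate<string>();
        names.Put("Bitter");
        names.Put("Stout");
        string last = names.TakeLast();   // no cast needed: TakeLast returns string here
        Console.WriteLine(last);

        var counts = new Crate<int>();
        counts.Put(3);
        // counts.Put("three");           // would not compile: a string is not an int
        Console.WriteLine(counts.Count);
    }
}
```

The commented-out line is the important one: the mistake that an ArrayList would happily accept is rejected before the program ever runs.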

Doing the same for Hashtables

There is of course also a generic equivalent of Hashtable (in fact, it nicely covers ListDictionary and HybridDictionary too – no more worrying about which to use, because Microsoft have chosen for you): Dictionary<TKey, TValue>. So a lookup table mapping each pub to the beers it has available would be a Dictionary<Pub, List<Beer>>:

public Dictionary<Pub, List<Beer>> GetBeersByPub()
{
  var mapping = new Dictionary<Pub, List<Beer>>();

  foreach (Pub pub in KentishTown)
  {
    mapping.Add(pub, new List<Beer>(pub.Beers));
  }

  return mapping;
}

(Note: You might notice that there are actually three non-generic hashtable-like classes – Hashtable, ListDictionary and HybridDictionary. The first two have different performance characteristics, while the third automatically switches between using the first two depending on how much data is being stored. Evidently, when Microsoft built the generic Dictionary class, they didn’t feel the need to complicate matters and made the choice of data structure implementation for us – there’s only one Dictionary.)
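Reading from the dictionary is just as strongly typed as writing to it. The sketch below assumes minimal Pub and Beer classes of my own invention (including the pub names), and uses TryGetValue for the lookup. Because this Pub doesn’t override Equals or GetHashCode, keys are compared by reference.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal versions of the Pub and Beer classes used in this post.
public class Beer
{
    public string Name;
    public Beer(string name) { Name = name; }
}

public class Pub
{
    public string Name;
    public List<Beer> Beers = new List<Beer>();
    public Pub(string name) { Name = name; }
}

public static class DictionaryDemo
{
    public static void Main()
    {
        var lamb = new Pub("The Lamb");            // invented pub names
        lamb.Beers.Add(new Beer("Bitter"));
        var closed = new Pub("The Closed Arms");

        var mapping = new Dictionary<Pub, List<Beer>>();
        mapping.Add(lamb, new List<Beer>(lamb.Beers));

        // Values come back as List<Beer>, so no cast is needed.
        List<Beer> beers;
        if (mapping.TryGetValue(lamb, out beers))
            Console.WriteLine(lamb.Name + ": " + beers.Count + " beer(s)");

        // TryGetValue returns false for a missing key,
        // whereas mapping[closed] would throw a KeyNotFoundException.
        if (!mapping.TryGetValue(closed, out beers))
            Console.WriteLine(closed.Name + " is not in the table");
    }
}
```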

Collection interfaces

Before we go on to look at how to use LINQ to help taste all this beer, one final note. Although List<T> is good, it’s a bit specific. If you’re defining a public API for your beer code, you don’t want to constrain yourself by committing to use a standard list for all time – instead, it’s better to use the interface IList<T>.

In fact in this case you should probably go further than that. Best practice is to consider what you really mean by your choice of interface – and in this case you don’t particularly need “a list of beers” (i.e. something that the caller can add or remove beers from), you just want to return “some beers”. The most basic interface for representing “some stuff” is IEnumerable<T>. I would always recommend using this unless you actually need some more specific functionality – it makes the intent of your code clearer.
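Here is roughly what that looks like in practice. It’s a sketch under my own assumptions: the Beer class, the BeerMenu class and the GetAvailableBeers method are all made up for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Beer class used in this post.
public class Beer
{
    public string Name;
    public bool IsAvailable;
    public Beer(string name, bool isAvailable)
    {
        Name = name;
        IsAvailable = isAvailable;
    }
}

public static class BeerMenu
{
    // The caller is promised only "some beers"; the List<Beer> is an implementation detail.
    public static IEnumerable<Beer> GetAvailableBeers(IEnumerable<Beer> allBeers)
    {
        var available = new List<Beer>();
        foreach (Beer beer in allBeers)
        {
            if (beer.IsAvailable)
                available.Add(beer);
        }
        return available;   // List<Beer> implements IEnumerable<Beer>
    }

    public static void Main()
    {
        // An array is an IEnumerable<Beer> too, so it can be passed straight in.
        var cellar = new Beer[] { new Beer("Bitter", true), new Beer("Stout", false) };

        foreach (Beer beer in GetAvailableBeers(cellar))
            Console.WriteLine(beer.Name);
    }
}
```

If I later decide to build the result some other way, nothing in the calling code needs to change, because all it ever relied on was being able to loop over the beers.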

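Similarly, you probably want to use IDictionary<TKey, TValue> rather than Dictionary<TKey, TValue>. So our GetBeersByPub method should ideally return an IDictionary<Pub, IEnumerable<Beer>> unless you actually have a particular reason for wanting the caller to know that you’re using a dictionary and a list. (Note that this means building a Dictionary<Pub, IEnumerable<Beer>> inside the method: a Dictionary<Pub, List<Beer>> can’t be returned as an IDictionary<Pub, IEnumerable<Beer>>.)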
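One wrinkle when you try this: the dictionary itself has to be declared with IEnumerable<Beer> values, because a Dictionary<Pub, List<Beer>> won’t convert to an IDictionary<Pub, IEnumerable<Beer>>. A possible version of GetBeersByPub might look like the sketch below. The Pub and Beer classes are again my own minimal stand-ins, and I’ve passed the pubs in as a parameter rather than relying on KentishTown so the example stands alone.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal versions of the Pub and Beer classes used in this post.
public class Beer
{
    public string Name;
    public Beer(string name) { Name = name; }
}

public class Pub
{
    public string Name;
    public List<Beer> Beers = new List<Beer>();
    public Pub(string name) { Name = name; }
}

public static class PubGuide
{
    public static IDictionary<Pub, IEnumerable<Beer>> GetBeersByPub(IEnumerable<Pub> pubs)
    {
        // The value type is IEnumerable<Beer> from the start.
        var mapping = new Dictionary<Pub, IEnumerable<Beer>>();

        foreach (Pub pub in pubs)
        {
            // Each individual List<Beer> converts to IEnumerable<Beer> without fuss.
            mapping.Add(pub, new List<Beer>(pub.Beers));
        }

        return mapping;
    }

    public static void Main()
    {
        var lamb = new Pub("The Lamb");   // invented pub name
        lamb.Beers.Add(new Beer("Bitter"));

        IDictionary<Pub, IEnumerable<Beer>> guide = GetBeersByPub(new Pub[] { lamb });
        foreach (Beer beer in guide[lamb])
            Console.WriteLine(lamb.Name + " serves " + beer.Name);
    }
}
```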

LINQ-to-Objects is basically LINQ-to-IEnumerable<T>, which is why all of the above is relevant, and important background to the rest of this series on LINQ…
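As a small taster (and no more than that), here is a sketch showing the same LINQ method applied to a List<string> and to a plain array. It works on both only because both implement IEnumerable<string>. The beer names and the length checks are arbitrary choices of mine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqTeaser
{
    public static void Main()
    {
        var asList = new List<string> { "Bitter", "Stout", "Mild" };
        string[] asArray = { "Porter", "Pale Ale" };

        // Where is an extension method on IEnumerable<T>,
        // so the same call works on a list and on an array.
        foreach (string name in asList.Where(n => n.Length > 4))
            Console.WriteLine(name);

        foreach (string name in asArray.Where(n => n.Length > 6))
            Console.WriteLine(name);
    }
}
```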

