Simplify with LINQ


30 July 2012, by

Using LINQ to simplify your code


LINQ is all about manipulating data. But then, so is a remarkably large amount of programming. We’ve already looked at some things you can do with your list of beers using LINQ, but if you go back to the example I used to introduce generics, that’s all about manipulating data too:

public List<Beer> GetBeerList()
{
  var availableBeers = new List<Beer>();

  foreach (Pub pub in KentishTown)
  {
    foreach (Beer beer in pub.Beers)
    {
      if (beer.IsAvailable)
      {
        availableBeers.Add(beer);
      }
    }
  }

  return availableBeers;
}

This boils down to being a data query on the KentishTown object. LINQ allows you to make massive simplifications to this code:

public IEnumerable<Beer> GetBeerList()
{
  // Flatten every pub's beers into one sequence, then keep the available ones.
  return KentishTown
    .SelectMany(pub => pub.Beers)
    .Where(beer => beer.IsAvailable);
}

This is considerably shorter. It’s also, importantly, no less clear than the original. The only thing that may not be immediately obvious to the beginner is SelectMany: it is a bit like Select, but each input object is mapped to a list of output objects, and all those lists are then concatenated together. This is best explained by comparing the two examples, and trusting me that they produce the same beers. The one difference is the return type: the LINQ version returns a lazily evaluated IEnumerable<Beer> rather than a List<Beer>, so add a ToList() call on the end if you need an actual list.
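If you’d rather not just trust me, here is a small sketch that puts Select and SelectMany side by side. The Beer and Pub classes, the Name property and the sample data are all made up for illustration; only Beers and IsAvailable come from the examples above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal types; only Beers and IsAvailable appear in the article.
public class Beer
{
    public string Name { get; set; }        // assumed property, for illustration
    public bool IsAvailable { get; set; }
}

public class Pub
{
    public List<Beer> Beers { get; set; }
}

public static class SelectManyDemo
{
    // Made-up sample data standing in for the KentishTown object.
    public static List<Pub> SamplePubs()
    {
        return new List<Pub>
        {
            new Pub { Beers = new List<Beer>
            {
                new Beer { Name = "Bitter", IsAvailable = true },
                new Beer { Name = "Stout", IsAvailable = false }
            } },
            new Pub { Beers = new List<Beer>
            {
                new Beer { Name = "Porter", IsAvailable = true }
            } }
        };
    }

    public static void Main()
    {
        List<Pub> pubs = SamplePubs();

        // Select: one output per input, so we get a sequence of two lists.
        IEnumerable<List<Beer>> perPub = pubs.Select(pub => pub.Beers);

        // SelectMany: each pub's list is flattened into a single sequence of beers.
        IEnumerable<Beer> allBeers = pubs.SelectMany(pub => pub.Beers);

        Console.WriteLine(perPub.Count());                              // 2
        Console.WriteLine(allBeers.Count());                            // 3
        Console.WriteLine(allBeers.Count(beer => beer.IsAvailable));    // 2
    }
}
```

Select gives you a list of lists, which is rarely what you want here; SelectMany gives you the flat sequence that Where can then filter.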

LINQ can really do a great job of simplifying your code. Huge long for-loops can suddenly collapse into a single line of code. All that logic that builds a Hashtable and then loops through all its values is replaced by a GroupBy and a Select. It’s like magic.
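To make the GroupBy claim a little more concrete, here is a rough sketch of the kind of collapse I mean: counting beers per style, first with a dictionary and two loops, then with GroupBy and Select. The Style property and the sample beers are assumptions invented for this example, and I’ve used a Dictionary rather than a Hashtable to keep the types simple.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; Name and Style are assumed properties for illustration.
public class Beer
{
    public string Name { get; set; }
    public string Style { get; set; }
}

public static class GroupByDemo
{
    // Imperative version: build a dictionary of counts, then loop over it.
    public static List<string> CountByStyleWithLoops(IEnumerable<Beer> beers)
    {
        var counts = new Dictionary<string, int>();
        foreach (Beer beer in beers)
        {
            int soFar;
            counts.TryGetValue(beer.Style, out soFar);   // soFar is 0 if the key is missing
            counts[beer.Style] = soFar + 1;
        }

        var lines = new List<string>();
        foreach (KeyValuePair<string, int> entry in counts)
        {
            lines.Add(entry.Key + ": " + entry.Value);
        }
        return lines;
    }

    // LINQ version: group by style, then turn each group into one line.
    public static List<string> CountByStyleWithLinq(IEnumerable<Beer> beers)
    {
        return beers
            .GroupBy(beer => beer.Style)
            .Select(styleGroup => styleGroup.Key + ": " + styleGroup.Count())
            .ToList();
    }

    // Made-up sample data.
    public static List<Beer> SampleBeers()
    {
        return new List<Beer>
        {
            new Beer { Name = "Best Bitter", Style = "Ale" },
            new Beer { Name = "Dry Stout", Style = "Stout" },
            new Beer { Name = "Mild", Style = "Ale" }
        };
    }

    public static void Main()
    {
        foreach (string line in CountByStyleWithLinq(SampleBeers()))
        {
            Console.WriteLine(line);   // prints "Ale: 2" then "Stout: 1"
        }
    }
}
```

Both methods produce the same lines, though only the GroupBy version promises an order (groups appear in the order their keys are first seen); a Dictionary makes no such guarantee.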

An aside: More Magic

Users of the excellent ReSharper may note that ReSharper will put squiggly lines under the foreach loops in my original example, and offer to turn them into a LINQ expression for you. It doesn’t do a very pretty job of this, but it could be useful on occasion. Here’s ReSharper’s attempt:

public List<Beer> GetBeerListByResharper()
{
  return (from pub in this.KentishTown from beer in pub.Beers where beer.IsAvailable select beer).ToList();
}
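ReSharper’s version uses LINQ’s query syntax rather than the chained method calls I used earlier. As far as I know the compiler turns the one into the other, so the two forms should give the same result. Here is a sketch that checks this on some made-up data; the Beer and Pub classes and the sample pubs are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal types; only Beers and IsAvailable appear in the article.
public class Beer
{
    public string Name { get; set; }        // assumed property, for illustration
    public bool IsAvailable { get; set; }
}

public class Pub
{
    public List<Beer> Beers { get; set; }
}

public static class QuerySyntaxDemo
{
    // Query syntax, laid out over several lines for readability.
    public static List<Beer> WithQuerySyntax(IEnumerable<Pub> pubs)
    {
        return (from pub in pubs
                from beer in pub.Beers
                where beer.IsAvailable
                select beer).ToList();
    }

    // Method syntax, as in the earlier GetBeerList example plus a ToList.
    public static List<Beer> WithMethodSyntax(IEnumerable<Pub> pubs)
    {
        return pubs
            .SelectMany(pub => pub.Beers)
            .Where(beer => beer.IsAvailable)
            .ToList();
    }

    // Made-up sample data standing in for the KentishTown object.
    public static List<Pub> SamplePubs()
    {
        return new List<Pub>
        {
            new Pub { Beers = new List<Beer>
            {
                new Beer { Name = "Bitter", IsAvailable = true },
                new Beer { Name = "Stout", IsAvailable = false }
            } },
            new Pub { Beers = new List<Beer>
            {
                new Beer { Name = "Porter", IsAvailable = true }
            } }
        };
    }

    public static void Main()
    {
        List<Pub> pubs = SamplePubs();
        List<Beer> fromQuery = WithQuerySyntax(pubs);
        List<Beer> fromMethods = WithMethodSyntax(pubs);

        // Same Beer objects, in the same order.
        Console.WriteLine(fromQuery.SequenceEqual(fromMethods));   // True
    }
}
```

Which form you prefer is largely a matter of taste; I find the method chain easier to read once there is more than one step.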

LINQ as a Functional Programming Language

Part of the reason LINQ can make all these great simplifications to your code is that it allows and encourages you to carry out your programming tasks in a functional way. Rather than worrying about how your program will do what it’s supposed to do (loop over an array, and loop over a nested array, and then make another array of all the things you found), you focus on what your program does (take all the beers from all the pubs in Kentish Town, and pick out the available ones).

So the key to using LINQ to make your programs shorter and clearer is to think functional. That means:

  • Think about your code in terms of its inputs and outputs. Name your methods accordingly.
  • Avoid side effects in your methods: if you feed in X and get back Y, you should always get back Y, and nothing should happen other than returning Y.
  • Think about your code in terms of transformations on data, not in terms of the step-by-step operations you need in order to achieve the result.

This is possibly best illustrated by another attempt to refactor the GetBeerList method, as follows:

public IEnumerable<Beer> GetBeerListImperatively()
{
  var availableBeers = new List<Beer>();
  // Filter each pub's beers so the result matches GetBeerList.
  KentishTown.ToList().ForEach(pub =>
    availableBeers.AddRange(pub.Beers.Where(beer => beer.IsAvailable)));
  return availableBeers;
}

This is a bit ugly. A big clue to why is that ToList: the KentishTown object is an IEnumerable<Pub>, but we have to call ToList to get hold of the ForEach method, which only exists on List<T>. ForEach isn’t part of LINQ. In fact, apart from the small Where that filters each pub’s beers, there’s pretty much no LINQ driving this code sample at all: the result is still built up by adding to availableBeers one pub at a time. So although it looks like it uses very much the same approach as the LINQ version earlier, it’s actually rather different. At heart, this is imperative rather than functional code. You’re telling the compiler what you want to do step-by-step, rather than trying to express yourself as a transformation on a set of data.

So, try to think functionally whenever possible and you’ll probably end up with something that looks much nicer as a result.
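The point about side effects in the list above is easier to see in code. This is only a sketch: the method names, the Name property and the sample beers are all invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; Name is an assumed property for illustration.
public class Beer
{
    public string Name { get; set; }
    public bool IsAvailable { get; set; }
}

public static class SideEffectDemo
{
    // Side-effecting: returns the right answer, but also empties
    // the unavailable beers out of the caller's list.
    public static List<string> AvailableNamesDestructively(List<Beer> beers)
    {
        beers.RemoveAll(beer => !beer.IsAvailable);
        return beers.Select(beer => beer.Name).ToList();
    }

    // Side-effect free: the input is only read, never changed.
    public static List<string> AvailableNames(IEnumerable<Beer> beers)
    {
        return beers
            .Where(beer => beer.IsAvailable)
            .Select(beer => beer.Name)
            .ToList();
    }

    // Made-up sample data.
    public static List<Beer> SampleBeers()
    {
        return new List<Beer>
        {
            new Beer { Name = "Bitter", IsAvailable = true },
            new Beer { Name = "Stout", IsAvailable = false }
        };
    }

    public static void Main()
    {
        List<Beer> beers = SampleBeers();

        AvailableNames(beers);
        Console.WriteLine(beers.Count);   // 2: the input is untouched

        AvailableNamesDestructively(beers);
        Console.WriteLine(beers.Count);   // 1: the Stout has gone from the caller's list
    }
}
```

Both methods return the same names, but only the second one can be called twice, or by two different callers, without anyone getting a surprise.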

If the functional approach sounds intriguing, I found an interesting, and not unduly long, discussion of some of the things that functional programming brings to programming in general; it is worth a read.

Is this all a good idea anyway?

Someone asked me the other day, when I was showing off some nifty LINQ tricks to simplify their code, whether what I was doing was actually “a good thing”. This is a good question. It’s important that you don’t go overboard with applying the latest neat tricks you’ve learnt if they don’t actually benefit the code.

I believe there are two key attributes of good code:

  • It should do what it’s supposed to do
  • It should be maintainable

LINQ doesn’t really change what you can actually do with your program. It doesn’t give you any new powers. Your travelling salesman will still be just as efficient as he ever was. But LINQ can still help with both these bullets. The key point is to make sure that your code is clear. It should be absolutely obvious what you’re trying to do.

Look back at the two code snippets at the top of this post. The second example is clearer because it’s shorter. Clarity is a good thing. It obviously makes your code more maintainable, because you and others can read and understand it in the future; but it also makes it more likely that the code does the right thing now, because you and your peer reviewer can understand it.

Remember that not everything that’s shorter is clearer. The GetBeerListImperatively example above is short, but rather harder to fathom. Perl regular expressions are lovely, but if you’re not careful (or perhaps, even if you are careful) you end up with something truly horrible.

But used with care, simplifying and thus clarifying your code is a Good Thing.

Enjoy your LINQ responsibly.

