Lazy LINQing


6 August 2012, by


This post looks at one reasonably important aspect of how LINQ works under the hood – lazy evaluation.

LINQ is lazy

By way of example, let us return to beer. Here’s a simple program to provide details of the beer we currently have available:

foreach (Beer beer in GetBeerList())
{
  Console.Out.WriteLine(
    "Beer name: {0} {1}",
    beer.Name,
    beer.IsNice ? "(nice)" : "(not nice)");
}

 

Beer name: Old Tom's (not nice)
Beer name: Young Dan's (nice)
Beer name: Now Beer (not nice)

Now consider an earlier example which counts how many beers are nice:

public int CountNiceBeers()
{
  return GetBeerList().Where(BeerIsNice).Count();
}

private static bool BeerIsNice(Beer beer)
{
  Console.Out.WriteLine(beer.Name);
  return beer.IsNice;
}

Note that I’ve modified this example to do something truly horrible that you should never, ever do (except for educational purposes, of course). The BeerIsNice method now has an undocumented and unexpected side effect: it writes the name of the beer being tested to the console. That lets us keep an eye on what it’s up to…

When you call CountNiceBeers, you’d expect it to find exactly one nice beer. And indeed it does:

Console.Out.WriteLine("Number of nice beers: " + CountNiceBeers());

 

Old Tom's
Young Dan's
Now Beer
Number of nice beers: 1

Note that the name of each beer was written out as it was checked for niceness, before the final count was printed.

Now let’s try something slightly different. Watch closely…

public Beer GetSomeNiceBeer()
{
  return GetBeerList().Where(BeerIsNice).First();
}

Console.Out.WriteLine("The first nice beer: " + GetSomeNiceBeer().Name);

 

Old Tom's
Young Dan's
The first nice beer: Young Dan's

The code correctly picks out the first of the nice beers, as you’d expect. But note the list of beers that ran through the BeerIsNice method – the third beer was never tested for niceness. On reflection, it’s clear that testing this beer is unnecessary – we’ve already found a nice beer after the second test.

LINQ is lazy: it will only evaluate your query as far as it needs to in order to return a result. Where tests a beer only when the next result is asked for, and First stops asking as soon as it has one.
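The same laziness can be observed without writing to the console. The following is a minimal sketch, assuming a cut-down Beer class with just Name and IsNice (a stand-in for the one used in this series); the Tested list and the LazyDemo class are made up for illustration. It shows that building the query tests nothing, that First stops after two beers, and that Count has to test all three.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Beer class used in this series.
public class Beer
{
    public string Name { get; set; }
    public bool IsNice { get; set; }
}

public static class LazyDemo
{
    // Records every beer the predicate is asked about (instead of writing to the console).
    public static readonly List<string> Tested = new List<string>();

    private static bool BeerIsNice(Beer beer)
    {
        Tested.Add(beer.Name);
        return beer.IsNice;
    }

    public static void Main()
    {
        var beers = new List<Beer>
        {
            new Beer { Name = "Old Tom's", IsNice = false },
            new Beer { Name = "Young Dan's", IsNice = true },
            new Beer { Name = "Now Beer", IsNice = false },
        };

        // Building the query runs nothing: Where only describes the filter.
        IEnumerable<Beer> niceBeers = beers.Where(BeerIsNice);
        Console.Out.WriteLine("Tested after Where: " + Tested.Count);

        // First pulls beers through the filter one at a time and stops at the first match.
        Beer first = niceBeers.First();
        Console.Out.WriteLine("Tested after First: " + Tested.Count);

        // Count must look at every beer, and it runs the query again from the start.
        Tested.Clear();
        int count = niceBeers.Count();
        Console.Out.WriteLine("Tested after Count: " + Tested.Count);
    }
}
```

Running this prints 0, 2 and 3 for the three counts. Note that the query is re-evaluated each time it is enumerated, which is one more reason not to put side effects in a predicate.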

Do it yourself

This isn’t really a lesson in LINQ as such, but a question that might arise from the above is: querying data lazily is all very well, but can I generate data lazily? The answer is yes (otherwise this section would be rather short). Of course, if you use LINQ to SQL to fetch data from a database, you would hope that Microsoft sorts this out for you, but you can also build your own lazily generated sequences. The secret is the yield keyword.

The basic principle is demonstrated by the following code:

public IEnumerable<Beer> GetBeerFromDisk()
{
  while (MoreStuffOnDisk())
  {
    yield return GetNextBeerFromDisk();
  }
}

This code (with a bit of imagination) reads a list of beers from disk, but it does so lazily: each beer is read only when the caller asks for the next item. So if you call GetBeerFromDisk().First(), you’ll only ever hit GetNextBeerFromDisk() once.
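To see this without a real disk, here is a runnable sketch under some loud assumptions: the "disk" is an in-memory queue of beer names, the beers are plain strings rather than Beer objects, and the Reads counter and LazyDisk class are invented purely to count how often GetNextBeerFromDisk is called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDisk
{
    // Hypothetical stand-in for the data on disk: a queue of beer names.
    private static readonly Queue<string> Disk =
        new Queue<string>(new[] { "Old Tom's", "Young Dan's", "Now Beer" });

    // Counts how many times we "hit the disk".
    public static int Reads;

    private static bool MoreStuffOnDisk()
    {
        return Disk.Count > 0;
    }

    private static string GetNextBeerFromDisk()
    {
        Reads++;
        return Disk.Dequeue();
    }

    public static IEnumerable<string> GetBeerFromDisk()
    {
        while (MoreStuffOnDisk())
        {
            // Execution pauses here until the caller asks for the next item.
            yield return GetNextBeerFromDisk();
        }
    }

    public static void Main()
    {
        // Calling the method does not run its body yet.
        IEnumerable<string> beers = GetBeerFromDisk();
        Console.Out.WriteLine("Reads after calling the method: " + Reads);

        // First asks for exactly one item, so exactly one read happens.
        string first = beers.First();
        Console.Out.WriteLine("First beer: " + first);
        Console.Out.WriteLine("Reads after First: " + Reads);
    }
}
```

The read count is 0 after the method call and 1 after First: the body of an iterator method only runs when something enumerates the result.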

There are some neat things you can do with this. With thanks to this blog post by Chris Marinos, you can learn how to write code like this:

foreach(var i in 1.Through(5))
{
  Console.Out.WriteLine(i);
}

The Through method returns a sequence of all the numbers in a range. Thanks to its laziness, though, it doesn’t create a massive list of numbers up front; it produces the next number in the sequence only on demand. Take a closer look at the linked post to learn how it works. (Note that this is also a neat example of extension methods, another great tool in the making-your-code-simpler arsenal.)
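For a rough idea of what such a method might look like, here is a sketch written for this post. It is not the code from the linked post, and it assumes that Through counts upwards and includes both ends of the range; the class names are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntExtensions
{
    // Assumed semantics: counts up from 'start' to 'end', both ends included.
    // A sketch only: it does not guard against end == int.MaxValue.
    public static IEnumerable<int> Through(this int start, int end)
    {
        for (int i = start; i <= end; i++)
        {
            yield return i;
        }
    }
}

public static class ThroughDemo
{
    public static void Main()
    {
        foreach (var i in 1.Through(5))
        {
            Console.Out.WriteLine(i);
        }

        // Only three numbers are ever generated, even though the range is huge.
        int[] firstThree = 1.Through(1000000000).Take(3).ToArray();
        Console.Out.WriteLine(string.Join(", ", firstThree));
    }
}
```

The loop prints the numbers 1 to 5, and the last line prints "1, 2, 3". Because Take stops asking after three items, the billion-number range costs no more than a range of three.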


Categories: Technical
