LINQ is definitely one of my favorite language features in C#. It makes it simple to express set-based business logic concisely and, used well, can be a very powerful tool in your arsenal. As with any powerful tool, however, it is very easy to use badly.
Today I want to talk about the "select one" family of methods (listed below) and how to choose the right one.
- Single()
- First()
- Last()
- The OrDefault versions of the above
Let's start with an example of bad usage that I see very often:
var age = people.FirstOrDefault(o => o.Name == name).Age;
In the above code, we find the age of the person with the given name. There are two separate problems with this line of code, both related to the use of FirstOrDefault().
First, let's look at the use of the OrDefault version of the method. The OrDefault version returns the default value for the type if no elements are found (null, if we assume that the person type is a class), whilst the plain versions (e.g. First()) throw an InvalidOperationException.
In the code above we use FirstOrDefault() and immediately try to access a property on the result. If no person matches the name, this throws a NullReferenceException. This is the same problem discussed in my previous post on failing early: the source of the error is masked, making it harder to track down.
This type of possible problem applies to all the OrDefault versions of the "select one" methods.
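To make the difference concrete, here is a minimal sketch of both lookups. The Person record, the EmptyMatchSketch class and its method names are my own illustrative assumptions and not part of any real codebase; only the LINQ calls themselves come from the line discussed above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type. A record is a reference type, so its default value is null.
public record Person(string Name, int Age);

public static class EmptyMatchSketch
{
    // Mirrors the original line: dereferences whatever FirstOrDefault() returns.
    // (With nullable reference types enabled, the compiler warns about this line.)
    public static int AgeViaFirstOrDefault(IEnumerable<Person> people, string name) =>
        people.FirstOrDefault(o => o.Name == name).Age;

    // Same lookup with First(): the failure happens inside the LINQ call itself.
    public static int AgeViaFirst(IEnumerable<Person> people, string name) =>
        people.First(o => o.Name == name).Age;

    // Runs a lookup and reports the name of the exception type it throws, or "none".
    public static string ExceptionNameOf(Func<int> lookup)
    {
        try
        {
            lookup();
            return "none";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }
}
```

Given a list with nobody called Bob in it, ExceptionNameOf(() => EmptyMatchSketch.AgeViaFirstOrDefault(people, "Bob")) reports NullReferenceException, which says nothing about a failed lookup. The First() version reports InvalidOperationException, raised by the query itself.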
Let's fix this by changing the method to First() as follows:
var age = people.First(o => o.Name == name).Age;
Now let's imagine that the requested name is Bob and in the system there are two people called Bob. Is it valid to pick the first one? The answer is probably no.
The question you need to ask when writing this code is "is it valid for more than one person to have the same name?"
If the answer to this question is yes then you need to work out what to do in that situation. Possibly you need to select a set of ages, or use more than just the name to filter the set in the first place.
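Both options might look something like the sketch below. The Department property is purely an assumption for illustration (the original Person only needs Name and Age), as are the class and method names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type; Department is an invented extra property for this example.
public record Person(string Name, string Department, int Age);

public static class DuplicateNameSketch
{
    // Option 1: return every matching age instead of picking one arbitrarily.
    public static List<int> AgesOf(IEnumerable<Person> people, string name) =>
        people.Where(o => o.Name == name).Select(o => o.Age).ToList();

    // Option 2: narrow the filter until a match is expected to be unique.
    // Single() throws if the narrowed filter matches no one or more than one person.
    public static int AgeOf(IEnumerable<Person> people, string name, string department) =>
        people.Single(o => o.Name == name && o.Department == department).Age;
}
```

Which option is right depends entirely on what the calling code needs to do with the result.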
If the answer is no (perhaps Name here is a unique username) then you should use Single() rather than First(). Single() will throw an exception if more than one match is found. This gives you a built-in data integrity check and means that your assumption is both validated when the code runs and made clear to anyone working on the code in the future.
Failing early like this also makes it easier to track down any issues that arise. If a user reports an exception thrown by the Single() call, you can immediately see that there are multiple people with the same name. If you used First(), you could instead get an error report saying, for example, that the system claims Bob has only 5 years to retirement when the user knows he is only 23. Now you have to track back from the part of the system calculating retirement age to this code to find the problem. You may even find that the bug is occurring but not being spotted, which is even worse.
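Here is a small sketch of that difference with two people called Bob. The Person record, the sample ages and the class and method names are assumptions made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type used only for this example.
public record Person(string Name, int Age);

public static class UniqueNameSketch
{
    // Silently returns the age of whichever matching person happens to come first.
    public static int AgeViaFirst(IEnumerable<Person> people, string name) =>
        people.First(o => o.Name == name).Age;

    // Throws InvalidOperationException if no one or more than one person matches.
    public static int AgeViaSingle(IEnumerable<Person> people, string name) =>
        people.Single(o => o.Name == name).Age;
}
```

With a list containing a 60-year-old Bob followed by a 23-year-old Bob, AgeViaFirst quietly returns 60 for both of them, while AgeViaSingle throws at the exact line where the uniqueness assumption was made.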
This type of problem also applies to the Last() method since, like First(), it picks one element from a set of one or more.
Summary
When selecting one from a set you should ask the following two questions:
Is it valid for the set to ever be empty and, if so, does my calling code handle this case?
- If the answer is no, then don't use OrDefault.
Is it valid for the set to ever contain more than one item?
- If the answer is no, then use Single().
- If the answer is yes, then you need to ask whether picking the first or last element is correct in the context in which it is being used. If not, you should not be picking a single element in the first place.
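For completeness, here is one way the remaining combination could look: the set may legitimately be empty, the calling code handles that, and more than one match is still invalid. As before, the Person record and the OptionalLookupSketch name are assumptions for illustration only.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type used only for this example.
public record Person(string Name, int Age);

public static class OptionalLookupSketch
{
    // Returns null when nobody has the given name, so the caller must handle that case.
    // SingleOrDefault() still throws InvalidOperationException if several people match.
    public static int? FindAge(IEnumerable<Person> people, string name)
    {
        var person = people.SingleOrDefault(o => o.Name == name);
        return person?.Age;
    }
}
```

The nullable return type puts the "no match" case in the method signature, so the caller has to deal with it explicitly rather than hitting a null reference later.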