Which Collection Type Should You Return or Pass?

by Larry Spencer Sunday, July 8, 2012 3:04 PM

One of the joys of software development is being as semantically precise as possible. Sometimes, several solutions will work, but only one will give that frisson of pleasure that rewards a job done perfectly.

Such is the case when choosing which type of collection to return from your method, or which one to pass in.

The common-sense principle is that you should put all the necessary constraints on both yourself and your callers, but no more. In other words, use the type whose semantics are the tightest superset of what you need.

Let’s see where that principle leads us.

 

Arrays

Arrays are very close to the metal. The Framework Design Guidelines (section 8.3.3) therefore suggest them for low-level, performance-critical APIs.

Other than that ... well, we learned about arrays in our first week of programming, so we might never have realized how odd they are. An array is an ordered collection whose members are mutable, but whose size is not. As Eric Lippert asks, "Does that solve a problem anyone actually has?" (He has much more to say about why arrays are somewhat harmful.)
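To make the oddness concrete, here is a small sketch. The class and method names are mine, invented for illustration. It changes an element in place, then tries to grow the array through its IList<T> view, which the runtime refuses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of any library.
public static class ArrayOddities
{
    // Returns true if the array refuses to grow through its IList<int> view.
    public static bool AddIsRefused()
    {
        int[] numbers = { 1, 2, 3 };
        numbers[0] = 99;              // members are mutable: this is allowed

        IList<int> view = numbers;    // arrays implement IList<T>
        try
        {
            view.Add(4);              // size is fixed: this throws
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```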

We can almost always do better. Let’s first consider whether to use interfaces or classes. 

 

Interfaces versus Classes

Every collection class in .NET implements at least one interface. Should you return or pass the interface or the class? IList<T> or List<T>? (I take for granted that we prefer the generic versions over their older, non-generic siblings.)

If you use a class, you cannot change to another class without breaking your callers. Why not use an interface and reserve the possibility of changing the underlying implementation later?
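A minimal sketch of that idea, with a made-up WordSource class: callers are promised only an IList<string>, so the concrete class behind it can change without breaking them.

```csharp
using System.Collections.Generic;

// Hypothetical example class, invented for illustration.
public static class WordSource
{
    // Callers see only IList<string>; the concrete type is our private business.
    public static IList<string> GetWords()
    {
        // Today this happens to be a List<string>. It could later become
        // a Collection<string> or a custom class, and no caller would need to change.
        return new List<string> { "alpha", "beta", "gamma" };
    }
}
```

Had GetWords() been declared to return List<string>, any caller relying on List-only members such as Sort() or AddRange() would break the day the implementation changed.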

So let’s consider some interfaces, from the most general to the most specific.

 

IEnumerable<T>

An IEnumerable<T> lets you do just one thing: read the collection from front to back. (You can still call methods on the collection's members that will change their state, of course. If you want the objects in the collection to be immutable, you'll have to code them that way or make the IEnumerable<T> serve up clones.) It’s surprising how often a forward-only read is all you need. Most of your return collections and parameter collections can probably be IEnumerable<T>s. UPDATE: I think IReadOnlyCollection<T> is better.

Interestingly, IEnumerable<T> does not even let you know how many elements are in the collection. That’s a feature, not a bug! One use for IEnumerable<T> is to generate a series of indefinite length (e.g., a series of random words). More commonly, you might use an IEnumerable<T> to step through a set of business objects that might be a-building even while you’re reading it. In fact, the Framework Design Guidelines (section 8.3.2) recommend that properties return "live" IEnumerable<T>s rather than "snapshots."
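Here is one way the "series of random words" could look, as a sketch. The RandomWords class, its vocabulary, and the seed parameter are all assumptions of mine. The iterator never ends on its own, so there is no meaningful count to report; the caller decides how much to read.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical generator, invented for illustration.
public static class RandomWords
{
    private static readonly string[] Vocabulary = { "ash", "birch", "cedar", "elm" };

    // An endless sequence: each MoveNext() produces one more word.
    public static IEnumerable<string> Generate(int seed)
    {
        var random = new Random(seed);
        while (true)
        {
            yield return Vocabulary[random.Next(Vocabulary.Length)];
        }
    }
}
```

A caller would typically bound the read with something like RandomWords.Generate(42).Take(5) from LINQ. Calling Count() or ToList() directly on the unbounded sequence would never return.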

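LINQ extends IEnumerable<T> with a Count() method, but that’s not a native feature of the interface: unless the underlying object happens to implement ICollection<T>, Count() has to enumerate the entire sequence. UPDATE: Unfortunately, this can lead to Liskov violations and/or code smell.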

LINQ also offers ToList() if the dynamic nature of an IEnumerable<T> is not desired.
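The difference between a live view and a ToList() snapshot can be seen in a few lines. This is only a sketch, and the SnapshotDemo name is invented.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, invented for illustration.
public static class SnapshotDemo
{
    public static void Run(out int liveCount, out int snapshotCount)
    {
        var source = new List<int> { 1, 2, 3 };

        IEnumerable<int> live = source;        // a live view of the same list
        List<int> snapshot = live.ToList();    // a copy frozen at this moment

        source.Add(4);                         // the collection keeps building

        liveCount = live.Count();              // sees the new element
        snapshotCount = snapshot.Count;        // does not
    }
}
```

After Run, liveCount is 4 and snapshotCount is 3.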

LINQ aside, if the only thing your method needs to do with a collection parameter is step through it in order, use IEnumerable<T>. Likewise, use an IEnumerable<T> return type if that’s all you want your callers to be able to do.

 

IReadOnlyCollection<T>

IReadOnlyCollection<T> extends IEnumerable<T> with a Count property. Unfortunately, this interface only became available in .NET 4.5. If you’re saddled with an earlier version, you’ll have to use…

 

ICollection<T>

ICollection<T> offers not only a Count property, but methods to add or remove elements.

It’s too bad that in Framework versions before 4.5 you must drag in the add/remove semantics just to get a true Count property. This is such a sorry state of affairs that the Framework Design Guidelines (section 8.3.1) suggest that if Count is your only reason for upgrading a parameter from IEnumerable<T>, you should consider sticking with IEnumerable<T> and dynamically testing to see if it is, in fact, an ICollection<T>. If it is, you can use the ICollection<T>’s Count property; otherwise, program a work-around. 
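The dynamic test described above might be sketched like this. CountHelper and CountOf are names I made up, and the fallback shown (walking the sequence) is just one possible work-around.

```csharp
using System.Collections.Generic;

// Hypothetical helper, invented for illustration.
public static class CountHelper
{
    public static int CountOf<T>(IEnumerable<T> items)
    {
        // If the caller happened to pass an ICollection<T>, use its true Count.
        var collection = items as ICollection<T>;
        if (collection != null)
        {
            return collection.Count;
        }

        // Work-around: walk the sequence and count by hand.
        int count = 0;
        foreach (T item in items)
        {
            count++;
        }
        return count;
    }
}
```

The parameter stays an IEnumerable<T>, so callers are not forced to supply something with add/remove semantics. Note that the fallback consumes the sequence, which matters if it is expensive or cannot be enumerated twice.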

You can sort of avoid the add/remove semantics by handing out an ICollection<T> whose IsReadOnly property is true. IsReadOnly has no setter, so you get such a collection by wrapping an existing one: for example, List<T>.AsReadOnly() (or the static Array.AsReadOnly<T>() for arrays) will wrap the collection in a ReadOnlyCollection<T>. This is ugly, though. The returned ReadOnlyCollection<T> is still an ICollection<T>, and its collection-modifying methods will throw NotSupportedExceptions. Talk about a Liskov violation!
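A short sketch of the problem, using an invented ReadOnlyWrapperDemo class: the wrapper compiles happily as an ICollection<int>, and the objection to Add() only shows up at run time.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical demo class, invented for illustration.
public static class ReadOnlyWrapperDemo
{
    // Returns true if Add() on the wrapper throws NotSupportedException.
    public static bool AddThrows()
    {
        var list = new List<int> { 1, 2, 3 };
        ReadOnlyCollection<int> wrapper = list.AsReadOnly();

        // The wrapper is still an ICollection<int>, so Add() is callable...
        ICollection<int> asCollection = wrapper;
        if (!asCollection.IsReadOnly)
        {
            return false;
        }

        try
        {
            asCollection.Add(4);   // ...but it fails when actually called.
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

A well-behaved caller is expected to check IsReadOnly before calling Add(), which is exactly the kind of burden the interface was supposed to spare them.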

 

IList<T>

An IList<T> is an ICollection<T> that also has indexed access (the Item indexer and the IndexOf, Insert, and RemoveAt methods).
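As an illustration of when IList<T> is the right parameter type, here is a sketch of a method that genuinely needs positions. ListHelper and MoveToFront are made-up names.

```csharp
using System.Collections.Generic;

// Hypothetical helper, invented for illustration.
public static class ListHelper
{
    // Needs the indexer, RemoveAt, and Insert, so IList<T> is the tightest fit.
    public static void MoveToFront<T>(IList<T> items, int index)
    {
        T item = items[index];    // indexed read
        items.RemoveAt(index);    // positional removal
        items.Insert(0, item);    // positional insertion
    }
}
```

Calling ListHelper.MoveToFront(list, 2) on a list holding 1, 2, 3 leaves it holding 3, 1, 2. Nothing weaker than IList<T> would let the method do its job.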

 

Exotic Collections

You will also want to be familiar with exotic classes like KeyedCollection<TKey, TItem>, ObservableCollection<T>, ReadOnlyObservableCollection<T>, and the thread-safe concurrent collections in System.Collections.Concurrent.

 

Custom Classes and Interfaces

If none of the built-in collection classes or interfaces does what you need, you can always create your own. When you do, consider implementing or extending one or more of the built-in interfaces. That will enable your class or interface to be used in as many contexts as possible.
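As a sketch of what that might look like, here is an invented INamedCollection<T> interface that extends IReadOnlyCollection<T> (so it assumes .NET 4.5 or later), plus a simple class that implements it. Both names are hypothetical.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical custom interface: a read-only collection that also has a name.
public interface INamedCollection<T> : IReadOnlyCollection<T>
{
    string Name { get; }
}

// Hypothetical implementation that delegates storage to a private List<T>.
public sealed class NamedCollection<T> : INamedCollection<T>
{
    private readonly List<T> _items;

    public NamedCollection(string name, IEnumerable<T> items)
    {
        Name = name;
        _items = new List<T>(items);   // copy, so later changes by the caller are not seen
    }

    public string Name { get; private set; }

    public int Count
    {
        get { return _items.Count; }
    }

    public IEnumerator<T> GetEnumerator()
    {
        return _items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

Because the interface extends a built-in one, a NamedCollection<T> can be passed to anything that accepts IReadOnlyCollection<T> or IEnumerable<T>, including foreach and LINQ.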

 

Summary

Prefer parameters and return types in this order:

  1. IEnumerable<T>
  2. IReadOnlyCollection<T>  (.NET 4.5 and above)
  3. ICollection<T>, possibly wrapped so that IsReadOnly is true
  4. IList<T>
  5. Exotic collection classes
  6. Custom classes or interfaces
  7. Arrays in rare cases only 

About the Author

Larry Spencer

Larry Spencer develops software with the Microsoft .NET Framework for ScerIS, a document-management company in Sudbury, MA.