Friday, June 19, 2009

Commonality that is worth abstractions

Unfortunately desire to achieve reusability not always leads to expected results. It can bring you development effort savings through reusing existent components or it can lead to negative consequences (such as proliferation of unnecessary classes, decreased discoverability and readability).

Reusability is often based on abstractions. Abstraction captures some commonality between its concrete implementations. But abstractions being very powerful are dangerous as well.

Consider situation where a piece of work is generalized and represented through Command pattern. Basically task (commonality) resulted in command (abstraction).

interface ICommand
{
  // Performs command's task
  void Do(object parameter);
}

Is this commonality worth the abstraction? It depends on the rationale behind.

It may result from desire to factor out functionality that belongs to a particular category (for example, business rules) and through implementing common interface outline membership in a category (and of course it may be expected that through abstraction they will be reusable in some other scenarios). However with this rationale at hand it is most likely that code that creates command also executes it (because it is the only scenario for now – more other scenarios are coming :-)).

// Holds arguments for SomeCommand
class Arguments
{
  public string Arg1 { get; set; }
  public int Arg2 { get; set; }
}

class SomeCommand : ICommand
{
  public void Do(object parameter)
  {
      var args = (Arguments) parameter;
      // Do something with arguments
  }
}
var command = new SomeCommand();
var args = new Arguments {Arg1 = "value", Arg2 = 10};
command.Do(args);

Unfortunately this rationale doesn’t justify introduction of command abstraction. It forces calling code to consume general purpose opaque contract despite the fact that calling code knows exactly what must be done. It decreases readability in comparison with direct call to a method with comprehensive name and may result in proliferation of unnecessary classes which are basically arguments containers. Thus the abstraction introduced drawbacks with no benefits.

It is as important to resist premature abstractions as the abstraction itself. Every abstraction even not used once is not free - it has maintenance costs attached since abstractions are considered to be a contract between the author and consumers that must be fulfiled.

Anytime you introduce a new abstraction a concrete implementation of the abstraction and consuming API must be provided as well. It is hard to see whether abstraction designed correctly unless there is a use case implemented that uses the abstraction.

Consider when calling code cannot execute commands itself but rather commands must executed in order of appearance (through a queue).

// Executes commands in order
interface CommandQueue
{
  // Enqueues command with parameter that is known
  // at the time of enqueue
  void Enqueue(ICommand command, object parameter);

  // Enqueues command with function that will provide
  // parameter on demand
  void Enqueue(ICommand command, Func<object> parameter);
}

In this case abstraction introduction is justified. The command queue conforms to Open/Closed Principle - it is easily extensible with new types of commands because it relies on Command abstraction.

If classes share only a common interface (not behavior), then they should inherit from a common base class or interface only if they will be used polymorphically.

Summary:

  • AVOID premature abstractions
  • DO carefully justify introduction of new abstractions
  • DO provide concrete implementation and consuming API for every abstraction introduced

Thursday, June 11, 2009

Events and Callbacks vs. Explicit Calls

From API design perspective interaction between components can be expressed in request/reply (explicit calls) or in event driven manner (events and callbacks). Consider BackgroundWorker usage example:

// BackgroundWorker executes an operation on a separate thread.
// The operation is basically an event handler of DoWork event
var worker = new BackgroundWorker();
// Once worker will be run this event will be raised in separate thread
worker.DoWork += (sender, e) =>
                {
                   // Event handler does the actual work
                   // The event is raised in a separate thread
                };
worker.RunWorkerCompleted += (sender, e) =>
                            {
                               // Performs post processing step
                               // The event is raised through sycnhronization
                               // context
                            };
// Runs worker
worker.RunWorkerAsync();

Why .NET Framework designers selected event based interaction model instead of request/reply (see hypothetical example below)?

// Represents unit of background work
internal interface IBackgroundWork
{
    // Performs actual work and will be called
    // in a separate thread
    object Do();
    // Performs post processing step and will be
    // called through sycnhronization context
    void Complete(object result);
}

internal class SomeBackgroundWork : IBackgroundWork
{
    public object Do()
    {
        // Do the actual work and return result
        return ...;
    }

    public void Complete(object result)
    {
        // Do post processing
    }
}

internal class BackgroundWorker
{
    // Runs work unit asynchronously
    public void RunWorkerAsync(IBackgroundWork work)
    {
        // elided
    }
}
var worker = new BackgroundWorker();
worker.RunWorkerAsync(new SomeBackgroundWork());

The short answer is that it is about the relationship between participants (worker and provider) and in particular the direction of the dependency. Dependency direction is identified by the domain that holds participants.

Consider IComparable<T> interface. This interface is implemented by types whose instances can be ordered. Implementing this interface on a type states that it supports comparison. On the other hand GenericComparer<T> uses IComparable<T> to compare instances.

[Serializable]
internal class GenericComparer<T> : Comparer<T>
    where T : IComparable<T>
{
    public override int Compare(T x, T y)
    {
        if (x != null)
        {
            if (y != null)
            {
                return x.CompareTo(y);
            }
            return 1;
        }
        if (y != null)
        {
            return -1;
        }
        return 0;
    }

    // othe members elided
}

Explicit calls clearly express the relationship and the consequences of the action but it limits extensibility to compile time and puts more work on API consumers because it requires them to create new types. In the "explicit" background worker example consumer will need to create new type for its unit of work. And it is highly likely that it won't be used in other places.

Consider ThreadPool class. It provides ability to execute arbitrary code on a separate thread. ThreadPool is not interested to know its consumers but it is the consumers who depend on ThreadPool to perform a task. It uses callbacks mechanism to execute arbitrary code within thread pool.

ThreadPool.QueueUserWorkItem(state =>
                            {
                             // Does actual work
                            });

Events and callbacks do not create explicit dependency and thus allows for runtime customization. Events has great tool support. It puts less work on consumers because it doesn’t require to create new types. But these benefits come at a price. It is hard to understand what happens just from looking at the code - in order to understand it all subscriptions to events must be seen. The event triggering may result in an events cascade where handling code raises next event thus creating a chain of events. From performance perspective events and callbacks are more heavyweight than explicit calls.

Summary:

  • CONSIDER using explicit calls to express the relationship (consuming several events or callbacks to perform single task may be a sign that there is role to discover)
  • CONSIDER using explicit calls if there is a potential to reuse called code in other use cases
  • CONSIDER using explicit calls to increase testability
  • AVOID expressing complex interaction using events or callbacks which will be really hard to understand
  • CONSIDER using events and callbacks to allow runtime customization and avoid undesired dependency

Friday, June 5, 2009

When member overload backfires

I do believe that quality of software design isn’t based solely on appropriate high level decisions. It is detailed quality decisions made on a smaller scale that make it possible.

The API influences consuming code. Making it obvious and simple helps to avoid creating hard to read and write error-prone code.

Recently I caught myslef writing the API presented below (I fixed it, honestly :-)):

// Represents basic document template - named placeholders can be mapped to data and
// resulting document is generated
public class DocumentTemplate
{
       // Maps named placeholder to data item
       public void Map<T>(string placeholderName, T data)
       {
              // Placeholder will be replaced with textual representation of the data
       }

       // Maps named placeholder to collection of data items - this is a repeatable placeholder
       public void Map<T>(string placeholderName, IEnumerable<T> data)
       {
              // Placeholder is treated as a repeatable piece
              // Data collection will be enumerated and for each item placeholder body will
              // duplicated and replaced with textual representation of the item
       }

       // ... Other non-relevant members are omitted
}

Consider code snippet below – it uses type inference. Can you tell which overload will be called?

TemplateMapping templateMapping = ...
var list = new List<int> {1, 2};
templateMapping.Map("v1", list); // #1
templateMapping.Map("v2", list.AsEnumerable()); // #2

It does require you to know overload resolution rules to figure out which overload will be called. In #1 case Map<List<int>>(…) will be called and Map<int>(…) in #2.

But worse it can lead to hidden bugs. Suppose we have a method like this

static IEnumerable<int> GetCollection()
{
       return new List<int> { 1, 2 };
}

And it is used with buggy-prone templating API

TemplateMapping templateMapping = ...
templateMapping.Map("v", GetCollection());

In this case Map<int>(…) is called. But suppose that the author of the GetCollection() method decided to change return type to IList<int> - other overload will be called!

It should be clear from looking at the calling code which overloaded member will be called.

In order to solve this you can for example specify manually which overload must be used

templateMapping.Map<int>("v", GetCollection());

But wait. This API has a more fundamental problem. Overloaded members are meant to be semantically equivalent. But in this particular case mapping semantic changes based on parameters types! Latter example shows case where your application may behave differently.

This problem can be solved through giving each semantically different member unique descriptive name. For example,

// Maps named placeholder to data item
public void MapSingle<T>(string placeholderName, T data)

// Maps named placeholder to collection of data items - this is a repeatable placeholder
public void MapEnumerable<T>(string placeholderName, IEnumerable<T> data)

This will also allow to avoid overload resolution problems mentioned above.

Summary:

  • AVOID making overloads that are semantically different
  • DO NOT require from developers deep knowledge of overload resolution rules in order to use your API