Tuesday, December 29, 2009

Disposing sequence of resources with Reactive Extensions

Recall my previous post, Disposing sequence of resources, where we solved the following problems imperatively:

  • Create a single disposable representation for a sequence of disposable resources
  • Defer resource allocation to avoid exception propagation until cleanup can be guaranteed, and avoid unnecessary allocations (resources that are allocated but never used)
  • Dispose only consumed resources and preserve nested try{}finally{} block semantics (resources that are allocated first, in sequence order, are disposed last; any exception thrown from a finally block hides the exception being propagated)

Now that Reactive Extensions for .NET (Rx) is out, we will do it in a more LINQ-ish manner with the help of the interactive features of Reactive Extensions:

  • EnumerableEx.Publish – publishes the values of the source to each use of the bound parameter.
  • EnumerableEx.Share – shares a single cursor among all enumerators of the sequence.
  • EnumerableEx.Finally – invokes finallyAction after the source sequence terminates, whether normally or by an exception.

A great explanation of how EnumerableEx.Publish works (disregard the naming difference) is given in Taming Your Sequence’s Side-Effects Through IEnumerable.Let. The following example illustrates the point.

static Random m_seeder = new Random();

static IEnumerable<int> GenerateRandomSequence(int count)
{
  var rnd = new Random(m_seeder.Next());
  for(int i = 0; i < count; i++)
  {
    yield return rnd.Next(count);
  }
}

...

const int count = 5;
var xs = GenerateRandomSequence(count);
// Each time we iterate xs we may get a different sequence 
var equals = xs.SequenceEqual(xs);
// However it can be solved through memoization
xs.Publish(seq =>
             {
               // Every time we iterate seq we get the same 
               // sequence
               equals = seq.SequenceEqual(seq);
               return seq;
             });

EnumerableEx.Share makes sure that every iteration is made with respect to the same cursor.

var xs = Enumerable.Range(0, count);
// Prints every sequence element to console
// Without sharing, each of the count iterations would print 
// the first element of a potentially different sequence (recall 
// the random sequence example) 
var shared = xs.Share(); 
for(int i = 0; i < count; i++)
{
  shared.Take(1).Run(Console.WriteLine);
}

EnumerableEx.Finally does exactly what its description says (see more details here).

static IEnumerable<int> GenerateThrowingSequence(int count)
{
  for(int i = 0; i < count; i++)
  {
    if (i > 0 && i % 3 == 0)
    {
      throw new Exception();
    }
    yield return i;
  }
}

...

// Prints 0, 1, 2, Finally, Caught
try
{
  GenerateThrowingSequence(count).Finally(() => Console.WriteLine("Finally"))
    .Run(Console.WriteLine);
}
catch (Exception)
{
  Console.WriteLine("Caught");
}

// Prints 0, 1, Finally
GenerateThrowingSequence(count).Finally(() => Console.WriteLine("Finally"))
  .Take(2).Run(Console.WriteLine);

Now let’s put everything together. Publish will help us defer resource allocation and avoid unnecessary allocations. Share and Finally will take care of disposal.

static class Disposables
{
  // Disposes projected resources once they are no longer needed
  public static void Using<TSource, TResource>(
    // Source sequence projected to disposable resources
    this IEnumerable<TSource> source,
    // Resource projection
    Func<TSource, TResource> resourceSelector,
    // Resources usage action
    Action<IEnumerable<TResource>> resourcesUsage)
      where TResource : IDisposable
  {
    var rcount = 0;
    source
      // At this point resources are not created; 
      // only the projection is set
      .Select(
      s =>
        {
          // As we do not want to unnecessarily create 
          // and then immediately dispose potentially expensive
          // resources, we count created resources
          // and later dispose only the used ones
          rcount++;
          return resourceSelector(s);
        })
      .Publish(
      rs =>
        {
          // During sequence iteration resources will be created.
          // However, not all resources may be iterated through, or 
          // an exception may be thrown in the middle; thus 
          // not all resources may be created (and those that 
          // weren't must not be disposed)
          try
          {
            // The supplied resources sequence can be iterated 
            // multiple times, with each side effect occurring 
            // only once; sequence elements are memoized and 
            // reused during each iteration
            resourcesUsage(rs);
            return Enumerable.Empty<TResource>();
          }
          finally
          {
            // We must dispose only those resources we used
            // (counted and memoized above during first 
            // iteration)
            rs = rs.Take(rcount)
              // Disposing resources must be done in the opposite 
              // order to preserve nested try{}finally{} blocks 
              // semantics
              .Reverse().Do(r =>
                              {
                                rcount--;
                                r.Dispose();
                              })
              // Once a resource is disposed it must not be 
              // iterated again, and this is what Share takes 
              // care of
              .Share();

            Action final = null;
            final = () =>
                      {
                        // Stop once every resource was given 
                        // a chance to dispose, as Finally is 
                        // called even on empty sequences and 
                        // otherwise this leads to stack overflow
                        if (rcount > 0)
                        {
                          // Dispose only used resources and 
                          // leave untouched the rest
                          rs.Finally(final).Run();
                        }
                      };
            final();
          }
        })
      // Evaluate the sequence (triggers resources usage)
      .Run();
  }
}

The usage example below illustrates a situation where an exception is thrown during resource disposal. In this case we must give the preceding resources (in sequence order) a chance to be disposed. However, if an exception is thrown while disposing a preceding resource, that exception hides the previous one.

// Fake resource, full of side effects =)
class Resource : IDisposable
{
  private readonly int m_i;

  public Resource(int i)
  {
    m_i = i;
    Console.WriteLine("Created {0}", m_i);
  }

  public void Use()
  {
    Console.WriteLine("Using {0}", m_i);
  }

  public void Dispose()
  {
    Console.WriteLine("Disposed {0}", m_i);
    // Simulate resource disposal that results in exception
    if (m_i % 2 == 1)
    {
      throw new Exception(m_i.ToString());
    }
  }
}

...

try
{
  Enumerable.Range(0, 5)
    .Using(i => new Resource(i),
           rs =>
             {
               // First resources 0, 1 and 2 are created 
               // and used
               rs.Take(3).Run(r => r.Use());
               // then already created resource 2 is used 
               // and resource 3 is created and used
               rs.Skip(1).Take(3).Run(r => r.Use());
             });
}
catch (Exception ex)
{
  // As resources are disposed in the opposite order
  // the latest exception is propagated
  Console.WriteLine("Exception {0}", ex.Message);
}

This produces the following output:

Created 0 // iterating, if not iterated previously resource is created
Using 0
Created 1
Using 1
Created 2
Using 2
Using 1   // otherwise reused
Using 2   // reused again
Created 3 // wasn’t iterated previously, created
Using 3
Disposed 3 // disposing in the opposite order, throws exception
Disposed 2 // still disposing continues
Disposed 1 // throws exception that hides exception thrown earlier
Disposed 0 // disposing continues
Exception 1 // exception is caught

That’s it! I hope you’ve enjoyed it.

I hope we’ll meet next year. Happy New Year!

Wednesday, December 23, 2009

Chaining responsibilities

The idea behind the Chain of Responsibility pattern is quite simple and powerful:

Avoid coupling the sender of a request to its receiver by giving more than one object a chance to handle the request. Chain the receiving objects and pass the request along the chain until an object handles it.

You can find lots of object-oriented implementations out there, so as an exercise we will try to do it in a more functional way. For simplicity, Func<T, R> will be used as the handler contract. The basic idea looks like this:

Func<T, R> h = t =>
   {
       // Decide whether you can handle the request
       bool canHandle = ...;
       // Get the successor from somewhere
       Func<T, R> successor = ...;
       if (canHandle)
           // Handle the request represented by t
           return ...;
       else
           // Delegate the request to the successor
           return successor(t);
   };

The first thing to solve is how to get the successor. As handlers must support composition, a handler cannot simply create a closure over its successor. On the other hand, it can be represented as a function that returns the actual handler closed over its successor:

Func<Func<T, R>, Func<T, R>> h = successor => t => 
   {
       bool canHandle = ...;
       if (canHandle)
           // Handle the request represented by t
           return ...;
       else
           // Delegate the request to the closed-over successor
           return successor(t);
   };

Now we need to compose handlers into a chain.

// Creates chain of responsibility out of handlers
static Func<T, R> Chain<T, R>(IEnumerable<Func<Func<T, R>, Func<T, R>>> handlers)
{
    // By default, if none of the handlers can handle the incoming request an exception is thrown
    Func<T, R> notSupported = t => { throw new NotSupportedException(); };
    return Chain(handlers, notSupported);
}

// Creates chain of responsibility out of regular and default handlers
static Func<T, R> Chain<T, R>(IEnumerable<Func<Func<T, R>, Func<T, R>>> handlers, Func<T, R> def)
{
    // Assuming that the order of handlers within the chain must be the same as in the handlers sequence 
    return handlers
        // Handlers need to be reversed first, otherwise they will appear in the opposite order 
        .Reverse()
        // Iteratively close each handler over its successor
        .Aggregate(def, (a, f) => f(a));
}

To make it clearer, let’s expand the chaining of a simple two-handler case:

// default handler
Func<int, int> d = x => x;
// two handlers appear in sequence in order of declaration
Func<Func<int, int>, Func<int, int>> h1 = s => t => t < 10 ? t*2 : s(t);
Func<Func<int, int>, Func<int, int>> h2 = s => t => t < 5 ? t + 3 : s(t);

// 1. Reverse handlers
// h2
// h1

// 2. Close h2 over d
// tmp1 = t => t < 10 ? t * 2 : d(t);
Func<int, int> tmp1 = h2(d); 

// 3. Close h1 over tmp1
// tmp2 = t => t < 5 ? t + 3 : tmp1(t);
Func<int, int> tmp2 = h1(tmp1); 

// 4. tmp2 is the chain

Now handlers can be dynamically composed into chains to address a particular scenario. 
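Putting the pieces above together, the dispatch can be exercised end to end. Below is a minimal self-contained sketch (it repeats the two-argument Chain helper from above; the ChainDemo class name is just for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainDemo
{
    // Same Chain helper as defined above, repeated so the sketch is self-contained
    public static Func<T, R> Chain<T, R>(
        IEnumerable<Func<Func<T, R>, Func<T, R>>> handlers, Func<T, R> def)
    {
        return handlers.Reverse().Aggregate(def, (a, f) => f(a));
    }

    public static void Main()
    {
        // Default handler
        Func<int, int> d = x => x;
        // Handlers appear in the chain in declaration order
        Func<Func<int, int>, Func<int, int>> h1 = s => t => t < 10 ? t * 2 : s(t);
        Func<Func<int, int>, Func<int, int>> h2 = s => t => t < 5 ? t + 3 : s(t);

        var chain = Chain(new[] { h1, h2 }, d);

        Console.WriteLine(chain(3));  // h1 handles it: prints 6
        Console.WriteLine(chain(12)); // falls through h1 and h2 to the default: prints 12
    }
}
```

Note that because h1 comes first and accepts everything below 10, h2 only ever sees requests that h1 rejected; the order of handlers in the sequence determines who gets the first chance.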

As a chaining exercise let’s create the following application (a team of developers tries to handle a project):

  • The project is divided into a number of tasks whose complexity doesn’t exceed a particular threshold.
  • In order to handle the project, a development team is staffed. A developer with skill X can handle a task of complexity C when C <= X; otherwise he contributes to task completion, reducing the task’s complexity by X, and delegates the rest of the task to the next developer. A completed task is linked to the developer who completed it.
  • If the team cannot handle a particular task, they ask an external expert for help.

Prepare

// Staff development team that will do the project
static IEnumerable<Func<Func<int, int>, Func<int, int>>> Staff(int teamSize, int maxSkill)
{
    var rnd = new Random();
    for (int i = 0; i < teamSize; i++)
    {
        int dev = i;
        // Developers may differ in their skills
        int skill = rnd.Next(maxSkill);
        // If the developer can handle the incoming task he reports completion by returning his id
        // If not (not enough skill) he contributes to the task and delegates the smaller remaining task to the next developer
        yield return c => t => t <= skill ? dev : c(t - skill);
    }
}

// Create work break down structure for the project
static IEnumerable<int> Work(int projectSize, int maxTaskComplexity)
{
    var rnd = new Random();
    for (int i = 0; i < projectSize; i++)
    {
        yield return rnd.Next(maxTaskComplexity) + 1;
    }
}

and march to the end.

// Create tasks
var work = Work(projectSize, maxTaskComplexity).ToArray();
// If the team cannot handle a particular task they ask an unknown guru for help
Func<int, int> askForHelp = t => -1;
// Create chain out of developers to form a team with a backup
var team = Chain(Staff(teamSize, maxTaskComplexity), askForHelp);
// Hand out each task to the team
var project = from w in work
              select new {Task = w, CompletedBy = team(w)};

foreach(var status in project)
{
    Console.WriteLine("Task {0} completed by {1}", status.Task, status.CompletedBy);
}

Have chaining fun!

Monday, December 21, 2009

Making callbacks more explicit

Recall my previous post, Events and Callbacks vs. Explicit Calls, which outlined the pros and cons of both approaches (events and callbacks on one side, explicit calls on the other).

Explicit calls imply that the callee plays a certain role from the caller’s perspective, thus making the relationship between the two explicit as well. Consider a simple undo/redo example:

interface ICommand
{
    // Not every command can be undone. Returns true if it can be undone,
    // false - otherwise.
    bool CanUndo { get; }
    // Executes the command.
    void Do();
    // Reverts changes. NotSupportedException must be thrown in case 
    // command cannot be undone.
    void UnDo();
}

// Provides undo/redo mechanism.
class History
{
    // Makes supplied command part of the history.
    public void Remember(ICommand cmd)
    {
        // implementation is not relevant.
    }

    // other members elided.
}

The caller (History) clearly communicates its expectations by requiring the callee (a particular command) to conform to the command contract (an indivisible logical unit).

While being explicit makes code clearer, it has its price: it requires consumers to create new types that conform to the contract. That is not a problem if the created type will be reused by other parts of the application or has complex logic. But it is also common for a type to have trivial logic that is used only in the caller’s scenario (the undo/redo mechanism in this case). In that case creating a new type sounds like overhead.

I wish C# had capabilities (“Object Expressions”) similar to F#, where I could implement an interface “in place” like this:

let cmd = 
    { 
        new ICommand with 
            member x.CanUndo 
                with get() = false
            member x.Do() = Console.Write("Done") 
            member x.UnDo() = raise (new NotSupportedException())
    }

We can, however, provide a default implementation that uses delegates.

class ActionCommand : ICommand
{
    private readonly Action m_do;
    private readonly Action m_undo;

    public ActionCommand(Action doCallback, Action undoCallback)
    {
        if (doCallback == null)
        {
            throw new ArgumentNullException("doCallback");
        }
        m_do = doCallback;
        m_undo = undoCallback;
    }

    public bool CanUndo
    {
        get { return m_undo != null; }
    }

    public void Do()
    {
        m_do();
    }

    public void UnDo()
    {
        if (!CanUndo)
        {
            throw new NotSupportedException();
        }
        m_undo();
    }
}

While conforming to the contract, ActionCommand eases creation of lightweight, scenario-dedicated implementations “in place”, avoiding type proliferation. But the type must still be discovered first by developers. That is a negligible effort, but it is still nonzero. In order to level this effort, let the original consuming code do the job.

// Provides undo/redo mechanism
class History
{
    // Makes supplied command part of the history
    public void Remember(ICommand cmd)
    {
        // implementation is not relevant
    }

    // Makes command represented by pair of callbacks part of the history
    public void Remember(Action doCallback, Action undoCallback)
    {
        Remember(new ActionCommand(doCallback, undoCallback));
    }

    // other members elided
}

You should not turn every possible “shortcut” into an overload, only the most commonly used ones.

What benefits does this approach have? It stays close to an explicitly defined role, making the callback-based API easier to understand and use, which is valuable for lightweight, single-scenario implementations.
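With the callback-based overload, registering a lightweight command needs no new type at the call site. The sketch below carries compact copies of the ActionCommand and History definitions from above so it stands alone (the CallbackDemo class and the specific callbacks are hypothetical):

```csharp
using System;

interface ICommand
{
    bool CanUndo { get; }
    void Do();
    void UnDo();
}

// Compact copy of the post's ActionCommand
class ActionCommand : ICommand
{
    private readonly Action m_do;
    private readonly Action m_undo;

    public ActionCommand(Action doCallback, Action undoCallback)
    {
        if (doCallback == null) throw new ArgumentNullException("doCallback");
        m_do = doCallback;
        m_undo = undoCallback;
    }

    public bool CanUndo { get { return m_undo != null; } }
    public void Do() { m_do(); }
    public void UnDo()
    {
        if (!CanUndo) throw new NotSupportedException();
        m_undo();
    }
}

// Compact copy of the post's History with the callback-based overload
class History
{
    public void Remember(ICommand cmd) { /* implementation is not relevant */ }

    public void Remember(Action doCallback, Action undoCallback)
    {
        Remember(new ActionCommand(doCallback, undoCallback));
    }
}

class CallbackDemo
{
    static void Main()
    {
        var history = new History();
        int value = 0;

        // A lightweight command expressed "in place" as a pair of callbacks
        history.Remember(() => value++, () => value--);

        // Passing null for the undo callback yields a command with CanUndo == false
        history.Remember(() => Console.WriteLine("Done"), null);
    }
}
```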

Summary:

  • CONSIDER creating callback based overload next to overload that consumes corresponding abstraction

Thursday, November 5, 2009

Disposing sequence of resources

The C# “using” statement has several advantages over its expanded equivalent:

  • The shortcut is more readable
  • If the local-variable form of resource acquisition is used, the variable is read-only inside the using statement, which prevents you from spoiling resource disposal

Whenever you need to obtain several resources (their number known at compile time), use them, and then dispose of them, the “using” statement is usually the choice:

using(var aStream = File.Open("a.txt", FileMode.Open))
{
    using(var bStream = File.Open("b.txt", FileMode.Open))
    {
        // Use both streams
    }
}

However, that is not always the case: the number of resources to obtain may not be known at compile time. For example, a basic external merge sort separates a large file into chunks (the total number depends on the original file size and available memory) that can be sorted in memory and then written to disk. The sorted chunks are iteratively merged until a single chunk is left (which is the sorted original file). During a merge iteration several files must be opened (their number is not known in advance), processed, and then disposed. As we cannot use the “using” statement directly, it might look like this:

IEnumerable<string> files = ...; // Initialized elsewhere

var streams = new List<Stream>();
try
{
    // As we may get halfway through opening
    // files and get an exception because a file doesn't
    // exist, opened streams must be remembered
    foreach (var file in files)
    {
        streams.Add(File.Open(file, FileMode.Open));
    }

    // Use streams 
}
finally
{
    // Dispose opened streams
    foreach (var stream in streams)
    {
        stream.Dispose();
    }
}

Unfortunately we lost all advantages of the “using” statement (it looks messy, and the collection of opened streams or its contents can be modified before the “finally” block). It would be nice to have something like this:

using (var streams = ???)
{
    // streams must be IEnumerable<Stream>
}

For reference types the expansion of the “using” statement looks like this (struct types differ in how the resource is disposed):

using (ResourceType resource = expression) statement 

// is expanded to

{
    ResourceType resource = expression;
    try
    {
        statement;
    }
    finally
    {
        if (resource != null) ((IDisposable)resource).Dispose();
    }
}

If an exception happens during expression evaluation, the resource won’t be disposed (as there is nothing to dispose). However, any exceptions inside the statement are fine. So we need to somehow define how file names are converted into streams while still avoiding any exceptions. Lazy evaluation will be handy.

// Projected sequence won’t get evaluated until it is enumerated
// and thus file related exceptions (if any) are also postponed
files.Select(file => File.Open(file, FileMode.Open))

Still, we cannot use it inside a “using” statement as it is not IDisposable. So basically what we want is a disposable sequence that takes care of disposing its elements (which are required to be IDisposable).

interface IDisposableSequence<T> : IEnumerable<T>, IDisposable
    where T:IDisposable
{ }

A sequence of disposable elements can be wrapped through

static class Disposable
{
    // Defined as an extension method that augments minimal needed interface
    public static IDisposableSequence<T> AsDisposable<T>(this IEnumerable<T> seq)
        where T:IDisposable
    {
         return new DisposableSequence<T>(seq);
    }
}

class DisposableSequence<T> : IDisposableSequence<T>
    where T:IDisposable
{
    public DisposableSequence(IEnumerable<T> sequence)
    {
       ... // an implementation goes here
    }
    
    ... // Other members elided for now
}

We are close, but there is a subtle issue: obtaining a resource is a side effect. Enumerating multiple times through the sequence projected into resources will result in unwanted side effects, which of course must be avoided. In this particular case, enumerating (and thus projecting) the same element (file name) more than once will attempt to open an already opened file and result in an exception, as File.Open uses FileShare.None by default.

So we need to avoid side effects by memoizing obtained resources.

class DisposableSequence<T> : IDisposableSequence<T>
    where T : IDisposable
{
    private IEnumerable<T> m_seq;
    private IEnumerator<T> m_enum;
    private Node<T> m_head;
    private bool m_disposed;

    public DisposableSequence(IEnumerable<T> sequence)
    {
        m_seq = sequence;
    }

    public IEnumerator<T> GetEnumerator()
    {
        ThrowIfDisposed();

        // Enumerator is built traversing lazy linked list 
        // and forcing it to expand if possible
        var n = EnsureHead();
        while (n != null)
        {
            yield return n.Value;
            n = n.GetNext(true);
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    public void Dispose()
    {
        if (!m_disposed)
        {
            m_disposed = true;

            // As sequence creates enumerator it is responsible 
            // for its disposal
            if (m_enum != null)
            {
                m_enum.Dispose();
                m_enum = null;
            }

            // As it is possible that not all resources were 
            // obtained (for example, inside using statement 
            // only half of lazy evaluated sequence elements 
            // were enumerated and thus only half of resources 
            // obtained) we do not want to obtain them now
            // as they are going to be disposed immediately. 
            // Thus we traverse only through already created 
            // lazy linked list nodes and dispose obtained 
            // resources
            Dispose(m_head);

            m_seq = null;
        }
    }

    private Node<T> EnsureHead()
    {
        // Obtain enumerator once
        if (m_enum == null)
        {
            m_enum = m_seq.GetEnumerator();
            // Try to expand to first element
            if (m_enum.MoveNext())
            {
                // Created node caches current element
                m_head = new Node<T>(m_enum);
            }
        }
        return m_head;
    }

    private void ThrowIfDisposed()
    {
        if (m_disposed)
        {
            throw new ObjectDisposedException("DisposableSequence");
        }
    }

    private static void Dispose(Node<T> h)
    {
        if (h == null)
        {
            return;
        }

        try
        {
            // Disposing resources must be done in the opposite 
            // to usage order. With recursion it will have the 
            // same semantics as nested try{}finally{} blocks.
            Dispose(h.GetNext(false));
        }
        finally
        {
            h.Value.Dispose();
        }
    }

    class Node<V>
    {
        private readonly V m_value;
        private IEnumerator<V> m_enum;
        private Node<V> m_next;

        public Node(IEnumerator<V> enumerator)
        {
            m_value = enumerator.Current;
            m_enum = enumerator;
        }

        public V Value
        {
            get { return m_value; }
        }

        public Node<V> GetNext(bool force)
        {
            // Expand only if forced and not expanded before
            if (force && m_enum != null)
            {
                if (m_enum.MoveNext())
                {
                    m_next = new Node<V>(m_enum);
                }
                m_enum = null;
            }
            return m_next;
        }
    }
}

Once enumerated, resources are memoized inside a lazy linked list. The list expands only when more resources than those already memoized are requested.

After putting things together, our desired “using” statement usage looks like this:

using (var streams = files.Select(file => File.Open(file, FileMode.Open)).AsDisposable())
{
    // streams is IEnumerable<Stream> and IDisposable
}

Enjoy!

Update.

In general it is good practice to acquire a resource right before its usage and dispose of it when it is no longer needed; otherwise the system may experience resource exhaustion.

The approach described above can be used whenever resources should be acquired and disposed together (they all have the same actual usage time) and you do not know the number of resources in advance. Otherwise you must use one or more "using" statements and dispose of resources as they are no longer needed.

You must carefully consider that even if resources with different actual usage times are grouped under a single "using" statement (using the described approach), they won't be disposed until processing of all resources is completed (holding some of them unnecessarily), unless disposal is done explicitly inside the "using" statement (assuming that multiple calls to the Dispose method are allowed).

Saturday, October 31, 2009

Design focus on simplicity

Recently I did a design review for a simple scenario that roughly corresponds to the Command pattern, extended with inspection or modification of information before and after command execution. A simple design for a simple scenario. Though that wasn’t the case.

interface ICommand
{
    object Do(object arg);
}

interface ICommandInspector
{
    object BeforeCall(object arg);

    object AfterCall(object ret);
}

abstract class Command<T> : ICommand
    where T : ICommandInspector, new()
{
    public object Do(object arg)
    {
        var insp = new T();
        arg = insp.BeforeCall(arg);

        var ret = DoCore(arg);

        return insp.AfterCall(ret);
    }

    protected abstract object DoCore(object arg);
}

Besides the Command pattern itself, one more concept was introduced – a command inspector that handles the pre- and post-processing requirement. By the way, does this concept sound familiar to you?

For example, the following inspector measures command execution time.

class PerformanceCounterInspector : ICommandInspector
{
    private readonly Stopwatch m_sw = new Stopwatch();

    public object BeforeCall(object arg)
    {
        m_sw.Start();
        return arg;
    }

    public object AfterCall(object ret)
    {
        m_sw.Stop();
        Console.WriteLine(m_sw.Elapsed);
        return ret;
    }
}

The solution addresses the requirements. So, where is the problem? Complexity! Let us not forget that managing complexity is a primary topic in software development. A simpler solution that does the job should be preferred.

The proposed solution makes developers divide tasks into command and inspector aspects, although the only consumer of the inspector concept is the command itself, and it seems it could be a part of it. On the other hand, it is not that easy to understand what an inspector does unless you have a good understanding of how it is consumed by the command. A complex solution to a simple problem.

We can try to keep the solution as simple as possible in terms of employed concepts. How do we employ only the Command concept and still address the requirements? This is where the Decorator pattern is handy.

abstract class CommandDecorator : ICommand
{
    private readonly ICommand m_cmd;

    protected CommandDecorator(ICommand cmd)
    {
        m_cmd = cmd;
    }

    public virtual object Do(object arg)
    {
        return m_cmd.Do(arg);
    }
}

It is easier to understand the intent of the added functionality as it is not separated from the command concept.

class PerformanceCounterCommandDecorator : CommandDecorator
{
    public PerformanceCounterCommandDecorator(ICommand cmd)
        : base(cmd)
    {
    }

    public override object Do(object arg)
    {
        var sw = new Stopwatch();
        sw.Start();

        var ret = base.Do(arg);

        sw.Stop();
        Console.WriteLine(sw.Elapsed);
        return ret;
    }
}

The Decorator pattern naturally promotes dynamic functionality composition, which is especially important when there are several ways to compose functionality. It also favors composition over inheritance (a good thing, as inheritance violates encapsulation and may lead to fragile designs). 
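To make the composition point concrete, decorators can be stacked around a command at run time in whatever order a scenario needs. The sketch below assumes a trivial, hypothetical EchoCommand and a logging decorator written directly against ICommand (both are illustration-only names, not part of the reviewed design):

```csharp
using System;

interface ICommand
{
    object Do(object arg);
}

// Hypothetical concrete command: returns its argument unchanged
class EchoCommand : ICommand
{
    public object Do(object arg) { return arg; }
}

// A decorator that logs around the wrapped command's execution
class LoggingCommandDecorator : ICommand
{
    private readonly ICommand m_cmd;

    public LoggingCommandDecorator(ICommand cmd) { m_cmd = cmd; }

    public object Do(object arg)
    {
        Console.WriteLine("Before: {0}", arg);
        var ret = m_cmd.Do(arg);
        Console.WriteLine("After: {0}", ret);
        return ret;
    }
}

class DecoratorDemo
{
    static void Main()
    {
        // Decorators nest at run time; add or remove layers without
        // touching the command or the other decorators
        ICommand cmd = new LoggingCommandDecorator(new EchoCommand());
        cmd.Do(42);
    }
}
```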

Summary:

  • DO NOT introduce new concepts into your design unless you have strong reasons to do so
  • DO always assess complexity of your design
  • DO strive for concepts that support composition

Tuesday, September 29, 2009

False promises

I do not like ads, because the way they are made and presented makes me think the advertisement tells me what a product cannot do instead of what it can do (too many promises sound like no promises at all).

As a developer you advertise your component’s capabilities through its API. Unfortunately, it so happens that we (developers) are sometimes no better than those “can do everything” magic product advertisers.

Consider the following public API (it is abstract but enough to illustrate the point):

class Container
{
    public void Put<T>(T item, string name);
    public T Get<T>(string name);
}

Let’s look at it assuming that the author had the “Self-Documenting Object Models” principle in mind. What are your expectations? I would make assumptions like:

  • it is a container for named items of arbitrary type
  • it won’t cause unnecessary boxing/unboxing for items of value types, as both methods are generic
  • two items with the same name but of different types won’t overlap; otherwise, what kind of behavior should I expect in cases like this (explicit type parameters are used for illustration purposes)?
    Container c = ...; // initialized elsewhere
    c.Put<int>(1, "one");
    var d = c.Get<string>("one");
    

And now let’s look at the implementation (I’ve seen similar things several times):

class Container
{
    Dictionary<string, object> items = new Dictionary<string, object>();

    public void Put<T>(T item, string name)
    {
        items[name] = item;
    }

    public T Get<T>(string name)
    {
        return (T)items[name];
    }
}

How many of our expectations are met? Only one – it is a container, after all. It seems the public API advertised nonexistent capabilities. It turns out the author had a different set of assumptions:

  • name is the only identifier of an item
  • generic methods allow callers to avoid explicit casts

Wow! That is an absolutely different story. Instead of the advertised capabilities I got the questionable advantage of not having to cast explicitly (is it really an advantage in this case?):

Container c = ...; // initialized elsewhere
var one = c.Get<int>("one");
// compared to
var one = (int)c.Get("one");
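
The gap between the advertised and actual capabilities is easy to demonstrate: with the dictionary-backed implementation above, the name is the only key, so a second Put under the same name silently replaces the first item, and the seemingly type-safe Get fails at run time. A self-contained sketch (the ContainerDemo class name is hypothetical):

```csharp
using System;
using System.Collections.Generic;

// The implementation from above, repeated so the sketch runs standalone
class Container
{
    Dictionary<string, object> items = new Dictionary<string, object>();

    public void Put<T>(T item, string name) { items[name] = item; }

    public T Get<T>(string name) { return (T)items[name]; }
}

class ContainerDemo
{
    static void Main()
    {
        var c = new Container();
        c.Put<int>(1, "one");        // boxes the int into object anyway

        // The name is the only identifier: the string silently
        // replaces the previously stored int
        c.Put<string>("uno", "one");

        try
        {
            var one = c.Get<int>("one");
        }
        catch (InvalidCastException)
        {
            // The "type-safe" generic Get fails at run time instead
            Console.WriteLine("InvalidCastException");
        }
    }
}
```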

An API must be minimal yet sufficient to fulfill its scenarios and clearly communicate the type’s capabilities. A container for named items treated as objects may look like this:

class Container
{
    Dictionary<string, object> items = new Dictionary<string, object>();

    public void Put(object item, string name)
    {
        items[name] = item;
    }

    public object Get(string name)
    {
        return items[name];
    }
}

It doesn’t meet my original expectations, but at least it makes no false promises that may lead to confusion.

Summary:

  • DO carefully consider usage scenarios
  • DO provide minimal and sufficient API that clearly communicates type’s capabilities

Sunday, August 30, 2009

Single Responsibility Principle – discovering violations

Single Responsibility Principle states:

A class should have only one reason to change. Responsibilities are reasons to change.

Violations of this principle leave you face to face with fragile design and all the maintenance nightmares it implies. Unfortunately there is no foolproof way to prevent violations from happening – it is just the nature of design.

Common sense often helps to spot multiple responsibilities mixed within a single class. However, some cases are tricky and require careful attention. This is where design assessment comes in. An object can be looked at from two perspectives:

  • Data
    • What it knows?
    • What connections between objects it maintains?
  • Behavior
    • What it decides?
    • What services it performs?

Uncontrolled growth in either group can lead to a god class (a data or behavior form, respectively).

What it knows?

Keeping related data and behavior in one place is essential to building consistent abstractions. Failing to do so may lead to:

  • unnecessary details disclosure (by exposing all the data necessary to implement behavior)
  • duplicated or inconsistent behavior (due to poor discoverability)
  • uncontrolled growth (as it is harder to see the big picture due to scattered behavior)

Following Information Expert pattern promotes the idea:

Assign a responsibility to the class that has the information needed to fulfill it.

Consider the example below, where a publicly visible person name could incorrectly be used to distinguish people (as names are not unique). Thus it makes sense to let the Person type define what equality means, as it has all the necessary information (the social security number). This prevents the introduction of several inconsistent equality behaviors across the code base.

class Person
{
    private string name;
    // Keep social security number private
    private string ssn;
    
    // Names are not unique
    public string Name { get { return name; } }

    public override bool Equals(object obj)
    {
        if (obj == null || obj.GetType() != typeof(Person))
            return false;
        // Names cannot be used to identify people, unlike 
        // social security numbers
        return ssn == ((Person) obj).ssn;
    }

    // Equals and GetHashCode must be overridden together
    public override int GetHashCode()
    {
        return ssn == null ? 0 : ssn.GetHashCode();
    }
}

Keeping related data and behavior in one place is as important as not letting unrelated sets of data/behavior be put into the same class. Most of the methods defined on a class should use most of the data members most of the time.

Many of you started working while still being students. An employer needs your ability to work and most likely isn’t interested in the fact that you are still studying.

// Smart student - works and studies
class Student
{
    private string name;
    private int knowledge;
    private Func<Course, bool> preferences;
    private int experience;

    public string Name
    {
        get { return name;}
    }

    public void Study()
    {
        knowledge++;
    }

    public void Enlist(IEnumerable<Course> courses)
    {
        // Select appropriate courses and enlist
        foreach (var course in courses.Where(preferences))
            course.Enlist(name);
    }

    public void Work()
    {
        experience++;
    }

    public void Sign(Contract contract)
    {
        // sign job contract with your name
        contract.Agree(name);
    }
}

This class clearly has more than one responsibility, and the fact that study-related methods and work-related methods each operate on their own subset of the data shows that. Separation solves the problem.

class Student
{
    private Person person;
    private int knowledge;
    private Func<Course, bool> preferences;

    public Student(Person newStudent)
    {
        person = newStudent;
    }

    public void Study()
    {
        knowledge++;
    }

    public void Enlist(IEnumerable<Course> courses)
    {
        // Select appropriate courses and enlist
        foreach(var course in courses.Where(preferences))
            course.Enlist(person.Name);
    }
}

class Employee
{
    private Person person;
    private int experience;

    public Employee(Person newEmployee)
    {
        person = newEmployee;
    }

    public void Work()
    {
        experience++;
    }

    public void Sign(Contract contract)
    {
        // sign job contract with person's name
        contract.Agree(person.Name);
    }
}
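With the split in place the same person can play both roles independently (a usage sketch; GetPerson is a placeholder for obtaining a Person instance):

Person john = GetPerson(); // obtained elsewhere

var student = new Student(john);
var employee = new Employee(john);

// Each role now evolves independently: studying doesn't touch
// work-related state and vice versa
student.Study();
employee.Work();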

What connections between objects it maintains?

What is a connection between objects anyway? For example, a document can be seen as a set of glyphs (text, graphics, structural elements like columns and rows). The connection between them is the fact that they belong to the same document – so basically the document maintains the connections between glyphs. What must attract our attention in this area to spot Single Responsibility Principle violations?

If an object maintains connections between objects of different abstraction levels, it is highly likely that it does someone else’s job.

Consider an object model where a man wants to build a house. The Man class maintains the connection between tasks (obtained from an architect) and the workers who build the house. And this is where different abstraction levels (from the domain model perspective) meet: the man hired a foreman who is supposed to manage the construction, but the man wants to control everything and treats the hired foreman as a simple worker, giving him tasks in a not necessarily correct order. Basically the man took over the foreman’s responsibility (in other words, the foreman abstraction isn’t consistent).

abstract class Task
{
    public bool Completed { get; private set; }

    public virtual void Do()
    {
        DoCore();
        Completed = true;
    }

    protected abstract void DoCore();
}

class Worker
{
    public virtual bool CanDo(Task task)
    {
        // Can do everything worker =)
        return true;
    }

    public virtual void Accept(Task task)
    {
        task.Do();
    }
}

class Foreman : Worker
{
    IEnumerable<Worker> workers;
    Func<Task, bool> next;

    public override bool CanDo(Task task)
    {
        // Can do things only on time or otherwise the house 
        // won't stand long
        return next(task);
    }  

    public override void Accept(Task task)
    {
        // Foreman looks for a worker who can do the task
        workers.Where(w => w.CanDo(task)).First().Accept(task);
    }
}

// A man who tries to manage his new house construction
class Man
{
    IEnumerable<Task> construction;
    Foreman foreman;

    public void BuildHouse()
    {
        IEnumerable<Task> tasks;
        // Since we don't know what to do next we'll ask the foreman
        while ((tasks = construction.Where(t => foreman.CanDo(t) && !t.Completed)).Any())
        {
            foreach (var task in tasks)
            {
                foreman.Accept(task);
            }
        }
    }
}

The man must let the foreman manage tasks or become a foreman himself (we’ll take the first approach, or otherwise the house won’t stand long).

class Foreman
{
    IEnumerable<Worker> workers;
    Func<Task, bool> next;
    
    public void Manage(IEnumerable<Task> construction)
    {
        IEnumerable<Task> tasks;
        // Foreman selects tasks that are not completed and must be done next
        while((tasks = construction.Where(t => !t.Completed && next(t))).Any())
        {
            foreach (var task in tasks)
            {
                workers.Where(w => w.CanDo(task)).First().Accept(task);
            }
        }
    }
}

class Man
{
    IEnumerable<Task> construction;
    Foreman foreman;

    public void BuildHouse()
    {
        // The foreman takes the whole construction to manage
        foreman.Manage(construction);
        // Just check that everything is done
        if (construction.Where(t => !t.Completed).Any())
        {
            throw new InvalidOperationException("Something isn't done!");
        }
    }
}

Now the abstractions are leveled – the man lets the foreman manage the construction (the foreman abstraction is now consistent).

What it decides?

It is quite uncomfortable when someone asks for your opinion, then makes up his own mind and tells you what to do and how to do it. Why not tell me what to do in the first place and let me decide how to do it? Doesn’t it look like you are doing my job? This is what the Tell Don’t Ask principle is about:

As the caller, you should not make decisions based on the called object’s state and then change the object’s state.

Here is a couple of Teenager definitions – which one do you think makes more sense?

class Teenager
{
    private int age;
    private List<string> clothes;

    // Is it age that drives your clothes preferences?
    public int Age { get { return age; } }

    // Do you really want your parents to dress you 
    // just based on your age?
    public void TakeNew(string clothing)
    {
        clothes.Add(clothing);
    }
}

// ... or

class Teenager
{
    private List<string> clothes;
    // Preference is something personal
    private Func<string, bool> preference;

    // And you want to be able to select clothes based on your 
    // preferences?
    public bool SelectNew(IEnumerable<string> shop)
    {
        var clothing = shop.Where(preference).FirstOrDefault();
        if (clothing != null)
        {
            // Got something you like
            clothes.Add(clothing);
        }
        // Tell your parents whether you need to go to other shop =)
        return clothing != null;
    }
}
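A caller-side sketch makes the difference visible (illustrative only; PickClothingFor and NextShop are hypothetical helpers):

// Ask-style (first definition): the parent inspects state and decides
if (teenager.Age > 12)
{
    teenager.TakeNew(PickClothingFor(teenager.Age));
}

// Tell-style (second definition): the parent tells what to do,
// the teenager decides how
while (!teenager.SelectNew(shop))
{
    shop = NextShop(); // nothing suitable here, try another shop
}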

By violating the Tell Don’t Ask principle, the caller takes over decision-making responsibility that belongs to the called object. Busted!

What services it performs?

A service provider can do its job by itself or ask collaborators for help. Keeping dependencies on collaborators explicit facilitates controlled class growth. Classes should not contain more objects than a developer can fit in his or her short-term memory (5–7). Otherwise it is possible to introduce responsibilities into a class which are either duplicated or not related. Whenever the number of collaborators grows beyond 5–7, you should consider whether all collaborators still help you form a single abstraction or a new one has been introduced.

On the other hand, it makes sense to look at consumers, and in particular at how they consume the supplied services. If consumers use different subsets of the provided services, it may be a sign that the class captures more than one responsibility.

Recall the student example. The original class had two consumers: an employer and a university (for example). Each consumer was interested in a subset of the provided methods (work-related and study-related respectively). And as we discovered, the class captured two abstractions: employee and student.

Summary:

  • DO pay careful attention to responsibilities assignment
  • CONSIDER decomposing class into areas (what it knows, maintains, does and decides) and performing assessment in each area

Thursday, August 13, 2009

Inject or locate dependencies?

The Inversion of Control pattern decouples components (consumers) from their dependencies and takes care of dependency location and lifetime management by delegating these responsibilities to a component external to the dependent type. This pattern is actively used in composite application development.

Inversion of Control comes in two flavors (Unity provides both capabilities):

A service locator holds references to services and knows how to locate them. It is then used by dependent components to obtain necessary services. In other words, consumers play an active role.

interface IService
{
    void Do();
}

class ActiveConsumer
{
    private readonly IUnityContainer locator;

    // Reference to service locator comes from outside
    public ActiveConsumer(IUnityContainer serviceLocator)
    {
        locator = serviceLocator;
    }

    public void Do()
    {
        // In order to fulfill its task active consumer relies 
        // on service implementation that is obtained on demand 
        // from service locator
        var service = locator.Resolve<IService>();
        service.Do();
    }
}

Dependency injection makes dependent components passive (little or no work is done to get their dependencies). The only responsibility consumers still carry is to express their dependencies somehow (how dependencies are expressed depends on the pattern implementation, but for this example we will use the single-constructor automatic injection supported by Unity).

class PassiveConsumer
{
    private readonly IService service;

    // This constructor is used to inject service dependency
    public PassiveConsumer(IService svc)
    {
        service = svc;
    }

    public void Do()
    {
        // We got this dependency from outside and done nothing 
        // to let it happen - so just use it
        service.Do();
    }
}

...

// At this point container resolves consumer's dependency 
// and injects it during construction 
var passiveConsumer = container.Resolve<PassiveConsumer>();
passiveConsumer.Do();
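For completeness, here is a minimal sketch of the container setup that makes the resolution above work (ServiceImpl is a hypothetical IService implementation; RegisterType is Unity’s standard type mapping registration):

var container = new UnityContainer();
// Map the IService dependency to a concrete implementation so the
// container can inject it into PassiveConsumer's constructor
container.RegisterType<IService, ServiceImpl>();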

So what is the difference?

First, is a dependency on the service locator appropriate? If the component in question is supposed to be reused by others, you may end up putting unnecessary constraints on them (for example, you are using some open source service locator, but developers who could reuse your component are not allowed to use any open source software due to customer demand and thus won’t be able to reuse the component).

Second, dependency visibility. A service locator makes the consumer’s “real” dependencies hidden and the dependency on the service locator itself visible. When dependencies are explicit it is much easier to understand the dependent class. Explicit dependencies allow you to assess and control the growth of the component. For example, if your component accepts 10 services in its constructor, it may be a sign that it does, knows or decides too much and it is time to split it. Try assessing the same thing when using a service locator: in order to spot the number of dependencies you need to look for all unique usage occurrences of the service locator. It is not that hard with a modern IDE, but still not as easy as looking at the component’s contract.

On the other hand, it makes sense to consider the audience of the component. If it will be reused by others and its dependencies are hidden, using it may require deep knowledge of the component’s inner workings.

Third, the consumer’s relation with its dependency. Dependency injection promotes constant relations (from a lifetime perspective): the consumer obtains its dependency at construction time and lives with it. A service locator, on the other hand, encourages temporary relations – get a service instance when it is time, call its methods, discard it. Why discard? Because if the component has a constant relation, why not use dependency injection, which gives you explicit dependencies?

But anyway, what particular case forces locator usage? When the consumer has a longer lifetime than its dependency. For example, you are writing a smart client application. You organized the presentation layer using the Model-View-Presenter pattern. A presenter calls a remote service in response to user interaction. The view controlled by the presenter can stay open for a long time. If the presenter gets the remote service proxy dependency only once, the proxy may go into a faulted state (for example, due to network connectivity problems) and any subsequent calls to it will result in an exception. So it is better to dispose of the proxy every time a particular task is accomplished and create a new one when a new task is on the way – or cache it and create a new one in response to the proxy going faulted (which is of course harder, as you need to handle all the places where it is used and maintain the cache). Thus it seems that a service locator is more appropriate in this case.

However, we can make short-lived dependencies explicit. Let’s assume that an IService implementation instance must be disposed of every time it is used.

interface IService : IDisposable
{
    void Do();
}

// This is still an active consumer as it uses the service locator to get service instances
class Consumer
{
    private readonly IUnityContainer locator;

    public Consumer(IUnityContainer serviceLocator)
    {
        locator = serviceLocator;
    }

    public void Do()
    {
        using (var service = locator.Resolve<IService>())
        {
            service.Do();
        }
    }
}

A service locator has a wide surface (in terms of the services it can provide) and this makes the consumer’s contract opaque. What we need to do is narrow the surface while still providing the ability to create service instances (as we need to dispose of them every time). An abstract factory will do the trick: a factory provides a clear connection with the service it creates. On the other hand, we need to make the consumer’s dependency on the factory explicit. We will use dependency injection.

interface IServiceFactory
{
    IService CreateService();
}

// Consumer is no longer active as it gets its dependencies from outside
class Consumer
{
    private readonly IServiceFactory factory;

    // The dependency is now explicit
    public Consumer(IServiceFactory serviceFactory)
    {
        factory = serviceFactory;
    }

    public void Do()
    {
        // We still can create service instances on demand
        using (var service = factory.CreateService())
        {
            service.Do();
        }
    }
}

How about this? And that is not all: the factory clearly and explicitly states the relation between the dependent component and its dependency (the service created by the factory) – a temporary relation, since the factory provides the ability to create new instances.
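A minimal factory implementation closes the loop (a sketch; ServiceImpl stands for any concrete disposable IService implementation and is my assumption):

class ServiceFactory : IServiceFactory
{
    public IService CreateService()
    {
        // Every call yields a fresh instance, which matches the
        // temporary relation: create, use, dispose
        return new ServiceImpl();
    }
}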

Summary:

  • DO make dependencies explicit
  • DO use dependencies to assess and control growth of the dependent component
  • CONSIDER component audience when choosing between dependency injection and service locator
  • CONSIDER using the abstract factory pattern with dependency injection to make short-lived (in comparison with the dependent component’s lifetime) dependencies explicit

Wednesday, August 5, 2009

Type parameter leveling

At some point developers have to decide whether to use a generic class or a generic method (in other words, whether to put a type parameter at type level or at method level).

A type parameter at type level (a generic type’s type parameter) allows enforcing constraints across multiple members that use it. For example, IEnumerable&lt;T&gt; fixes the kind of enumerators it can produce to IEnumerator&lt;T&gt;.

public interface IEnumerable<T> : IEnumerable
{
    IEnumerator<T> GetEnumerator();
}

A generic type may be used when:

  • it implements a generic interface
  • it has internal state that depends on the type parameter
  • its type parameter implicitly acts as internal state

You can also look at the alternatives from a granularity point of view. A type parameter of a generic type has wider influence than a type parameter of a generic method. If a type parameter makes sense only for a single member, it doesn’t make sense to put it at a higher level than the member that uses it. Otherwise you force consumers of the other members to depend on a type parameter and constraints they may not use.

static class Util<T>
    where T: IComparable<T>
{
    public static T Min(T left, T right)
    {
        return left.CompareTo(right) < 0 ? left : right;
    }
    
    public static void Swap(ref T left, ref T right)
    {
        var tmp = left;
        left = right;
        right = tmp;
    }
}

...

int i1 = 10, i2 = 15;
var min = Util<int>.Min(i1, i2);

object o1 = i1; object o2 = i2;
// Not allowed and results in compilation error
Util<object>.Swap(ref o1, ref o2);

In the case above the type parameter introduces a constraint which makes sense only for the Min method and thus limits the capabilities of the Swap method (Swap can operate on any arguments while Min can only operate on comparable ones). Moreover, a generic type argument must be specified explicitly, so the API cannot benefit from type inference, which simplifies usage.

static class Util
{
    public static T Min<T>(T left, T right)
        where T: IComparable<T>
    {
        return left.CompareTo(right) < 0 ? left : right;
    }

    public static void Swap<T>(ref T left, ref T right)
    {
        var tmp = left;
        left = right;
        right = tmp;
    }
}

...

int i1 = 10, i2 = 15;
var min = Util.Min(i1, i2);

object o1 = i1; object o2 = i2;
// Now it is perfectly legal
Util.Swap(ref o1, ref o2);

Changing the type parameter level with respect to members that depend on it within an inheritance hierarchy may lead to Design by Contract and Liskov Substitution Principle violations.

I recently came across an API which does this trick. In the abstracted example below, ISomething states that its DoSomethingWith method can do something with an enumerable of any T.

interface ISomething
{
    void DoSomethingWith<T>(IEnumerable<T> target);
}

But the implementation provides a specialization on a fixed type parameter. It changed the type parameter level from method to type level – the DoSomethingWith method implementation depends on the generic type’s type parameter, whereas the interface states the method’s dependency on the method’s type parameter.

class FixedSomething<U> : ISomething
{
    private Func<U, bool> predicate;
    private IEnumerable<U> slice = Enumerable.Empty<U>();

    public void DoSomethingWith<T>(IEnumerable<T> target)
    {
        if (typeof (U) != typeof (T))
            // Cannot do something on argument of type T. U type is expected.
            throw new ArgumentException();

        slice = target.Cast<U>().Where(predicate).Concat(slice);
    }

    //... other members elided for clarity 
}

It violates the ISomething contract as it strengthens the input parameter type constraint and introduces a new exception that may be thrown. It also performs a runtime type check, which violates the Liskov Substitution Principle.
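For comparison, a conforming implementation keeps the type parameter at method level and accepts any T, exactly as the interface promises (a sketch under that assumption):

class AnySomething : ISomething
{
    private readonly List<object> seen = new List<object>();

    public void DoSomethingWith<T>(IEnumerable<T> target)
    {
        // No runtime type check and no strengthened precondition:
        // any T is acceptable, as ISomething advertises
        foreach (var item in target)
        {
            seen.Add(item);
        }
    }

    //... other members elided for clarity
}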

Summary:

  • DO use generic type when it implements generic interface or has internal state that depends on type parameter.
  • CONSIDER type parameter granularity.
  • DO NOT change type parameter level within inheritance hierarchy.

Monday, July 27, 2009

Generic method type parameter as Type parameter

Generics is another mechanism offered by the common language runtime (CLR) and programming languages that provides one more form of code re-use: algorithm re-use.
Jeffrey Richter, CLR via C# 2nd

However, abstracting an algorithm from the data types it operates on is not the only usage scenario. A type parameter of a generic method can also be treated as a parameter of type Type. Consider the following example:

// T IUnityContainer.Resolve<T>()
IService service = unityContainer.Resolve<IService>();

The Unity container implementation uses the supplied type parameter to resolve an instance of the requested type from the container. It is beneficial as it allows us to read the code naturally and avoids unnecessary type casts.

This technique is quite often used when building fluent interfaces (in particular, Method Chaining). Consider an Autofac container configuration example:

var builder = new ContainerBuilder();
builder.Register<Straight6TwinTurbo>().As<IEngine>().FactoryScoped();

The configuration can be read as a natural language sentence.

However it is only half of the truth. There are several scenarios where this approach doesn’t work so well.

Not all languages support generics, and others have syntax that makes generic calls harder to read. For example, the same thing written in VB.NET doesn’t read as well as in C#:

builder.Register(Of Straight6TwinTurbo)().As(Of IEngine)().FactoryScoped()

On the other hand, this approach works well only when the type parameter is known at compile time; it complicates more dynamic scenarios where the type is deferred until runtime.

For example, Unity container has an Extension called StaticFactoryExtension. The purpose of this extension is to add the ability to register types within the container while deferring the instantiation of the type to a factory method. The extension is configured through IStaticFactoryConfiguration:

public interface IStaticFactoryConfiguration : IUnityContainerExtensionConfigurator
{
    IStaticFactoryConfiguration RegisterFactory<TTypeToBuild>(FactoryDelegate factoryMethod);
    IStaticFactoryConfiguration RegisterFactory<TTypeToBuild>(string name, FactoryDelegate factoryMethod);
}

It is used like this:

container.AddNewExtension<StaticFactoryExtension>();
IStaticFactoryConfiguration config = container.Configure<IStaticFactoryConfiguration>();
config.RegisterFactory<IProductService>(ProductServiceFactory.Create);

Note however that the factory registration API includes only generic methods. This makes it harder to implement scenarios where calling code wants to enumerate types at runtime and register a particular factory for them.

To solve this problem the API should provide a non-generic version in addition to the generic one. For example, the Unity container provides both generic and non-generic versions of the Resolve method:

public interface IUnityContainer
{
    T Resolve<T>();
    object Resolve(Type t);

    //... other members elided for clarity
}
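Applied to the static factory extension, a non-generic counterpart might look like this (a sketch of what the API could offer, not an existing Unity member):

public interface IStaticFactoryConfiguration : IUnityContainerExtensionConfigurator
{
    // Existing generic registration
    IStaticFactoryConfiguration RegisterFactory<TTypeToBuild>(FactoryDelegate factoryMethod);

    // Hypothetical non-generic counterpart that enables registering
    // factories for types discovered at runtime
    IStaticFactoryConfiguration RegisterFactory(Type typeToBuild, FactoryDelegate factoryMethod);
}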

Summary:

  • CONSIDER audience of your API
  • CONSIDER providing non generic version of your API taking into account your audience and usage scenarios

Thursday, July 16, 2009

Scenario in terms of abstraction levels

Framework Design Guidelines promotes several principles but one of them is worth special attention:

Frameworks must be designed starting from a set of usage scenarios and code samples implementing these scenarios.

It allows framework designers to see their API from the consumer’s perspective, and the value this perspective produces should not be underestimated. Framework designers are experts in the subject area and are thus cursed by this knowledge. That is why it is highly important to express mainline usage scenarios in terms that consumers are familiar with or can easily get to know (advanced scenarios, of course, may require much deeper knowledge of the subject area).

Unfortunately designers quite often rely on standard design methodologies (including object-oriented ones) that are oriented more toward maintainability than toward usability of the API.

Recently I came across an API (or basically an approach promoted across an application) responsible for data persistence. In common scenarios data persistence is used in both directions (read/write); from the consumer’s perspective it looks like a single scenario. The suggested approach assumes that data is loaded through loaders and data changes are propagated back to the data source using the Unit of Work pattern:

// Represents some company information
internal class Company
{
    public int Id { get; set; }
    public int TrustLevel { get; set; }
}

// Loaders are responsible for data reading.
internal interface ICompanyLoader
{
    // Loads a limited set of data about the companies that have the
    // highest profits.
    IEnumerable<Company> LoadTopPerformingCompanies(int count);
}

// Here goes Unit of Work pattern.
internal interface IUnitOfWork
{
    // Methods responsible for tracking state changes and coordinating
    // writing them out.

    void RegisterNew(object obj);
    void RegisterDirty(object obj);
    void RegisterClean(object obj);
    void RegisterDeleted(object obj);
    void Commit();
    void Rollback();
}

Used like this:

// Loading data which is expressed in terms of business specific
var companies = companyLoader.LoadTopPerformingCompanies(10);

// Updating data which is expressed in terms of a selected 
// persistence layer implementation
foreach (var company in companies)
{
    company.TrustLevel++;
    unitOfWork.RegisterDirty(company);
}
unitOfWork.Commit();

So where is the problem? It seems working…

As mentioned from consumer’s perspective this is a single scenario. However the API expresses single scenario in terms of different abstraction levels:

  • Data loading is expressed in business-specific terms (load top performing companies – companies whose profits for the last year are higher than the others’)
  • Data updating is expressed in terms of state management (register an object as dirty, i.e. updated)

This is the catch. To implement the scenario the consumer needs to switch between abstractions, which dramatically reduces productivity and the readability of the code.

To avoid these problems a single level of abstraction must be used:

// It may be Table Data Gateway pattern 
interface ICompanyGateway
{
    // Loads a limited set of data about the companies that have the
    // highest profits.
    IEnumerable<Company> LoadTopPerformingCompanies(int count);

    // Updates companies trust
    void IncreaseCompaniesTrust(IEnumerable<Company> companies);
}

...

// Loading and updating data, both expressed in business-specific terms
var companies = companyGateway.LoadTopPerformingCompanies(10);
companyGateway.IncreaseCompaniesTrust(companies);

Summary:

  • DO NOT mix different abstraction levels within a single scenario
  • DO express the scenario in the terms the consumer expects

Friday, June 19, 2009

Commonality that is worth abstractions

Unfortunately the desire to achieve reusability does not always lead to the expected results. It can bring you development effort savings through reusing existing components, or it can lead to negative consequences (such as proliferation of unnecessary classes and decreased discoverability and readability).

Reusability is often based on abstractions. Abstraction captures some commonality between its concrete implementations. But abstractions being very powerful are dangerous as well.

Consider a situation where a piece of work is generalized and represented through the Command pattern. Basically a task (the commonality) resulted in a command (the abstraction).

interface ICommand
{
  // Performs command's task
  void Do(object parameter);
}

Is this commonality worth the abstraction? It depends on the rationale behind it.

It may result from a desire to factor out functionality that belongs to a particular category (for example, business rules) and, through implementing a common interface, outline membership in that category (and of course it may be expected that through the abstraction the commands will be reusable in other scenarios). However, with this rationale at hand, it is most likely that the code that creates a command also executes it (because it is the only scenario for now – more scenarios are coming :-)).

// Holds arguments for SomeCommand
class Arguments
{
  public string Arg1 { get; set; }
  public int Arg2 { get; set; }
}

class SomeCommand : ICommand
{
  public void Do(object parameter)
  {
      var args = (Arguments) parameter;
      // Do something with arguments
  }
}

...

var command = new SomeCommand();
var args = new Arguments { Arg1 = "value", Arg2 = 10 };
command.Do(args);

Unfortunately this rationale doesn’t justify introduction of the command abstraction. It forces calling code to consume a general-purpose opaque contract despite the fact that the calling code knows exactly what must be done. It decreases readability in comparison with a direct call to a comprehensively named method, and may result in a proliferation of unnecessary classes which are basically argument containers. Thus the abstraction introduces drawbacks with no benefits.

It is as important to resist premature abstractions as to introduce the right ones. Every abstraction, even one never used, is not free – it has maintenance costs attached, since an abstraction is considered a contract between the author and consumers that must be fulfilled.

Anytime you introduce a new abstraction, a concrete implementation of the abstraction and a consuming API must be provided as well. It is hard to see whether an abstraction is designed correctly unless there is a use case implemented that exercises it.

Consider the case when the calling code cannot execute commands itself; instead, commands must be executed in order of appearance (through a queue).

// Executes commands in order
interface ICommandQueue
{
  // Enqueues command with parameter that is known
  // at the time of enqueue
  void Enqueue(ICommand command, object parameter);

  // Enqueues command with function that will provide
  // parameter on demand
  void Enqueue(ICommand command, Func<object> parameter);
}

In this case introducing the abstraction is justified. The command queue conforms to the Open/Closed Principle: it is easily extensible with new types of commands because it relies on the command abstraction.
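To make the point concrete, here is a minimal single-threaded sketch of such a queue (the class names SimpleCommandQueue and LogCommand are hypothetical, introduced only for illustration):

```csharp
using System;
using System.Collections.Generic;

// Command abstraction from the article
interface ICommand
{
    void Do(object parameter);
}

// Hypothetical command used only for illustration: records its parameter
class LogCommand : ICommand
{
    private readonly List<string> m_log;
    public LogCommand(List<string> log) { m_log = log; }
    public void Do(object parameter) { m_log.Add((string)parameter); }
}

// Minimal sketch: each command is stored together with a function
// that yields its parameter on demand
class SimpleCommandQueue
{
    private readonly Queue<Action> m_pending = new Queue<Action>();

    // Enqueues command with parameter known at the time of enqueue
    public void Enqueue(ICommand command, object parameter)
    {
        Enqueue(command, () => parameter);
    }

    // Enqueues command with a function that provides the parameter on demand
    public void Enqueue(ICommand command, Func<object> parameter)
    {
        // The parameter function is evaluated only when the command runs
        m_pending.Enqueue(() => command.Do(parameter()));
    }

    // Executes queued commands in order of appearance
    public void RunAll()
    {
        while (m_pending.Count > 0)
        {
            m_pending.Dequeue()();
        }
    }
}

class Program
{
    static void Main()
    {
        var log = new List<string>();
        var queue = new SimpleCommandQueue();
        // New command types can be enqueued without changing the queue
        queue.Enqueue(new LogCommand(log), "first");
        queue.Enqueue(new LogCommand(log), () => "second");
        queue.RunAll();
        Console.WriteLine(string.Join(",", log)); // first,second
    }
}
```

The queue depends only on ICommand, so adding a new command type requires no change to the queue itself; that is precisely what makes the design open for extension but closed for modification.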

Classes that share only a common interface (not behavior) should inherit from a common base class or implement a common interface only if they will be used polymorphically.

Summary:

  • AVOID premature abstractions
  • DO carefully justify introduction of new abstractions
  • DO provide concrete implementation and consuming API for every abstraction introduced

Thursday, June 11, 2009

Events and Callbacks vs. Explicit Calls

From API design perspective interaction between components can be expressed in request/reply (explicit calls) or in event driven manner (events and callbacks). Consider BackgroundWorker usage example:

// BackgroundWorker executes an operation on a separate thread.
// The operation is basically an event handler of DoWork event
var worker = new BackgroundWorker();
// Once the worker is run this event will be raised on a separate thread
worker.DoWork += (sender, e) =>
                {
                   // Event handler does the actual work
                   // on a separate thread
                };
worker.RunWorkerCompleted += (sender, e) =>
                            {
                               // Performs the post-processing step
                               // The event is raised through the
                               // synchronization context
                            };
// Runs worker
worker.RunWorkerAsync();

Why did the .NET Framework designers select an event-based interaction model instead of request/reply (see the hypothetical example below)?

// Represents unit of background work
internal interface IBackgroundWork
{
    // Performs actual work and will be called
    // in a separate thread
    object Do();
    // Performs post processing step and will be
    // called through the synchronization context
    void Complete(object result);
}

internal class SomeBackgroundWork : IBackgroundWork
{
    public object Do()
    {
        // Do the actual work and return result
        return ...;
    }

    public void Complete(object result)
    {
        // Do post processing
    }
}

internal class BackgroundWorker
{
    // Runs work unit asynchronously
    public void RunWorkerAsync(IBackgroundWork work)
    {
        // elided
    }
}
var worker = new BackgroundWorker();
worker.RunWorkerAsync(new SomeBackgroundWork());

The short answer is that it is about the relationship between the participants (worker and provider), and in particular the direction of the dependency. The direction of the dependency is determined by the domain that owns the participants.

Consider the IComparable<T> interface. It is implemented by types whose instances can be ordered; implementing it on a type states that the type supports comparison. On the other hand, GenericComparer<T> uses IComparable<T> to compare instances.

[Serializable]
internal class GenericComparer<T> : Comparer<T>
    where T : IComparable<T>
{
    public override int Compare(T x, T y)
    {
        if (x != null)
        {
            if (y != null)
            {
                return x.CompareTo(y);
            }
            return 1;
        }
        if (y != null)
        {
            return -1;
        }
        return 0;
    }

    // other members elided
}

Explicit calls clearly express the relationship and the consequences of the action, but they limit extensibility to compile time and put more work on API consumers because they require them to create new types. In the "explicit" background worker example the consumer needs to create a new type for each unit of work, and it is highly likely that this type won't be used anywhere else.

Consider the ThreadPool class. It provides the ability to execute arbitrary code on a separate thread. ThreadPool has no interest in knowing its consumers; it is the consumers who depend on ThreadPool to perform a task. It uses a callback mechanism to execute arbitrary code within the thread pool.

ThreadPool.QueueUserWorkItem(state =>
                            {
                             // Does actual work
                            });

Events and callbacks do not create an explicit dependency and thus allow for runtime customization. Events have great tool support. They put less work on consumers because they don't require creating new types. But these benefits come at a price. It is hard to understand what happens just from looking at the code: to understand it, all subscriptions to the events must be examined. Triggering an event may result in an event cascade, where handling code raises the next event, creating a chain of events. From a performance perspective, events and callbacks are more heavyweight than explicit calls.

Summary:

  • CONSIDER using explicit calls to express the relationship (consuming several events or callbacks to perform a single task may be a sign that there is a role to discover)
  • CONSIDER using explicit calls if there is a potential to reuse called code in other use cases
  • CONSIDER using explicit calls to increase testability
  • AVOID expressing complex interactions through events or callbacks; the result will be really hard to understand
  • CONSIDER using events and callbacks to allow runtime customization and avoid undesired dependency

Friday, June 5, 2009

When member overload backfires

I do believe that quality of software design isn’t based solely on appropriate high-level decisions. It is the detailed quality decisions made on a smaller scale that make it possible.

An API influences its consuming code. Making it obvious and simple helps avoid hard-to-read, error-prone code.

Recently I caught myself writing the API presented below (I fixed it, honestly :-)):

// Represents basic document template - named placeholders can be mapped to data and
// resulting document is generated
public class DocumentTemplate
{
       // Maps named placeholder to data item
       public void Map<T>(string placeholderName, T data)
       {
              // Placeholder will be replaced with textual representation of the data
       }

       // Maps named placeholder to collection of data items - this is a repeatable placeholder
       public void Map<T>(string placeholderName, IEnumerable<T> data)
       {
              // Placeholder is treated as a repeatable piece
              // Data collection will be enumerated and for each item the placeholder body will
              // be duplicated and replaced with the textual representation of the item
       }

       // ... Other non-relevant members are omitted
}

Consider code snippet below – it uses type inference. Can you tell which overload will be called?

DocumentTemplate template = ...
var list = new List<int> {1, 2};
template.Map("v1", list); // #1
template.Map("v2", list.AsEnumerable()); // #2

It requires you to know the overload resolution rules to figure out which overload is called. In case #1 Map<List<int>>(…) is called, because the identity conversion to T = List<int> is preferred over the implicit reference conversion to IEnumerable<int>; in case #2 the argument's static type is IEnumerable<int>, so Map<int>(…) is called.
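The resolution can be observed directly with a self-contained sketch (the class Overloads and the returned tags are hypothetical stand-ins for the article's Map overloads):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for the article's Map overloads; each returns a tag so the
// chosen overload can be observed
static class Overloads
{
    // Mirrors Map<T>(string, T)
    public static string Map<T>(string placeholderName, T data)
    {
        return "single:" + typeof(T).Name;
    }

    // Mirrors Map<T>(string, IEnumerable<T>)
    public static string Map<T>(string placeholderName, IEnumerable<T> data)
    {
        return "enumerable:" + typeof(T).Name;
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2 };

        // Identity conversion to T = List<int> beats the implicit
        // reference conversion to IEnumerable<int>: single-item overload
        Console.WriteLine(Map("v1", list)); // single:List`1

        // Static type is IEnumerable<int>: collection overload, T = int
        Console.WriteLine(Map("v2", list.AsEnumerable())); // enumerable:Int32
    }
}
```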

But worse, it can lead to hidden bugs. Suppose we have a method like this:

static IEnumerable<int> GetCollection()
{
       return new List<int> { 1, 2 };
}

And it is used with the bug-prone templating API:

DocumentTemplate template = ...
template.Map("v", GetCollection());

In this case Map<int>(…) is called. But suppose the author of the GetCollection() method decides to change the return type to IList<int> - a different overload will be called!

It should be clear from looking at the calling code which overloaded member will be called.

To solve this you can, for example, specify explicitly which overload must be used:

template.Map<int>("v", GetCollection());

But wait. This API has a more fundamental problem. Overloaded members are meant to be semantically equivalent, but in this particular case the mapping semantics change based on the parameter types! The latter example shows a case where your application may silently behave differently.

This problem can be solved by giving each semantically different member a unique, descriptive name. For example:

// Maps named placeholder to data item
public void MapSingle<T>(string placeholderName, T data)

// Maps named placeholder to collection of data items - this is a repeatable placeholder
public void MapEnumerable<T>(string placeholderName, IEnumerable<T> data)

This also avoids the overload resolution problems mentioned above.
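A small sketch shows the payoff (the class Template and the returned tags are hypothetical; the method names follow the example above): with distinct names the call site states the intent explicitly, so changing GetCollection's return type from IEnumerable<int> to IList<int> cannot silently reroute the call.

```csharp
using System;
using System.Collections.Generic;

static class Template
{
    // Maps named placeholder to a single data item
    public static string MapSingle<T>(string placeholderName, T data)
    {
        return "single";
    }

    // Maps named placeholder to a collection of data items
    public static string MapEnumerable<T>(string placeholderName, IEnumerable<T> data)
    {
        return "enumerable";
    }

    // Return type changed from IEnumerable<int> to IList<int>:
    // with explicit names this no longer affects which member is called
    static IList<int> GetCollection()
    {
        return new List<int> { 1, 2 };
    }

    public static void Main()
    {
        // Still the collection mapping, regardless of the return type
        Console.WriteLine(MapEnumerable("v", GetCollection())); // enumerable
    }
}
```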

Summary:

  • AVOID making overloads that are semantically different
  • DO NOT require from developers deep knowledge of overload resolution rules in order to use your API