Monday, February 8, 2010

Binary heap based priority queue

Design of container that supports items ordering raises lots of interesting design questions to consider. To be more concrete we will design simple priority queue based on binary min heap that supports the following operations:

  • Enqueue – add an item to the queue with an associated priority.
  • Dequeue - remove the element from the queue that has the highest priority and return it.
  • Peek – look at the highest priority element without removing it.

Good design that solves wrong problems isn’t better than the bad one. So the first step is to identify right problems to solve. Priority queue maintains set of items with associated key (priority). Items get off the queue based on employed ordering mechanism for keys (priorities). Basically the two problems we need to solve (from API design perspective) are the ways to represent:

  • association of a key (priority) and corresponding item
  • ordering mechanism for keys (priorities)

Association can be either explicit (PriorityQueue<TItem, TKey>, where key type is explicitly stated) or implicit (PriorityQueue<TItem>, where key type is of no interest). Each type parameter must have concrete consumers. Priority queue itself doesn’t care (although priority queues with updateable priority do) about keys but rather about comparing keys. Client code cannot benefit from explicit keys as well because it can easily access associated key as the client code defines what key actually means. So there is no point in cluttering API with irrelevant details (of what keys really are). Thus we will use PriorityQueue<T> (as now we have the only type parameter we will use short name for it) and let consumers provide comparison logic of two items based on whatever consumer defines as keys.

There are several options to represent comparison mechanism.

Item type may be constrained to support comparison through generic type parameter constraint:

class PriorityQueue<T> 
  where T : IComparable<T>
{
}

Though this approach benefits from clearly stated comparison mechanism it implies significant limitations:

  • It doesn’t support naturally comparison of items of the same type using different aspects (for example, in one case objects of Customer type are compared using number of orders and in the other – using date of last order). Of course consumer can create lightweight wrapper that aggregates object to compare and does actual comparison but it is not feasible from additional memory consumption and additional usage complexity perspectives.
  • It doesn’t support naturally changing order direction (ascending <-> descending) and thus it may require adding support into the data structure itself.

With those limitations in mind we can use comparers – something that knows how to compare two objects:

  • A type that implements IComparer<T> which benefits from .NET Framework support (it provides great documentation support and default implementation).
  • or a delegate Func<T, T, int> that accepts two parameters of type T and returns integer value indicating whether one is less than, equal to, or greater than the other. It benefits from anonymous functions conveniences.

Comparers are designed for particular usage scenarios and single instance corresponds to items container. Thus limitation mentioned above are not applied to comparers.

Taking into account value of .NET Framework support for IComparer<T> and that it is easy to create wrapper that derives from Comparer<T> and delegates comparison to aggregated function we will use IComparer<T> approach (although it seems costless to add also support for Func<T, T, int> mechanism and create wrapper ourselves in most cases it is best to avoid providing means to do the same thing in multiple ways or otherwise potential confusion may outweigh benefits).

Now putting everything together.

// Unbounded priority queue based on binary min heap
public class PriorityQueue<T>
{
  private const int c_initialCapacity = 4;
  private readonly IComparer<T> m_comparer;
  private T[] m_items;
  private int m_count;

  public PriorityQueue()
    : this(Comparer<T>.Default)
  {
  }

  public PriorityQueue(IComparer<T> comparer)
    : this(comparer, c_initialCapacity)
  {
  }

  public PriorityQueue(IComparer<T> comparer, int capacity)
  {
    Contract.Requires<ArgumentOutOfRangeException>(capacity >= 0);
    Contract.Requires<ArgumentNullException>(comparer != null);

    m_comparer = comparer;
    m_items = new T[capacity];
  }

  public PriorityQueue(IEnumerable<T> source)
    : this(source, Comparer<T>.Default)
  {
  }

  public PriorityQueue(IEnumerable<T> source, IComparer<T> comparer)
  {
    Contract.Requires<ArgumentNullException>(source != null);
    Contract.Requires<ArgumentNullException>(comparer != null);

    m_comparer = comparer;
    // In most cases queue that is created out of sequence 
    // of items will be emptied step by step rather than 
    // new items added and thus initially the queue is 
    // not expanded but rather left full
    m_items = source.ToArray();
    m_count = m_items.Length;
    // Restore heap order
    FixWhole();
  }

  public int Capacity
  {
    get { return m_items.Length; }
  }

  public int Count
  {
    get { return m_count; }
  }

  public void Enqueue(T e)
  {
    m_items[m_count++] = e;
    // Restore heap if it was broken
    FixUp(m_count - 1);
    // Once items count reaches half of the queue capacity 
    // it is doubled 
    if (m_count >= m_items.Length/2)
    {
      Expand(m_items.Length*2);
    }
  }

  public T Dequeue()
  {
    Contract.Requires<InvalidOperationException>(m_count > 0);

    var e = m_items[0];
    m_items[0] = m_items[--m_count];
    // Restore heap if it was broken
    FixDown(0);
    // Once items count reaches one eighth  of the queue 
    // capacity it is reduced to half so that items
    // still occupy one fourth (if it is reduced when 
    // count reaches one fourth after reduce items will
    // occupy half of queue capacity and next enqueued item
    // will require queue expand)
    if (m_count <= m_items.Length/8)
    {
      Expand(m_items.Length/2);
    }

    return e;
  }

  public T Peek()
  {
    Contract.Requires<InvalidOperationException>(m_count > 0);

    return m_items[0];
  }

  private void FixWhole()
  {
    // Using bottom-up heap construction method enforce
    // heap property
    for (int k = m_items.Length/2 - 1; k >= 0; k--)
    {
      FixDown(k);
    }
  }

  private void FixUp(int i)
  {
    // Make sure that starting with i-th node up to the root
    // the tree satisfies the heap property: if B is a child 
    // node of A, then key(A) ≤ key(B)
    for (int c = i, p = Parent(c); c > 0; c = p, p = Parent(p))
    {
      if (Compare(m_items[p], m_items[c]) < 0)
      {
        break;
      }
      Swap(m_items, c, p);
    }
  }

  private void FixDown(int i)
  {
    // Make sure that starting with i-th node down to the leaf 
    // the tree satisfies the heap property: if B is a child 
    // node of A, then key(A) ≤ key(B)
    for (int p = i, c = FirstChild(p); c < m_count; p = c, c = FirstChild(c))
    {
      if (c + 1 < m_count && Compare(m_items[c + 1], m_items[c]) < 0)
      {
        c++;
      }
      if (Compare(m_items[p], m_items[c]) < 0)
      {
        break;
      }
      Swap(m_items, p, c);
    }
  }

  private static int Parent(int i)
  {
    return (i - 1)/2;
  }

  private static int FirstChild(int i)
  {
    return i*2 + 1;
  }

  private int Compare(T a, T b)
  {
    return m_comparer.Compare(a, b);
  }

  private void Expand(int capacity)
  {
    Array.Resize(ref m_items, capacity);
  }

  private static void Swap(T[] arr, int i, int j)
  {
    var t = arr[i];
    arr[i] = arr[j];
    arr[j] = t;
  }
}

Example below prints top 200 elements from sequence of mscorlib types ordered by full name (sorting it first and than taking first 200 elements is less efficient).

class TypeNameComparer : Comparer<Type>
{
  public override int Compare(Type x, Type y)
  {
    Contract.Requires(x != null);
    Contract.Requires(y != null);

    return x.FullName.CompareTo(y.FullName);
  }
}

...

const int count = 200;
var types = typeof (object).Assembly.GetTypes();
var typesQueue = new PriorityQueue<Type>(types, new TypeNameComparer());

for (int i = 0; i < count && typesQueue.Count > 0; i++)
{
  Console.WriteLine(typesQueue.Dequeue());
}

That’s it.

No comments: