    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: Sliding Window Filters : A Set-based Implementation</title>
        <link>https://members.accu.org/index.php/articles/2207</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Programming Topics + CVu Journal Vol 28, #1 - March 2016</span></div>

<table border="0" cellpadding="1" cellspacing="0">
    <tbody>
    <tr>
        <td valign="top">
            Browse in :
       </td>
       <td valign="top">

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c13/">Topics</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c65/">Programming</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/">All</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c76/">Journals</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c77/">CVu</a>

                     &gt;                         <a href="https://members.accu.org/index.php/articles/c359/">281</a>
<br />

                                            <a href="https://members.accu.org/index.php/articles/c65-359/">Any of these categories</a>

                    -                        <a href="https://members.accu.org/index.php/articles/c65+359/">All of these categories</a>
<br />
</td>
   </tr>
   </tbody>
</table>




<div class="xar-error">
   <p>
 <strong>Note:</strong> when you create a new publication type,
the articles module will automatically use the templates
<em>user-display-[publicationtype].xt</em>
and <em>user-summary-[publicationtype].xt</em>.
If those templates do not exist when you try to preview or display a new article,
you'll get this warning :-)  Please place your own templates in themes/<em>yourtheme</em>/modules/articles . The templates will get the extension .xt there. </p>
</div>
<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;Sliding Window Filters : A Set-based Implementation</h1>
<p><strong>Author:</strong>&nbsp;Martin Moene</p>
<p>
<strong>Date:</strong> 11 March 2016 14:37:22 +00:00 or Fri, 11 March 2016 14:37:22 +00:00</p>
<p><strong>Summary:</strong>&nbsp;Omar Bashir presents an implementation using pre-sorted data to reduce CPU load.</p>
<p><strong>Body:</strong>&nbsp;<p>Sliding window filters are commonly used for noise reduction and data smoothing in applications ranging from algorithmic trading in financial systems to signal processing in digital media systems. Efficient implementations of these filters optimise overall performance of systems in which they operate.</p>

<p>A set-based implementation of a sliding window filter is discussed which optimises the performance of the most common of sliding window filters, the median filter. This implementation can easily be extended to use measures other then median for filtering.</p>

<h2>Motivation</h2>

<p>A median filter operates on a window over the data stream. This window is implemented using a queue of a specified size. When the queue is full, an input causes the earliest value in the queue to be ejected and discarded. The output of the filter is the median of the values in the queue (see figure 1).</p>

<table class="sidebartable">
	<tr>
		<td><img src="http://accu.org/content/images/journals/cvu28-1/Bashir/Bashir-01.png" /></td>
	</tr>
	<tr>
		<td class="title">Figure 1</td>
	</tr>
</table>

<p>An obvious implementation of the Median Calculator in figure 1 sorts the data in the queue and then calculates the median from the sorted collection. An alternative and efficient implementation maintains a sorted collection which is updated with every new insert. If the queue is full, this update involves removing the earliest value from both the queue and the sorted collection and then adding the new value into both. Updating a sorted collection is generally a computationally less expensive operation than sorting the complete list (i.e., ln(<em>n</em>) vs. <em>n</em> ln(<em>n</em>)). If the sorted collection is index based, calculation of median from it is a constant complexity operation. Figure 2 shows operation of a median filter using a sorted collection.</p>

<table class="sidebartable">
	<tr>
		<td><img src="http://accu.org/content/images/journals/cvu28-1/Bashir/Bashir-02.png" /></td>
	</tr>
	<tr>
		<td class="title">Figure 2</td>
	</tr>
</table>

<h2>Sorted collections</h2>

<p>Bashir <a href="#[1]">[1]</a> demonstrated the versatility of sorted collections of numerical data in the context of performance monitoring of computer networks. Analysis of network latencies and investigation of related incidents requires iterative derivation and analysis of several summaries at different levels of temporal granularities. Processing these statistical summaries from raw data in every iteration can be computationally intensive.</p>

<p>Bashir proposes preprocessing raw data to intermediate information at the lowest level of granularity. This intermediate information should have properties of richness, efficiency and mergeability. Richness is the ability to derive most statistical summaries from intermediate information without accessing the raw data. Efficiency requires calculation of statistical summaries to be equally or less computationally complex than deriving them from raw data. Finally mergeability is the ability to combine appropriate intermediate information at finer granularities to provide intermediate information at coarser granularities. These can then be processed efficiently to derive required summaries at those granularities.</p>

<p>Sorted numerical data exhibit all these properties. In the context of filtering based on measures of central tendency, richness and efficiency are desired properties while mergeability is less relevant as these filters generally operate at fixed granularities. </p>

<p>Using sorted numerical data also allows decomposing such filters into a preprocessor and a summary calculator. The preprocessor queues the input and maintains a sorted copy of the queued data. The summary calculator produces the desired measure of central tendency from the sorted collection. Thus, the output of the summary calculator is the filtered value. Decoupling the two allows developing filters based on different measures of central tendency using the same preprocessor implementation.</p>

<p>However, such preprocessors need to maintain a sorted copy of the window in addition to the window itself. This can be an issue for larger window sizes in applications that need to have a smaller memory footprint.</p>

<h2>A sliding window preprocessor implementation</h2>

<p>A Java implementation of a sliding window preprocessor is described here. This preprocessor uses an instance of the <code>ArrayBlockingQueue</code>, an implementation of the <code>Queue</code> interface, as the sliding window. A <code>TreeSet</code> instance is used as the sorted collection of elements in the sliding window. </p>

<p>A <code>TreeSet</code> does not allow recurring values while a sliding window algorithm accepts recurring values. To allow recurring values in a <code>TreeSet</code>, each value is encapsulated with a monotonically increasing integer that is incremented for every new value. <code>OrderedElement</code> class (Listing 1) encapsulates the order of arrival with the arriving value.  <code>OrderedElement</code>â€™s implementation of the <code>Comparable</code> interface compares the order of arrival for recurring values. This allows recurring values encapsulated in an <code>OrderedElement</code> instance to exist in a <code>TreeSet</code> in sorted order.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
public class OrderedElement&lt;T extends Comparable&lt;T&gt;&gt; implements Comparable&lt;OrderedElement&lt;T&gt;&gt;{ 

  private Integer order; 
  private T value; 

  public OrderedElement(Integer order, T value) { 
    this.order = order; 
    this.value = value; 
  }

  public Integer getOrder() { 
    return order; 
  }

  public T getValue() { 
    return value; 
  } 

  @Override 
  public int compareTo(OrderedElement&lt;T&gt; o) {
    int valueComp = value.compareTo(o.value);
    return (valueComp != 0) ? valueComp 
      : order.compareTo(o.order); 
  } 
  @Override 
  public boolean equals(Object o) { 
    return (null != o) &amp;&amp; 
      (o instanceof OrderedElement&lt;?&gt;) &amp;&amp; 
      ((this == o) ||
      (this.compareTo((OrderedElement&lt;T&gt;) o) 
       == 0)); 
  }
 ...
}
</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 1</td>
	</tr>
</table>

<p>Listing 2 shows the preprocessor, <code>SetOrderedWindow</code>. An instance of this class uses <code>queue</code>, an <code>ArrayBlockingQueue</code> instance, and <code>sorter</code>, a <code>TreeSet</code> instance. <code>SetOrderedWindow</code> maintains an integer, <code>sequence</code>, which is incremented every time the <code>insert</code> method is called to insert a value into the window.</p>

<table class="sidebartable">
	<tr>
		<td>
			<pre class="programlisting">
public class 
    SetOrderedWindow&lt;T extends Comparable&lt;T&gt;&gt; {
  private
    ArrayBlockingQueue&lt;OrderedElement&lt;T&gt;&gt; queue; 
  private Set&lt;OrderedElement&lt;T&gt;&gt; sorter;
  private int sequence;
  public SetOrderedWindow(int size) {
    this.sequence = 0;
    this.queue = new
     ArrayBlockingQueue&lt;OrderedElement&lt;T&gt;&gt;(size);
    this.sorter = new
     TreeSet&lt;OrderedElement&lt;T&gt;&gt;();
  }
  public int getCurrentSize() {
    return queue.size();
  }
  public int getCapacity() {
    return queue.size() +
      queue.remainingCapacity();
  }
  public void insert(T value) 
      throws InterruptedException {
    OrderedElement&lt;T&gt; element = 
      new OrderedElement&lt;&gt;(sequence, value);
    sequence++;
    if (queue.remainingCapacity() == 0) {
      OrderedElement&lt;T&gt; elementToRemove 
        = queue.poll();
      sorter.remove(elementToRemove);
    }
    queue.put(element);
    sorter.add(element);
  }
  public ArrayList&lt;T&gt; getElementsSorted() {
    ArrayList&lt;T&gt; reply = new ArrayList&lt;T&gt;();
    for (OrderedElement&lt;T&gt; element : sorter) {
      reply.add(element.getValue());
    }
    return reply;
  }
  public List&lt;T&gt; getElements() {
    List&lt;T&gt; reply = new ArrayList&lt;T&gt;();
    for (OrderedElement&lt;T&gt; element : queue) {
      reply.add(element.getValue());
    }
    return reply;
  }
}
			</pre>
		</td>
	</tr>
	<tr>
		<td class="title">Listing 2</td>
	</tr>
</table>

<p> If there is still capacity in the window, the input value is encapsulated with the sequence number in an instance of <code>OrderedElement</code> and inserted into both <code>queue</code> and <code>sorter</code>. If there is no capacity in the window, <code>SetOrderedWindow</code> removes the element at the head of the <code>queue</code> and also removes the same element from <code>sorter</code> before inserting the new element to both <code>queue</code> and <code>sorter</code> (Figure 3).</p>

<table class="sidebartable">
	<tr>
		<td><img src="http://accu.org/content/images/journals/cvu28-1/Bashir/Bashir-03.png" /></td>
	</tr>
	<tr>
		<td class="title">Figure 3</td>
	</tr>
</table>

<p>The <code>SetOrderedWindow</code> class contains two methods to retrieve the data its instance contains. The <code>getElements</code> method returns a <code>List</code> of all the values contained in <code>queue</code> in the order of arrival. <code>getElementsSorted</code> method returns the values contained in <code>sorter</code> in sorted order in an array which allows efficient index based access to the values. <code>getElementsSorted</code> is the most commonly used method for data access as the array it returns can be used efficiently to derive any measure of central tendency for filtering. </p>

<h2>Performance measurements</h2>

<p>The performance of the <code>SetOrderedWindow</code> class has been measured for different window sizes from 500 to 2500 elements.</p>

&lt;IMAGE xml:link=&quot;simple&quot; href=&quot;Bashir-3.gif&quot; show=&quot;embed&quot; actuate=&quot;auto&quot;/&gt;

<p>The performance of <code>SetOrderedWindow</code> is compared with the performance of sorting the data in <code>queue</code> using an <code>ArrayList</code> for every summary query. These test applications were compiled using JDK 8 and executed over Lubuntu 14.04 on a 1.6 GHz Intel Atom N270 processor with 1 GB RAM.</p>

<p>Figure 4 compares the performance when the windows are being filled from the beginning and they are not yet full. This means the elements are only being inserted in the <code>queue</code> objects. In the <code>SetOrderedWindow</code>, these elements are also being inserted in <code>sorter</code> to keep the data in sorted order. For each window size, the time measured is the time required to fill the entire window and for each entry calculating the median of the data in the window.</p>

<table class="sidebartable">
	<tr>
		<td><img src="http://accu.org/content/images/journals/cvu28-1/Bashir/Bashir-04.png" /></td>
	</tr>
	<tr>
		<td class="title">Figure 4</td>
	</tr>
</table>

<p>Figure 5 compares the performance when the windows are already full. Inserting an element in the windows requires removing the element at the head of <code>queue</code> objects before inserting the new element at the tail of <code>queue</code> objects. In the <code>SetOrderedWindow</code>, the element at the <code>queue</code> head is also removed from <code>sorter</code> and the element being inserted at the tail of <code>queue</code> is also inserted into <code>sorter</code>. For each window size, the time measured is the time required to replace the contents of the entire window and for each new entry calculating the median of the data in the window.</p>

<table class="sidebartable">
	<tr>
		<td><img src="http://accu.org/content/images/journals/cvu28-1/Bashir/Bashir-05.png" /></td>
	</tr>
	<tr>
		<td class="title">Figure 5</td>
	</tr>
</table>

<p>These results show that the sliding window filter using a set based sorter performs better than the filter that sorts window data for every query. Moreover, relative performance of the set based sliding window filter improves for larger window sizes.</p>

<p>Most applications will have window sizes less than 100 elements. The performance difference between the two approaches may not be significant for an individual insert and summary calculation. But as these applications are long running, any additional performance impact reduces the overall throughput. However, a sliding window implementation using a set based sorter consumes more memory to maintain the data it needs to process. Depending upon the specifics of the implementation, it can be twice as much memory. As long as this overhead is acceptable, such an implementation can improve overall application throughput.</p>

<h2>Conclusions</h2>

<p>Sliding window filters summarise data captured in a window sliding over the input data stream and the summarised value at the output is the filtered value. These filters are used for noise reduction and data smoothing in applications from diverse domains such as financial computations, signal processing and automatic systems control.</p>

<p>A common sliding window filter is the median filter where the filter calculates median over the window and uses that as the filtered output. Conventional median filter implementations sort the data in the window for each insert. As sorting is computationally expensive, this can reduce throughput of host systems.</p>

<p>An alternative implementation discussed here uses a set to keep a copy of the data in the sliding window in sorted order. It is shown here that maintenance of this set is computationally less expensive than resorting the window as it slides over the input data. This, however, can have the cost of doubling the memory footprint of the filter because the filter needs to maintain the values in the window in arrival and in sorted order. </p>

<p>Maintaining sorted data in the filter allows calculating a range of summaries efficiently. Furthermore, actual summary calculation can be decoupled from maintaining the window in arrival and sorted orders inside a preprocessor. Thus, the same preprocessor can then be used with filters that use measures of central tendency other than the median.</p>

<p>While this article discusses implementing a 1-D sliding window filter, an adaptation can be used to implement 2-D sliding window filters. These can be used to optimise image processing applications.</p>

<h2>References</h2>

<p class="bibliomixed"><a id="[1]"></a>[1]	Bashir, O. (1998), Management and Processing of Network Performance Information, PhD Thesis, Loughborough University</p>
</p>
<p><strong>Notes:</strong>&nbsp;</p>
<p><em>More fields may be available via dynamicdata ..</em></p>
</div>
</channel>
</rss>
