Journal Articles
Browse in : |
All
> Journals
> CVu
> 281
(11)
All > Topics > Programming (877) Any of these categories - All of these categories |
Note: when you create a new publication type, the articles module will automatically use the templates user-display-[publicationtype].xt and user-summary-[publicationtype].xt. If those templates do not exist when you try to preview or display a new article, you'll get this warning :-) Please place your own templates in themes/yourtheme/modules/articles . The templates will get the extension .xt there.
Title: Sliding Window Filters : A Set-based Implementation
Author: Martin Moene
Date: 11 March 2016 14:37:22 +00:00 or Fri, 11 March 2016 14:37:22 +00:00
Summary: Omar Bashir presents an implementation using pre-sorted data to reduce CPU load.
Body:
Sliding window filters are commonly used for noise reduction and data smoothing in applications ranging from algorithmic trading in financial systems to signal processing in digital media systems. Efficient implementations of these filters optimise overall performance of systems in which they operate.
A set-based implementation of a sliding window filter is discussed which optimises the performance of the most common of sliding window filters, the median filter. This implementation can easily be extended to use measures other then median for filtering.
Motivation
A median filter operates on a window over the data stream. This window is implemented using a queue of a specified size. When the queue is full, an input causes the earliest value in the queue to be ejected and discarded. The output of the filter is the median of the values in the queue (see figure 1).
Figure 1 |
An obvious implementation of the Median Calculator in figure 1 sorts the data in the queue and then calculates the median from the sorted collection. An alternative and efficient implementation maintains a sorted collection which is updated with every new insert. If the queue is full, this update involves removing the earliest value from both the queue and the sorted collection and then adding the new value into both. Updating a sorted collection is generally a computationally less expensive operation than sorting the complete list (i.e., ln(n) vs. n ln(n)). If the sorted collection is index based, calculation of median from it is a constant complexity operation. Figure 2 shows operation of a median filter using a sorted collection.
Figure 2 |
Sorted collections
Bashir [1] demonstrated the versatility of sorted collections of numerical data in the context of performance monitoring of computer networks. Analysis of network latencies and investigation of related incidents requires iterative derivation and analysis of several summaries at different levels of temporal granularities. Processing these statistical summaries from raw data in every iteration can be computationally intensive.
Bashir proposes preprocessing raw data to intermediate information at the lowest level of granularity. This intermediate information should have properties of richness, efficiency and mergeability. Richness is the ability to derive most statistical summaries from intermediate information without accessing the raw data. Efficiency requires calculation of statistical summaries to be equally or less computationally complex than deriving them from raw data. Finally mergeability is the ability to combine appropriate intermediate information at finer granularities to provide intermediate information at coarser granularities. These can then be processed efficiently to derive required summaries at those granularities.
Sorted numerical data exhibit all these properties. In the context of filtering based on measures of central tendency, richness and efficiency are desired properties while mergeability is less relevant as these filters generally operate at fixed granularities.
Using sorted numerical data also allows decomposing such filters into a preprocessor and a summary calculator. The preprocessor queues the input and maintains a sorted copy of the queued data. The summary calculator produces the desired measure of central tendency from the sorted collection. Thus, the output of the summary calculator is the filtered value. Decoupling the two allows developing filters based on different measures of central tendency using the same preprocessor implementation.
However, such preprocessors need to maintain a sorted copy of the window in addition to the window itself. This can be an issue for larger window sizes in applications that need to have a smaller memory footprint.
A sliding window preprocessor implementation
A Java implementation of a sliding window preprocessor is described here. This preprocessor uses an instance of the ArrayBlockingQueue
, an implementation of the Queue
interface, as the sliding window. A TreeSet
instance is used as the sorted collection of elements in the sliding window.
A TreeSet
does not allow recurring values while a sliding window algorithm accepts recurring values. To allow recurring values in a TreeSet
, each value is encapsulated with a monotonically increasing integer that is incremented for every new value. OrderedElement
class (Listing 1) encapsulates the order of arrival with the arriving value. OrderedElement
’s implementation of the Comparable
interface compares the order of arrival for recurring values. This allows recurring values encapsulated in an OrderedElement
instance to exist in a TreeSet
in sorted order.
public class OrderedElement<T extends Comparable<T>> implements Comparable<OrderedElement<T>>{ private Integer order; private T value; public OrderedElement(Integer order, T value) { this.order = order; this.value = value; } public Integer getOrder() { return order; } public T getValue() { return value; } @Override public int compareTo(OrderedElement<T> o) { int valueComp = value.compareTo(o.value); return (valueComp != 0) ? valueComp : order.compareTo(o.order); } @Override public boolean equals(Object o) { return (null != o) && (o instanceof OrderedElement<?>) && ((this == o) || (this.compareTo((OrderedElement<T>) o) == 0)); } ... } |
Listing 1 |
Listing 2 shows the preprocessor, SetOrderedWindow
. An instance of this class uses queue
, an ArrayBlockingQueue
instance, and sorter
, a TreeSet
instance. SetOrderedWindow
maintains an integer, sequence
, which is incremented every time the insert
method is called to insert a value into the window.
public class SetOrderedWindow<T extends Comparable<T>> { private ArrayBlockingQueue<OrderedElement<T>> queue; private Set<OrderedElement<T>> sorter; private int sequence; public SetOrderedWindow(int size) { this.sequence = 0; this.queue = new ArrayBlockingQueue<OrderedElement<T>>(size); this.sorter = new TreeSet<OrderedElement<T>>(); } public int getCurrentSize() { return queue.size(); } public int getCapacity() { return queue.size() + queue.remainingCapacity(); } public void insert(T value) throws InterruptedException { OrderedElement<T> element = new OrderedElement<>(sequence, value); sequence++; if (queue.remainingCapacity() == 0) { OrderedElement<T> elementToRemove = queue.poll(); sorter.remove(elementToRemove); } queue.put(element); sorter.add(element); } public ArrayList<T> getElementsSorted() { ArrayList<T> reply = new ArrayList<T>(); for (OrderedElement<T> element : sorter) { reply.add(element.getValue()); } return reply; } public List<T> getElements() { List<T> reply = new ArrayList<T>(); for (OrderedElement<T> element : queue) { reply.add(element.getValue()); } return reply; } } |
Listing 2 |
If there is still capacity in the window, the input value is encapsulated with the sequence number in an instance of OrderedElement
and inserted into both queue
and sorter
. If there is no capacity in the window, SetOrderedWindow
removes the element at the head of the queue
and also removes the same element from sorter
before inserting the new element to both queue
and sorter
(Figure 3).
Figure 3 |
The SetOrderedWindow
class contains two methods to retrieve the data its instance contains. The getElements
method returns a List
of all the values contained in queue
in the order of arrival. getElementsSorted
method returns the values contained in sorter
in sorted order in an array which allows efficient index based access to the values. getElementsSorted
is the most commonly used method for data access as the array it returns can be used efficiently to derive any measure of central tendency for filtering.
Performance measurements
The performance of the SetOrderedWindow
class has been measured for different window sizes from 500 to 2500 elements.
The performance of SetOrderedWindow
is compared with the performance of sorting the data in queue
using an ArrayList
for every summary query. These test applications were compiled using JDK 8 and executed over Lubuntu 14.04 on a 1.6 GHz Intel Atom N270 processor with 1 GB RAM.
Figure 4 compares the performance when the windows are being filled from the beginning and they are not yet full. This means the elements are only being inserted in the queue
objects. In the SetOrderedWindow
, these elements are also being inserted in sorter
to keep the data in sorted order. For each window size, the time measured is the time required to fill the entire window and for each entry calculating the median of the data in the window.
Figure 4 |
Figure 5 compares the performance when the windows are already full. Inserting an element in the windows requires removing the element at the head of queue
objects before inserting the new element at the tail of queue
objects. In the SetOrderedWindow
, the element at the queue
head is also removed from sorter
and the element being inserted at the tail of queue
is also inserted into sorter
. For each window size, the time measured is the time required to replace the contents of the entire window and for each new entry calculating the median of the data in the window.
Figure 5 |
These results show that the sliding window filter using a set based sorter performs better than the filter that sorts window data for every query. Moreover, relative performance of the set based sliding window filter improves for larger window sizes.
Most applications will have window sizes less than 100 elements. The performance difference between the two approaches may not be significant for an individual insert and summary calculation. But as these applications are long running, any additional performance impact reduces the overall throughput. However, a sliding window implementation using a set based sorter consumes more memory to maintain the data it needs to process. Depending upon the specifics of the implementation, it can be twice as much memory. As long as this overhead is acceptable, such an implementation can improve overall application throughput.
Conclusions
Sliding window filters summarise data captured in a window sliding over the input data stream and the summarised value at the output is the filtered value. These filters are used for noise reduction and data smoothing in applications from diverse domains such as financial computations, signal processing and automatic systems control.
A common sliding window filter is the median filter where the filter calculates median over the window and uses that as the filtered output. Conventional median filter implementations sort the data in the window for each insert. As sorting is computationally expensive, this can reduce throughput of host systems.
An alternative implementation discussed here uses a set to keep a copy of the data in the sliding window in sorted order. It is shown here that maintenance of this set is computationally less expensive than resorting the window as it slides over the input data. This, however, can have the cost of doubling the memory footprint of the filter because the filter needs to maintain the values in the window in arrival and in sorted order.
Maintaining sorted data in the filter allows calculating a range of summaries efficiently. Furthermore, actual summary calculation can be decoupled from maintaining the window in arrival and sorted orders inside a preprocessor. Thus, the same preprocessor can then be used with filters that use measures of central tendency other than the median.
While this article discusses implementing a 1-D sliding window filter, an adaptation can be used to implement 2-D sliding window filters. These can be used to optimise image processing applications.
References
[1] Bashir, O. (1998), Management and Processing of Network Performance Information, PhD Thesis, Loughborough University
Notes:
More fields may be available via dynamicdata ..