John F. N. Salik

Peak Finding in Noisy Data using Windowed Curve-Fitting – Part One

In Mathematics on November 1, 2013 at 8:10 pm

This is the first of a two-part article that focuses on a method for detecting peaks in noisy data.  Part One will discuss the signal model as well as the least-squares method of quadratic curve fitting.  Part Two will discuss the detection criteria derived from the quadratic coefficients, general applications for peak finding, as well as short videos of this particular algorithm working on synthesized test data.  This sort of algorithm is not unique, and it has many applications in engineering and science.

Parabola-Fitting to Data with a Noise Component

Experimental data can be particularly noisy in some cases, so peak finding cannot rely on simple slope-detection methods. Instead, a local quadratic curve fit can be used to estimate the location of a peak in noisy data. This method determines the parameters of a quadratic that best represents the local data while reducing the effect of the noise contribution. Consider the case where we have i=1 \ldots N data points (x_i,y_i), where x_i is an independent variable and y_i has an additive noise component. That is, for each data point we have:

 y_i= f(x_i,a_1,a_2,\ldots,a_k) + n_i

Where n_i is a noise sample superimposed over an underlying model f(\cdot) (with parameters a_1,a_2,\ldots,a_k), and y_i is the resulting noisy sample. For the purposes of peak finding, we will define f(\cdot) as a second-order polynomial:

f(x_i,a_1,a_2,a_3) = a_1 + a_2 x_i + a_3 x_i^2
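As a concrete illustration of this signal model, the short Python/NumPy sketch below generates synthetic noisy samples from a quadratic. The coefficient values, noise level, and sample grid are arbitrary choices for demonstration and are not part of the method itself.

    import numpy as np

    # Signal model: y_i = a1 + a2*x_i + a3*x_i^2 + n_i
    # (coefficients and noise level below are illustrative choices)
    rng = np.random.default_rng(0)
    a1, a2, a3 = 1.0, 4.0, -2.0             # underlying model parameters
    x = np.linspace(0.0, 2.0, 101)          # independent variable x_i
    noise = rng.normal(0.0, 0.1, x.size)    # additive noise samples n_i
    y = a1 + a2 * x + a3 * x**2 + noise     # observed noisy samples y_i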

A system of equations summarizes the problem so that a_1, a_2, and a_3 can be optimally determined:

\displaystyle\left[\begin{array}{ccc}1 & x_1 & x_1^2\\1 & x_2 & x_2^2\\\vdots & \vdots & \vdots\\1 & x_n & x_n^2\\\end{array}\right]\left[\begin{array}{c}a_1\\a_2\\a_3\\\end{array}\right] =\left[\begin{array}{c}y_1\\y_2\\\vdots\\y_n\\\end{array}\right]

We can express this as \mathbf{X} \mathbf{a} = \mathbf{y} and find either an exact solution for n=3 or, for n>3, a least-squares approximation to the overdetermined system. That is, we wish to find \min\limits_{\mathbf{a}}\left\| \mathbf{X} \mathbf{a} - \mathbf{y}\right\|. Numeric solvers such as Matlab or Octave provide a direct mechanism for determining an optimal \mathbf{a} in the least-squares sense (see the backslash operator in the Matlab/Octave documentation; use a = X \backslash y). In the absence of these tools, the normal equations provide a least-squares solution:

\mathbf{a} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}

Numerical solvers or libraries that provide efficient matrix inversion (or linear-system solution) routines make this form straightforward to apply.
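For readers who want a concrete reference, here is a minimal Python/NumPy sketch (an assumption of this write-up, not part of the original derivation) that fits the quadratic in the least-squares sense; np.linalg.lstsq plays the same role as the Matlab/Octave backslash operator.

    import numpy as np

    def fit_quadratic_lstsq(x, y):
        """Least-squares fit of y ~ a1 + a2*x + a3*x^2; returns [a1, a2, a3]."""
        X = np.column_stack([np.ones_like(x), x, x**2])  # design matrix X
        a, *_ = np.linalg.lstsq(X, y, rcond=None)        # minimizes ||X a - y||
        return a

    # Equivalent normal-equation form, a = (X^T X)^{-1} X^T y:
    # a = np.linalg.solve(X.T @ X, X.T @ y)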

Another way of obtaining \mathbf{a} is to minimize the squared error between the model and the observed data directly.  Note that in this derivation the coefficients are relabelled so that a_1 multiplies x_i^2 and a_3 is the constant term.  The sum of the squared errors for a data set of n data points is:

\displaystyle\epsilon^2= \sum_{i=1}^n[a_1 x_i^2 +a_2 x_i+a_3-y_i]^2

As a condition for \epsilon^2 to be a minimum, its partial derivative with respect to each model parameter must vanish simultaneously:

\displaystyle\frac{\partial\epsilon^2}{\partial a_1}=0, \quad \displaystyle\frac{\partial\epsilon^2}{\partial a_2}=0, \quad \displaystyle\frac{\partial\epsilon^2}{\partial a_3}=0

Evaluated explicitly, we have a system of three equations in three unknowns:

\displaystyle\frac{\partial\epsilon^2}{\partial a_1}=2 \sum_{i=1}^n(a_1 x_i^2 + a_2 x_i +a_3 -y_i) x_i^2 = 0
\displaystyle\frac{\partial\epsilon^2}{\partial a_2}=2 \sum_{i=1}^n(a_1 x_i^2 + a_2 x_i +a_3 -y_i) x_i = 0
\displaystyle\frac{\partial\epsilon^2}{\partial a_3}=2 a_3 n + 2 \sum_{i=1}^n(a_1 x_i^2 + a_2 x_i - y_i) = 0

Applying Cramer’s rule after some manipulation, we can directly obtain a_1, a_2, and a_3:

\displaystyle a_1=\frac{-n s_{30} s_{11} +n s_{20}s_{21}+s_{10}s_{20}s_{11}-s_{21}s_{10}^2+s_{30}s_{10}s_{01}-s_{20}^2s_{01}}{n s_{40}s_{20}-ns_{30}^2-s_{20}^3+2 s_{30}s_{20}s_{10}-s_{40}s_{10}^2}

\displaystyle a_2=\frac{s_{30}s_{20}s_{01}-s_{20}^2s_{11}-n s_{30}s_{21}+n s_{40}s_{11} + s_{10}s_{20}s_{21}-s_{40}s_{10}s_{01}}{n s_{40}s_{20}-ns_{30}^2-s_{20}^3+2 s_{30}s_{20}s_{10}-s_{40}s_{10}^2}

\displaystyle a_3=\frac{s_{20}s_{30}s_{11}-s_{20}^2s_{21}+s_{10}s_{30}s_{21}-s_{10}s_{40}s_{11}+s_{01}s_{40}s_{20}-s_{01}s_{30}^2}{n s_{40}s_{20}-ns_{30}^2-s_{20}^3+2 s_{30}s_{20}s_{10}-s_{40}s_{10}^2}

Where the following substitutions are to be made:

\displaystyle s_{10} = \sum_{i=1}^n x_i, \displaystyle s_{20} = \sum_{i=1}^n x_i^2, \displaystyle s_{30} = \sum_{i=1}^n x_i^3

\displaystyle s_{40} = \sum_{i=1}^n x_i^4, \displaystyle s_{11} = \sum_{i=1}^n x_i y_i, \displaystyle s_{21} = \sum_{i=1}^n x_i^2 y_i, \displaystyle s_{01} = \sum_{i=1}^n y_i
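The closed-form expressions above translate directly into code. The following sketch (plain Python; the function name quad_coeffs_from_sums is a hypothetical choice of this write-up) evaluates a_1, a_2, and a_3 from the seven sums, keeping this derivation's convention in which a_1 multiplies x^2.

    def quad_coeffs_from_sums(n, s10, s20, s30, s40, s11, s21, s01):
        """Coefficients of y ~ a1*x^2 + a2*x + a3, transcribed from the
        Cramer's-rule expressions above."""
        den = n*s40*s20 - n*s30**2 - s20**3 + 2*s30*s20*s10 - s40*s10**2
        a1 = (-n*s30*s11 + n*s20*s21 + s10*s20*s11 - s21*s10**2
              + s30*s10*s01 - s20**2*s01) / den
        a2 = (s30*s20*s01 - s20**2*s11 - n*s30*s21 + n*s40*s11
              + s10*s20*s21 - s40*s10*s01) / den
        a3 = (s20*s30*s11 - s20**2*s21 + s10*s30*s21 - s10*s40*s11
              + s01*s40*s20 - s01*s30**2) / den
        return a1, a2, a3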

Here, the problem of curve fitting is reduced to an algebraic operation. This solution is important because peak finding in noisy data employs curve fitting within a sliding window of data. The previous method used matrix operations, which are computationally efficient for small windows but yield diminishing returns as the window grows. Working with sums is different because the sums only need to be computed once: as the window shifts, each sum is adjusted for the shift. That is, to each sum a value is added to account for the new data point entering the window, and a value is subtracted for the data point that falls out of scope. This is a particularly attractive method for microprocessor systems, since the computational complexity is low and, once the sums are initialized, each shift has constant complexity (\mathrm{O}(1)).

Consider a sliding window of width n within a data set of length N. If we take the first element in the window to be at position k, then we can express the sums as follows:

\displaystyle s_{10} = \sum_{i=k}^{n+k-1} x_i, \quad s_{20} = \sum_{i=k}^{n+k-1} x_i^2, \quad s_{30} = \sum_{i=k}^{n+k-1} x_i^3,

\displaystyle s_{40} = \sum_{i=k}^{n+k-1} x_i^4, \quad s_{11} = \sum_{i=k}^{n+k-1} x_i y_i, \quad s_{21} = \sum_{i=k}^{n+k-1} x_i^2 y_i, \quad s_{01} = \sum_{i=k}^{n+k-1} y_i

Initialization occurs at k=1, where all sums are computed directly. For each subsequent position k \geq 2, we simply adjust the sums for the shift in data:

s_{10} = s_{10} - x_{k-1} + x_{k+n-1}

s_{20} = s_{20} - x_{k-1}^2 + x_{k+n-1}^2

s_{30} = s_{30} - x_{k-1}^3 + x_{k+n-1}^3

s_{40} = s_{40} - x_{k-1}^4 + x_{k+n-1}^4

s_{11} = s_{11} - x_{k-1}y_{k-1} + x_{k+n-1}y_{k+n-1}

s_{21} = s_{21} - x_{k-1}^2y_{k-1} + x_{k+n-1}^2y_{k+n-1}

s_{01} = s_{01} - y_{k-1} + y_{k+n-1}

Once the sums are updated, \mathbf{a} can be determined. This iteration proceeds for k=2\ldots N-n+1. The minimum window size is n=3, but the window should be chosen wide enough to span the narrowest feature we wish to detect.
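Putting the pieces together, a minimal sketch of the sliding-window scan might look like the following. It assumes the quad_coeffs_from_sums helper sketched above, uses zero-based indexing, and simply collects the fitted coefficients for each window position; the peak-decision step itself is deferred to Part Two.

    def sliding_quadratic_fits(x, y, n):
        """Fit a quadratic to every length-n window, updating each sum in O(1) per shift."""
        N = len(x)
        # Initialization (k = 1 in the text, index 0 here): sums over the first window.
        xw, yw = x[:n], y[:n]
        s10 = sum(xw)
        s20 = sum(v**2 for v in xw)
        s30 = sum(v**3 for v in xw)
        s40 = sum(v**4 for v in xw)
        s01 = sum(yw)
        s11 = sum(v * w for v, w in zip(xw, yw))
        s21 = sum(v**2 * w for v, w in zip(xw, yw))
        fits = [quad_coeffs_from_sums(n, s10, s20, s30, s40, s11, s21, s01)]
        # Slide the window: drop the sample leaving on the left, add the one entering on the right.
        for k in range(1, N - n + 1):
            xo, yo = x[k - 1], y[k - 1]          # sample leaving the window
            xn, yn = x[k + n - 1], y[k + n - 1]  # sample entering the window
            s10 += xn - xo
            s20 += xn**2 - xo**2
            s30 += xn**3 - xo**3
            s40 += xn**4 - xo**4
            s01 += yn - yo
            s11 += xn * yn - xo * yo
            s21 += xn**2 * yn - xo**2 * yo
            fits.append(quad_coeffs_from_sums(n, s10, s20, s30, s40, s11, s21, s01))
        return fits  # one (a1, a2, a3) triple per window position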

In the next part of this article, the peak detection criteria will be discussed along with methods for resolving the detection of multiple peaks in close proximity.  Different types of peaks will be discussed as well as possible applications for this algorithm or others of this sort.
