Jack

Reputation: 317

Averaging many curves with different x and y values

I have several curves that contain many data points. The x-axis is time and let's say I have n curves with data points corresponding to times on the x-axis.

Is there a way to get an "average" of the n curves, despite the fact that the data points are located at different x-points?

I was thinking maybe something like using a histogram to bin the values, but I am not sure which code to start with that could accomplish something like this.

Can Excel or MATLAB do this?

I would also like to plot the standard deviation of the averaged curve.

One concern: the x-values are not uniformly distributed. There are many more values close to t=0, while at t=5 (for example) the data points are much sparser.

Another concern: what happens if two values fall within one bin? I assume I would need to average these values before calculating the averaged curve.

I hope this conveys what I would like to do.

Any ideas on what code I could use (MATLAB, EXCEL etc) to accomplish my goal?

Upvotes: 2

Views: 7217

Answers (1)

b3.

Reputation: 7175

Since your series are not uniformly sampled, interpolating onto a common set of times before computing the mean is one way to avoid biasing the result towards times where you have more frequent samples. Note that interpolation will likely reduce the range of your values, because the interpolated points are unlikely to fall exactly at the times of your measured points, so measured peaks and troughs tend to be smoothed over. This has a greater effect on the extreme statistics (e.g. the 5th and 95th percentiles) than on the mean. If you plan on going this route, you'll need the interp1 and mean functions.
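To make the interpolation route concrete, here is a minimal sketch. The data, the cell arrays t and x, the 50-point grid and the variable names are all made up for illustration; it also assumes each curve has strictly distinct sample times, since interp1 requires that (repeated times would need to be averaged first).

```matlab
% Hypothetical example data: three curves sampled at different times.
t = {[0 0.1 0.3 1 5], [0 0.2 0.5 2 4.5], [0.05 0.4 1.5 3 5]};
x = {[1 1.2 1.5 2 3], [0.9 1.3 1.6 2.4 2.8], [1.1 1.4 2.1 2.6 3.1]};

% Common grid limited to the time span covered by every curve,
% so that interp1 never has to extrapolate (which would give NaN).
tMin  = max(cellfun(@min, t));
tMax  = min(cellfun(@max, t));
tGrid = linspace(tMin, tMax, 50);

% Linearly interpolate each curve onto the common grid, one row per curve.
xGrid = zeros(numel(t), numel(tGrid));
for k = 1:numel(t)
    xGrid(k, :) = interp1(t{k}, x{k}, tGrid, 'linear');
end

% Point-wise mean and standard deviation across the curves.
meanCurve = mean(xGrid, 1);
stdCurve  = std(xGrid, 0, 1);
```

The averaged curve and a one-standard-deviation band can then be drawn with something like plot(tGrid, meanCurve, tGrid, meanCurve + stdCurve, '--', tGrid, meanCurve - stdCurve, '--'). Where the original samples are sparse (e.g. near t=5), the interpolated values are only as good as the straight lines between neighbouring measurements.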

An alternative is to do a weighted mean. This way you avoid truncating the range of your measured values. Assuming x is a vector of measured values and t is a vector of measurement times in seconds from some reference time then you can compute the weighted mean by:

timeStep = diff(t);  % length of the interval following each sample
weightedMean = sum(timeStep .* x(1:end-1)) / sum(timeStep);  % sum the weighted samples to get a scalar
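As a quick check of the time-weighted mean, here is a small sketch with invented numbers in which the samples cluster near t=0. Note that the weighted samples are summed, so the result is a single number; the values of t and x are assumptions for illustration only.

```matlab
% Hypothetical data: dense sampling near t = 0, sparse later on.
t = [0 0.1 0.2 0.3 1 5];
x = [1 1   1   1   2 3];

% Weight each sample by the length of the interval that follows it.
timeStep     = diff(t);
weightedMean = sum(timeStep .* x(1:end-1)) / sum(timeStep);  % about 1.8

% Unweighted mean for comparison: pulled towards the densely sampled values.
plainMean = mean(x);  % 1.5
```

With this weighting each value is held constant until the next sample, so the last sample x(end) receives no weight. If that matters for your data, trapz(t, x) / (t(end) - t(1)) is one alternative that averages neighbouring samples over each interval, although it is a different estimator from the one above.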

As mentioned in the comments above, a sample of your data would help a lot in suggesting the appropriate method for calculating the "average".

Upvotes: 1
