Reputation: 335
I have some results that are stored in a multidimensional array:
double[,] results;
Each column is a time series of prices for a specific variable (e.g. "house", "car", "electricity"). I would like to calculate some statistics for each variable in order to summarize the results in a more compact form. For example, I was looking at the percentile function in Math.Net.
I would like to calculate the 90th percentile of the prices for each column (so for each variable).
I am trying the following, since the function doesn't work on multidimensional arrays (so I cannot pass results[,] as an argument to the percentile function):
for (int i = 0, i <= results.GetLength(2), i++)
{
myList.Add(MathNet.Numerics.Statistics.Statistics.Percentile(results[,i], 90));
}
So I want to loop through the columns of my results[,] and calculate the 90th percentile, adding the result to a list. But this doesn't work because of wrong syntax in results[, i]. Unfortunately, there is no other (clearer) error message.
Can you help me understand where the problem is and if there's a better way to calculate a percentile by column?
Upvotes: 1
Views: 2724
Reputation: 116526
Percentile is an extension method with the following signature:
public static double Percentile(this IEnumerable<double> data, int p)
So you can use Linq to transform your 2d array into an appropriate sequence to pass to Percentile.
However, results.GetLength(2) will throw an exception because the dimension argument of GetLength() is zero-based. You probably meant results.GetLength(1). Assuming that's what you meant, you can do:
var query = Enumerable.Range(0, results.GetLength(1))
    .Select(iCol => Enumerable.Range(0, results.GetLength(0))
        .Select(iRow => results[iRow, iCol])
        .Percentile(90));
You can have Linq make the list for you,
var myList = query.ToList();
or add it to a pre-existing list:
myList.AddRange(query);
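To make the indexing concrete, here is a minimal sketch that needs only the base class library (no Math.NET). It shows which GetLength() dimension is which and how the inner Range/Select walks down a single column. The Column helper, the ColumnDemo class, and the sample numbers are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnDemo
{
    // Hypothetical helper: yields the values of column iCol, top to bottom.
    public static IEnumerable<double> Column(double[,] array, int iCol)
    {
        return Enumerable.Range(0, array.GetLength(0))   // row indices 0..rows-1
                         .Select(iRow => array[iRow, iCol]);
    }

    public static void Main()
    {
        // 3 rows (time steps) x 2 columns (variables); made-up numbers.
        double[,] results =
        {
            { 1.0, 2.0 },
            { 3.0, 4.0 },
            { 5.0, 6.0 },
        };

        Console.WriteLine(results.GetLength(0)); // 3 (rows)
        Console.WriteLine(results.GetLength(1)); // 2 (columns)
        Console.WriteLine(string.Join(", ", Column(results, 1))); // 2, 4, 6
    }
}
```

Any method that accepts an IEnumerable<double>, Percentile included, can consume the sequence returned by Column directly.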
update
To filter NaN values use double.IsNaN:
var query = Enumerable.Range(0, results.GetLength(1))
    .Select(iCol => Enumerable.Range(0, results.GetLength(0))
        .Select(iRow => results[iRow, iCol])
        .Where(d => !double.IsNaN(d))
        .Percentile(90));
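The reason for using double.IsNaN rather than a plain comparison is that NaN never compares equal to anything, itself included. The following standalone sketch illustrates this; the NaNFilterDemo class, the WithoutNaN helper, and the sample values are illustrative assumptions only.

```csharp
using System;
using System.Linq;

public static class NaNFilterDemo
{
    // Hypothetical helper: drops NaN entries; remaining values keep their order.
    public static double[] WithoutNaN(double[] values)
    {
        return values.Where(d => !double.IsNaN(d)).ToArray();
    }

    public static void Main()
    {
        double[] column = { 1.5, double.NaN, 2.5 };

        // "d != double.NaN" is true for every d (even NaN itself),
        // so this comparison-based filter removes nothing.
        Console.WriteLine(column.Where(d => d != double.NaN).Count()); // 3

        // double.IsNaN identifies the NaN entry correctly.
        Console.WriteLine(WithoutNaN(column).Length); // 2
    }
}
```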
update
If one extracts a couple of array extensions:
public static class ArrayExtensions
{
    // Enumerates the columns of a 2D array; each column is a lazy sequence of its values.
    public static IEnumerable<IEnumerable<T>> Columns<T>(this T[,] array)
    {
        if (array == null)
            throw new ArgumentNullException(nameof(array));
        return Enumerable.Range(0, array.GetLength(1))
            .Select(iCol => Enumerable.Range(0, array.GetLength(0))
                .Select(iRow => array[iRow, iCol]));
    }

    // Enumerates the rows of a 2D array; each row is a lazy sequence of its values.
    public static IEnumerable<IEnumerable<T>> Rows<T>(this T[,] array)
    {
        if (array == null)
            throw new ArgumentNullException(nameof(array));
        return Enumerable.Range(0, array.GetLength(0))
            .Select(iRow => Enumerable.Range(0, array.GetLength(1))
                .Select(iCol => array[iRow, iCol]));
    }
}
Then the query becomes:
var query = results.Columns().Select(col => col.Where(d => !double.IsNaN(d)).Percentile(90));
which seems much clearer.
Upvotes: 2