Reputation: 2158
I am currently using Fit.MultiDim from Math.NET to calculate an estimated value for a data set that has 2 dimensions like so:
using MathNet.Numerics;
using System.Collections.Generic;
using System.Linq;

namespace Project
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var xPoints = new List<List<double>>
            {
                new List<double> {2000, 100},
                new List<double> {2002, 60},
                new List<double> {2004, 50},
                new List<double> {2006, 30},
            };
            var yPoints = new List<double> { 50, 60, 70, 80 };
            var multiFit = Fit.MultiDim(
                xPoints.Select(item => item.ToArray()).ToArray(),
                yPoints.ToArray(),
                intercept: true
            );
            // multiFit = [-9949.999999999998,4.999999999999999,0.0]
            var inputDimension1 = 2003;
            var inputDimension2 = 55;
            var expectedY = multiFit[0] + inputDimension1 * multiFit[1] + inputDimension2 * multiFit[2];
            // expectedY = 65
        }
    }
}
How do I update this logic so that I can specify which percentile the value is calculated for? For example, I want to get the values for the 25th and 75th percentiles.
I know that the library has Percentile and Quantile methods, but I don't have any background in statistics, so I have no idea how to apply them to my use case.
Upvotes: 1
Views: 221
Reputation: 41
I can't comment yet, so this isn't an answer for how to implement this in C#, but I have some advice.
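What you're looking for is quantile regression. "Quantile" is just another way of saying "percentile": the 25th percentile is the 0.25 quantile. Fit.MultiDim does an ordinary least-squares fit, which estimates the average y for a given input, whereas quantile regression estimates a chosen percentile of y for that input. The Percentile and Quantile methods you found summarize a single list of numbers, so they can't take your two input dimensions into account. As far as I know, Math.NET currently doesn't have an implementation of quantile regression.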
If you're not glued to C#, Roger Koenker (the person who developed the statistical method) has made an R package called "quantreg" that you can use. Additionally, he wrote a book about the subject which is freely available here. Fair warning, though, the book is pretty math-heavy, so I'd recommend just switching over to R if you really need to perform this quantile/percentile analysis (instead of the alternative, which is implementing it yourself).
A quick example using some R code and the "diamonds" dataset:
#install.packages("quantreg")  # Run this line once, then comment it back out
#install.packages("tidyverse") # Run this line once, then comment it back out
library(quantreg)
library(tidyverse) # also provides the diamonds dataset (via ggplot2)

# A normal linear regression: predict the carat of the diamond
# using the depth, table, and price variables with no interaction effects
lm_diamond_model <- lm(carat ~ depth + table + price, data = diamonds)
summary(lm_diamond_model) # view the resulting details

# The same problem, except now we model the 50th percentile/quantile
median_reg_diamond_model <- rq(carat ~ depth + table + price,
                               tau = 0.5,
                               data = diamonds)
summary(median_reg_diamond_model) # view the resulting details

# The same problem, except now we model the 10th, 20th, ... , 90th percentiles
range_diamond_models <- rq(carat ~ depth + table + price,
                           tau = seq(from = 0.1, to = 0.9, by = 0.1),
                           data = diamonds)
summary(range_diamond_models) # view the resulting details
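If you do end up implementing it yourself, here is a rough sketch of the underlying idea in Python. It is not C#, and it is not how quantreg works internally. Quantile regression chooses the coefficients that minimize the "pinball" (check) loss, which charges tau per unit when the prediction is too low and 1 - tau per unit when it is too high. With one input variable and an intercept, at least one best-fitting line passes through two of the data points, so for a tiny data set you can simply try every pair. The function names and the data below are made up for illustration, and the sketch assumes a single input rather than the two in the question.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Check loss for one residual (actual minus predicted) at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def fit_quantile_line(xs, ys, tau):
    """Return (intercept, slope) minimizing the total pinball loss.

    Brute force over all pairs of points; only sensible for small data sets.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # two points with the same x do not define a usable line
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = sum(
            pinball_loss(y - (intercept + slope * x), tau)
            for x, y in zip(xs, ys)
        )
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with different x values")
    return best[1], best[2]


# Made-up data: two y values per x, with the spread growing as x grows
xs = [0, 0, 1, 1, 2, 2]
ys = [0, 2, 1, 5, 2, 8]

low_intercept, low_slope = fit_quantile_line(xs, ys, 0.25)    # 0.0 and 1.0
high_intercept, high_slope = fit_quantile_line(xs, ys, 0.75)  # 2.0 and 3.0

x_new = 1.5
print("25th percentile estimate:", low_intercept + low_slope * x_new)    # 1.5
print("75th percentile estimate:", high_intercept + high_slope * x_new)  # 6.5
```

The same idea extends to two inputs plus an intercept by trying every triple of points instead of every pair, but the number of combinations grows quickly. Real implementations solve the same minimization as a linear program, which is why I'd still reach for an existing package if you can.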
Upvotes: 0