F.P

Reputation: 17831

Create (mathematical) function from set of predefined values

I want to create an Excel table that helps me estimate implementation times for the tasks I am given. To do so, I came up with four categories, and I rate each task from 1 to 10 in each of them.

Those are: Complexity of system (simple scripts or entire business systems), State of requirements (well defined or very soft), Knowledge about system (how much I know about the system and the code base) and Plan for implementation (do I know what to do, or do I have no plan and no idea where to start).

After rating a task in these categories, I want a resulting factor that indicates how expensive the task will be and how long it will likely take, as a very rough estimate that I can give my bosses.


What I thought about doing

I thought about creating a function that takes the ratings as inputs and returns a single number, for example:

|  a  |  b  |  c  |  d  | Result |
|  1  |  1  |  1  |  1  |  160   |
|  5  |  5  |  5  |  5  |  80    |
|  10 |  10 |  10 |  10 |  2     |

Given a, b, c and d, the function should reproduce the results above for the reference cases (min, avg, max) and of course return sensible results for any (float) values in between.

How can I go about doing this? I imagine this is some form of polynomial problem, but how can I actually create the function that creates these results?

I have tasks like this often, so it would be useful to have a pattern to follow whenever I need to create such a function for any number of parameters and results.

I tried using Wolfram Alpha's interpolating polynomial command for this, but the result is just a mess of extremely large fractions...

How can I create this function properly with reasonable results?


While writing this edit, I realized this may be better suited to Programmers.SE. If no one answers here, I will move the question there.

Upvotes: 0

Views: 570

Answers (1)

MvG

Reputation: 60908

You don't have enough data as it is. The simplest formula which takes into account all your four explanatory variables would be linear:

x0 + x1*a + x2*b + x3*c + x4*d

If you formulate a set of equations for this, you have three equations but five unknowns, which means that you don't have a unique solution. On the other hand, the data points which you did provide show that the relation between scores and time is not exactly linear: going from score 1 to 5 the result drops by 20 per point, but from 5 to 10 it drops by only 15.6 per point. So you might have to look at some family of functions which is even more complex, and therefore has even more parameters to tune. While it would be easy to tune parameters to match the input, that choice would be pretty arbitrary, and therefore without predictive power.
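To make the non-linearity concrete, here is a minimal sketch in Python (my choice of language, not something the question asks for). It relies on the observation that in all three sample rows the four scores are equal, so the linear model reduces to a straight line in that common score. The names `samples`, `slopes` and `is_linear` are only illustrative.

```python
# The three sample rows from the question as (common score, result).
# In each row a == b == c == d, so the linear model collapses to
# x0 + (x1 + x2 + x3 + x4) * t, a straight line in the common score t.
samples = [(1, 160), (5, 80), (10, 2)]


def slopes(points):
    """Return the slope between each pair of consecutive points."""
    return [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]


segment_slopes = slopes(samples)

# A straight line would have the same slope on every segment.
is_linear = len(set(segment_slopes)) == 1

print(segment_slopes, is_linear)
```

Since the two segments have different slopes, no choice of x0 to x4 can hit all three rows exactly, whatever the individual weights are.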

So while your system of four distinct scores might be useful in the long run, I would not use it yet. I'd suggest you collect some more data points, i.e. record how long a given task actually took you, and only use a model that fine-grained once you have enough data points to fit all of its parameters.

In the meantime, aggregate all four numbers into a single number, e.g. by taking their average. Then choose a formula for that single number, e.g. a quadratic one, where a now stands for the aggregated score:

182 - 22.9*a + 0.49*a*a

That's a fair fit for your requirements, and not too complex or messy. But the choice of function, i.e. a polynomial one, is still pretty arbitrary. So revisit that choice once you have more data. Note that this polynomial is almost the one Wolfram Alpha found for your data:

1642/9 - 344/15*a + 22/45*a*a

I only converted these rational numbers to decimal notation, which I rounded pretty early on since all of this is very rough in any case.
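As a rough sketch of how both polynomials can be reproduced and used, the Python snippet below fits the exact parabola through the three sample points with Newton's divided differences, and evaluates the rounded formula on the average of the four scores. The function names `quadratic_through` and `estimate` are my own, and averaging is just the aggregation suggested above, not the only possible one.

```python
from fractions import Fraction


def quadratic_through(points):
    """Exact coefficients (c0, c1, c2) of c0 + c1*x + c2*x*x through
    three points, computed with Newton's divided differences."""
    (x0, y0), (x1, y1), (x2, y2) = [
        (Fraction(x), Fraction(y)) for x, y in points
    ]
    d01 = (y1 - y0) / (x1 - x0)
    d12 = (y2 - y1) / (x2 - x1)
    d012 = (d12 - d01) / (x2 - x0)
    # Expand y0 + d01*(x - x0) + d012*(x - x0)*(x - x1) into powers of x.
    c2 = d012
    c1 = d01 - d012 * (x0 + x1)
    c0 = y0 - d01 * x0 + d012 * x0 * x1
    return c0, c1, c2


def estimate(a, b, c, d, coeffs=(182, -22.9, 0.49)):
    """Average the four scores, then apply the (rounded) quadratic."""
    avg = (a + b + c + d) / 4
    c0, c1, c2 = coeffs
    return c0 + c1 * avg + c2 * avg * avg


# Exact fit through the three rows of the question's table.
exact = quadratic_through([(1, 160), (5, 80), (10, 2)])
print(exact)

# Rounded formula on the three reference rows and on a mixed rating.
print(estimate(1, 1, 1, 1), estimate(5, 5, 5, 5), estimate(10, 10, 10, 10))
print(estimate(3, 7, 4, 6))
```

With the rounded coefficients the three reference rows come out within about half a unit of the requested values, which is well inside the precision such an estimate can claim. The same `quadratic_through` helper can be rerun whenever the three reference values change.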

On the whole, this question appears more suited to CrossValidated than to Programmers SE, in my opinion. But don't bother them unless you have sufficient data to actually fit a model.

Upvotes: 1
