Keno
Keno

Reputation: 143

Solving simultaneous equations (>2) without conversion to matrices in R or Python

I have a set of 4 simultaneous equations:

0.059z = x
0.06w = y
z+w = 8093
x+y = 422

All the solutions I've found so far seem to be for equations that have all the variables present in each equation, then convert to matrices and use the solve function.

Is there an easier way to do this in R or Python using the equations in their original form?

Also, how can I ensure that only positive numbers are returned in the solution?

Hope this makes sense...many thanks for the help

Upvotes: 0

Views: 220

Answers (3)

Sam Mason
Sam Mason

Reputation: 16204

there are a few things going on here. first as CDJB notes: if there were any positive solutions then sympy would find them. I searched for those numbers and found this paper which suggests you should be using 7088 instead of 8093. we can do a quick sanity check:

def pct(value):
    return f"{value:.1%}"

print(pct(422 / 8093))  # ~5.2%
print(pct(422 / 7088))  # ~6.0%

confirming that you're going to struggle averaging ~5.9% and ~6.0% towards ~5.2%, and explaining the negative solutions in the other answers. further, these are presumably counts so all your variables also need to be whole numbers.

once this correct denominator is used, I'd comment that there are many solutions (11645 by my count) e.g:

cases = [1, 421]
pop = [17, 7071]

rates = [pct(c / p) for c, p in zip(cases, pop)]

gives the appropriate output, as does:

cases = [2, 420]
pop = [34, 7054]

this is because the data was rounded to two decimal places. you probably also don't want to use either of the above, they're just the first two valid solutions I got.

we can define a Python function to enumerate all solutions:

from math import floor, ceil

def solutions(pop, cases, rate1, rate2, err):
    target = (pct(rate1), pct(rate2))
    for pop1 in range(1, pop):
        pop2 = pop - pop1
        c1_lo = ceil(pop1 * (rate1 - err))
        c1_hi = floor(pop1 * (rate1 + err))

        for c1 in range(c1_lo, c1_hi+1):
            c2 = cases - c1
            if (pct(c1 / pop1), pct(c2 / pop2)) == target:
                yield c1, c2, pop1, pop2

all_sols = list(solutions(7088, 422, 0.059, 0.060, 0.0005))

which is where I got my count of 11645 above from.

not sure what to suggest with this, but you could maybe do a bootstrap to see how much your statistic varies with different solutions. another option would be to do a Bayesian analysis which would let you put priors over the population sizes and hence cut this down a lot.

Upvotes: 1

ThomasIsCoding
ThomasIsCoding

Reputation: 102379

In R, maybe you try it like below:

library(rootSolve)
library(zeallot)

model <- function(v){
  c(x,y,z,w) %<-% v
  return(c(0.059*z-x, 0.06*w-y, z+w-8093, x+y-422))
}
res <- multiroot(f = model, start = c(0,0,0,0))

then you can get the solution res as

> res
[1]   3751.22  -3329.22  63580.00 -55487.00

Upvotes: 2

CDJB
CDJB

Reputation: 14536

You can use sympy for this:

from sympy import symbols, linsolve, Eq
x,y,z,w = symbols('x y z w')
linsolve([Eq(0.059*z, x), Eq(0.06*w, y), Eq(z+w, 8093), Eq(x+y, 422)], (x, y, z, w))

Output:

enter image description here

Regarding your comments about negative values - there is only one solution to the system of equations, and it has negative values for y and w. If there was more than one solution, sympy would return them, and you could filter the solutions from there to only positive values.

Upvotes: 2

Related Questions