Matt Bannert
Matt Bannert

Reputation: 28274

How to select standard deviation within a row? (in SQL - or R :)

I wonder whether there is a way to select the standard deviation from several integer fields in MySQL within the same row. Obviously, if I use

SELECT STDDEV(col1) FROM mytable

I just get the standard deviation of that particular column. Let´s assume I have a table like: id,somefield1,somefield2, integerfield1,integerfield2,integerfield3, ... ,integerfield30 . Now I´d like to select the standard deviation of integerfield 1-30 within a row and save it AS sdfield . Of course I could use statistical software for this, but I just wonder if there is a way to do it directly in MySQL.

Upvotes: 4

Views: 6783

Answers (4)

Brandon Bertelsen
Brandon Bertelsen

Reputation: 44658

With R:

df <- your.pull
sd(t(df[sapply(df, is.numeric)]))

Pull data with RMySQL or RODBC, remove non numeric columns, transpose and use sd.

Upvotes: 3

Matt Bannert
Matt Bannert

Reputation: 28274

I found two solutions on my own:

1) Normalize the database. I end up with two tables:

table one uid | information1 | metainformation2

table two uid | col | result_of_col

Then I can easily use the standard STDDEV function.

2) Use R. The data is a de-normalized format because it should be used in statistical analysis. Thus it´s easy to get into R and use the following code.

sd(t(dataset[1:4,3:8]))

Note that, I just take the numeric part of this data.frame by leaving selecting the columns 3-8. And dont get hit by too much data (that´s why I only use the first couple of rows this time). t() transposes the data which is necessary because sd() only works with columns.

There´s a function rowSds around in the vsn package, that is supposed to work analogously to rowMean and rowSum, but somehow this might be deprecated. At least this packages was not available on the Swiss CRAN mirror ;) .

HTH someone else.

Upvotes: 1

Spudley
Spudley

Reputation: 168725

Have you tried using UNION to effectively put all your column values into separate rows? Something like this, maybe:

SELECT STDDEV(allcols)
FROM (
    SELECT col1 FROM table WHERE id=requiredID
    UNION
    SELECT col2 FROM table WHERE id=requiredID
    UNION
    SELECT col3 FROM table WHERE id=requiredID
    UNION
    SELECT col4 FROM table WHERE id=requiredID
    UNION
    SELECT col5 FROM table WHERE id=requiredID
)

Upvotes: 3

J-16 SDiZ
J-16 SDiZ

Reputation: 26910

for simplicity, assume you have n columns, named A, B, C .... :

SELECT SQRT(  
  (A*A + B*B + C*C + ...)/n  - (A+B+C+...)*(A+B+C+...)/n/n) AS sd
  FROM table;

Upvotes: 3

Related Questions