Reputation: 49
I have a data frame "Values" with entries as follows:
Test1 Test2 Test3
Power1 20 30 40
Power2 10 15 13
Max_Power1 30 50 90
Max_Power2 50 80 40
Voltage1 1.2 1.2 1.2
Voltage2 1.3 1.3 1.3
Current1 1.0 1.2 1.3
Current2 2.0 2.1 2.2
I need to create a new column that will contain the following:
Average value of row 1
Average value of row 2
Max value of row 3
Max value of row 4
Average value of row 5
Average value of row 6
Min value of row 7
Min value of row 8
Short of creating this as a separate data frame with one column, setting each row entry individually and then importing into original data frame I do not know of a way to do this with a single commmand.
Suggestions are appreciated.
Upvotes: 0
Views: 64
Reputation: 269461
Using test
given in the Note at the end create a vector of function names, fun
, and mapply
the each function to the the corresponding row:
avg <- mean
fun <- c("avg", "avg", "max", "max", "avg", "avg", "min", "min")
n <- nrow(test)
rows <- split(as.matrix(test), 1:n)
transform(test, Value = mapply(function(fun, x) match.fun(fun)(x), fun, rows), Stat = fun)
giving:
Test1 Test2 Test3 Value Stat
Power1 20.0 30.0 40.0 30.00000 avg
Power2 10.0 15.0 13.0 12.66667 avg
Max_Power1 30.0 50.0 90.0 90.00000 max
Max_Power2 50.0 80.0 40.0 80.00000 max
Voltage1 1.2 1.2 1.2 1.20000 avg
Voltage2 1.3 1.3 1.3 1.30000 avg
Current1 1.0 1.2 1.3 1.00000 min
Current2 2.0 2.1 2.2 2.00000 min
Perhaps another possibility is to simply calculate the min, mean and max of every row.
stats <- function(x) c(Min = min(x), Avg = mean(x), Max = max(x))
cbind(test, t(apply(test, 1, stats)))
giving:
Test1 Test2 Test3 Min Avg Max
Power1 20.0 30.0 40.0 20.0 30.000000 40.0
Power2 10.0 15.0 13.0 10.0 12.666667 15.0
Max_Power1 30.0 50.0 90.0 30.0 56.666667 90.0
Max_Power2 50.0 80.0 40.0 40.0 56.666667 80.0
Voltage1 1.2 1.2 1.2 1.2 1.200000 1.2
Voltage2 1.3 1.3 1.3 1.3 1.300000 1.3
Current1 1.0 1.2 1.3 1.0 1.166667 1.3
Current2 2.0 2.1 2.2 2.0 2.100000 2.2
We assume that test
is:
test <- structure(list(Test1 = c(20, 10, 30, 50, 1.2, 1.3, 1, 2),
Test2 = c(30, 15, 50, 80, 1.2, 1.3, 1.2, 2.1), Test3 = c(40, 13, 90, 40, 1.2,
1.3, 1.3, 2.2)), .Names = c("Test1", "Test2", "Test3"),
class = "data.frame", row.names = c("Power1",
"Power2", "Max_Power1", "Max_Power2", "Voltage1", "Voltage2", "Current1", "Current2"))
Upvotes: 3
Reputation: 10855
The anomaly in this question is that we need to calculate some statistics that aren't typically calculated on rows. There are a number of ways to solve this problem.
First, we could use the rpgm
package and its rowMins()
and rowMaxs()
functions to augment base:rowMeans()
. Note that we want to only calculate on columns 2 - 4, because the Measurement column is not numeric.
rawData <- "Measurement Test1 Test2 Test3
Power1 20 30 40
Power2 10 15 13
Max_Power1 30 50 90
Max_Power2 50 80 40
Voltage1 1.2 1.2 1.2
Voltage2 1.3 1.3 1.3
Current1 1.0 1.2 1.3
Current2 2.0 2.1 2.2"
data <- read.table(text=rawData,header=TRUE,stringsAsFactors=FALSE)
# use rpgm package
library(rpgm)
rowMaxs(data[3:4,2:4])
rowMeans(data[c(1:2,5:6),2:4])
rowMins(data[7:8,2:4])
...and the output:
> library(rpgm)
> rowMaxs(data[3:4,2:4])
[1] 90 80
> rowMeans(data[c(1:2,5:6),2:4])
1 2 5 6
30.00000 12.66667 1.20000 1.30000
> rowMins(data[7:8,2:4])
[1] 1 2
Another alternative is to reshape the data using the reshape2
package so we can calculate the statistics on columns rather than rows.
library(reshape2)
data <- read.table(text=rawData,header=TRUE,stringsAsFactors=FALSE)
data2 <- melt(data,id=c("Measurement"))
data3 <- dcast(data2,variable ~ Measurement)
colMeans(data3[,6:9])
colMins(data3[,2:3])
colMaxs(data3[,4:5])
...and the output:
> library(reshape2)
> data <- read.table(text=rawData,header=TRUE,stringsAsFactors=FALSE)
> data2 <- melt(data,id=c("Measurement"))
> data3 <- dcast(data2,variable ~ Measurement)
> colMeans(data3[,6:9])
Power1 Power2 Voltage1 Voltage2
30.00000 12.66667 1.20000 1.30000
> colMins(data3[,2:3])
[1] 1 2
> colMaxs(data3[,4:5])
[1] 90 80
Upvotes: 1