Reputation: 3
I have a named matrix with the following 3 part name structure (xxx-#h-#):
xxx-0h-0 | xxx-0h-1 | xxx-0h-2 | xxx-1h-0 | ... | xxx-60h-2
v1
v2
v3
...
vn
I am trying to find which columns share the first two parts of the name. The lookup string is a concatenation of those two parts, where xxx is a fixed value and the variable "names" contains all the possible values for the middle position. The last position varies.
names <- c("0h","1h","6h","16h","24h","42h","60h")
names <- paste("XXX", names, sep = "-")
I am using grep for the lookup:
grep(names[1],colnames(x))
Which correctly returns:
[1] 1 2 3
I then combine the matching columns with cbind, take the mean of all observations that share the first and second parts of the column name, and assign the result to a new variable.
Where
`xxx-0h` <- rowMeans(cbind(x[,grep(names[1],colnames(x))]))
gives me the mean calculated from columns 1, 2 and 3, which were previously found by grep.
When I pass the whole "names" vector instead of a single element, I receive the following warning:
Warning message:
In grep(names, colnames(x)) :
argument 'pattern' has length > 1 and only the first element will be used
How can I use every element of "names" rather than just the first?
Essentially, I'd like the following to happen:
`xxx-0H` <- rowMeans(cbind(x[,grep(names[1],colnames(x))]))
`xxx-1H` <- rowMeans(cbind(x[,grep(names[2],colnames(x))]))
`xxx-6H` <- rowMeans(cbind(x[,grep(names[3],colnames(x))]))
`xxx-16H` <- rowMeans(cbind(x[,grep(names[4],colnames(x))]))
`xxx-24H` <- rowMeans(cbind(x[,grep(names[5],colnames(x))]))
`xxx-42H` <- rowMeans(cbind(x[,grep(names[6],colnames(x))]))
`xxx-60H` <- rowMeans(cbind(x[,grep(names[7],colnames(x))]))
and then combine the resulting vectors into a matrix, conserving the row naming scheme (which is shared among all columns) while omitting the last digit from the column names (xxx-0H | xxx-1H | xxx-6H | ...). I would end up with a 7-column, n-row matrix.
My last resort would be to use a for loop. Is there an elegant way to do this using apply or any of its variants?
Upvotes: 0
Views: 744
Reputation: 1421
Edit: Right, I see what you're looking for now. Here's a full example, starting with two pairs of columns that share a middle name.
mid <- c("0h", "6h")
name <- paste(rep("XXX", 4), rep(mid, each = 2), 1:2, sep="-")
df = setNames(cbind(cars, cars), name)
df = df[1:4, ]
df
#   XXX-0h-1 XXX-0h-2 XXX-6h-1 XXX-6h-2
# 1        4        2        4        2
# 2        4       10        4       10
# 3        7        4        7        4
# 4        7       22        7       22
With the data set up, call rowMeans over the table as many times as there are middle names, each time subsetting the table to the columns whose names include a given middle name.
sapply(mid, function(x) rowMeans(df[grep(x, names(df))]))
#     0h   6h
# 1  3.0  3.0
# 2  7.0  7.0
# 3  5.5  5.5
# 4 14.5 14.5
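One caveat if you apply this to your full set of time points: a bare pattern such as "6h" also matches "XXX-16h-0", and "0h" also matches "XXX-60h-0", so the groups would bleed into each other. Here is a rough sketch of one way around that, searching on the full "XXX-0h" prefix and anchoring the pattern. The matrix x below is simulated stand-in data, and times, prefixes and group_means are just names I made up for the illustration, so adjust them to your real object.

```r
# Hypothetical stand-in for the asker's matrix: 4 rows, 3 replicates per time point.
times <- c("0h", "1h", "6h", "16h", "24h", "42h", "60h")
cols  <- paste("XXX", rep(times, each = 3), 0:2, sep = "-")
set.seed(1)
x <- matrix(rnorm(4 * length(cols)), nrow = 4,
            dimnames = list(paste0("v", 1:4), cols))

# Prefixes to look up, e.g. "XXX-0h"; they double as the output column names.
prefixes <- paste("XXX", times, sep = "-")

# One rowMeans() call per prefix. The pattern starts with "^" and ends with "-",
# so each prefix only matches its own replicates (a bare "6h" would also hit
# "XXX-16h-0"). drop = FALSE keeps a matrix even if only one column matches.
group_means <- vapply(prefixes, function(p) {
  idx <- grep(paste0("^", p, "-"), colnames(x))
  rowMeans(x[, idx, drop = FALSE])
}, numeric(nrow(x)))

# Make the shared row names explicit on the result.
rownames(group_means) <- rownames(x)

dim(group_means)       # rows of x by number of prefixes
colnames(group_means)  # the prefixes, without the trailing replicate digit
```

I used vapply rather than sapply only because it checks that every group returns a numeric vector of length nrow(x); sapply would work the same way here.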
Upvotes: 0