Peter Smit

Reputation: 28736

How to select rows from data.frame with 2 conditions

I have an aggregated table:

> aggdata[1:4,]
  Group.1 Group.2         x
1       4    0.05 0.9214660
2       6    0.05 0.9315789
3       8    0.05 0.9526316
4      10    0.05 0.9684211

How can I select the x value when I have values for Group.1 and Group.2?

I tried:

aggdata[aggdata[,"Group.1"]==l && aggdata[,"Group.2"]==lambda,"x"]

but that returns all of the x values.

More info: I want to use it like this:

table = data.frame();
for(l in unique(aggdata[,"Group.1"])) {
    for(lambda in unique(aggdata[,"Group.2"])) {
        table[l,lambda] = aggdata[aggdata[,"Group.1"]==l & aggdata[,"Group.2"]==lambda,"x"]
    }
}

I would appreciate any simpler suggestions that give the same result!

Upvotes: 24

Views: 114528

Answers (3)

Pat Mc

Reputation: 143

There is a really helpful document on subsetting R data frames at: http://www.ats.ucla.edu/stat/r/modules/subsetting.htm

Here is the relevant excerpt:

Subsetting rows using multiple conditional statements: There is no limit to how many logical statements may be combined to achieve the desired subsetting. The data frame x.sub1 contains only the observations for which the value of the variable y is greater than 2 and the value of the variable V1 is greater than 0.6.

x.sub1 <- subset(x.df, y > 2 & V1 > 0.6)
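To try the excerpt on something concrete, here is a minimal sketch. The contents of x.df are made up for illustration, since the excerpt does not show them; only the subset() call comes from the excerpt.

```r
# Hypothetical data: x.df is invented here so the call can be run.
x.df <- data.frame(
  y  = c(1, 2, 3, 4, 5),
  V1 = c(0.9, 0.7, 0.5, 0.8, 0.2)
)

# Keep only the rows where BOTH conditions hold.
# & compares the two logical vectors element by element.
x.sub1 <- subset(x.df, y > 2 & V1 > 0.6)
print(x.sub1)
```

With this sample data only the row with y = 4 and V1 = 0.8 satisfies both conditions.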

Upvotes: 9

Ken Williams

Reputation: 24005

The easiest solution is to change "&&" to "&" in your code.

> aggdata[aggdata[,"Group.1"]==6 & aggdata[,"Group.2"]==0.05,"x"]
[1] 0.9315789

My preferred solution would be to use subset():

> subset(aggdata, Group.1==6 & Group.2==0.05)$x
[1] 0.9315789

Upvotes: 28

Rob Hyndman

Reputation: 31820

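Use & not &&. The & operator compares vectors element by element, whereas && only evaluates the first element of each vector (and since R 4.3.0 it raises an error when given a vector longer than one).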
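A small sketch of why & is the right operator for row selection. It rebuilds the four rows of aggdata shown in the question; the names keep and result are introduced here only for illustration.

```r
# The first four rows of aggdata as printed in the question.
aggdata <- data.frame(
  Group.1 = c(4, 6, 8, 10),
  Group.2 = c(0.05, 0.05, 0.05, 0.05),
  x       = c(0.9214660, 0.9315789, 0.9526316, 0.9684211)
)

# & works element by element and yields one TRUE/FALSE per row.
keep <- aggdata$Group.1 == 6 & aggdata$Group.2 == 0.05
print(keep)

# Using that logical vector as the row index selects only matching rows.
result <- aggdata[keep, "x"]
print(result)
```

Because keep has one entry per row, only the second row is selected. With &&, the index would be a single logical value, which R recycles across all rows in older versions and rejects in recent ones.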

Update: to answer the second part, use the reshape package. Something like this will do it:

library(reshape)
tablex <- recast(aggdata, Group.1 ~ variable * Group.2, id.var=1:2)
# Now add useful column and row names
colnames(tablex) <- gsub("x_","",colnames(tablex))
rownames(tablex) <- tablex[,1]
# Finally remove the redundant first column
tablex <- tablex[,-1]

Someone with more experience using reshape may have a simpler solution.
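For comparison, a possible base R alternative uses tapply() instead of reshape. This is only a sketch, not part of the original answer: it assumes each (Group.1, Group.2) pair occurs exactly once, and the rows for Group.2 = 0.10 are made up so that the result has more than one column.

```r
# Sample data: the four rows from the question plus four hypothetical rows
# for Group.2 = 0.10 (those x values are invented for illustration).
aggdata <- data.frame(
  Group.1 = c(4, 6, 8, 10, 4, 6, 8, 10),
  Group.2 = c(0.05, 0.05, 0.05, 0.05, 0.10, 0.10, 0.10, 0.10),
  x       = c(0.9214660, 0.9315789, 0.9526316, 0.9684211,
              0.90, 0.91, 0.92, 0.93)
)

# One row per Group.1 value, one column per Group.2 value.
# Each cell holds a single x value, so mean() simply returns it.
tablex <- with(aggdata, tapply(x, list(Group.1, Group.2), mean))
print(tablex)

# Look up a cell by its labels (dimnames are character strings).
print(tablex["6", "0.05"])
```

The result is a numeric matrix rather than a data frame; wrap it in as.data.frame() if a data frame is needed.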

Note: Don't use table as a variable name as it conflicts with the table() function.

Upvotes: 15
