Kactus

Reputation: 122

Filter a data frame by column sums

I have a data frame set up like the one below (plot vs species occurrence data).

df <- data.frame(plot = c(1, 2, 3, 4, 5, 6, 7, 8, 9), speciesA = c(5, 0, 10, 0, 8, 45, 0, 0, 17), speciesB = c(0, 0, 0, 0, 0, 0, 0, 0, 0), speciesC = c(0.7, 0, 17, 0, 0, 8, 0, 9, 0), speciesD = c(1, 0, 0, 3, 0, 0, 0, 9, 1))

I need to create a second data frame (or subset this one) that contains only the species that occur in more than 4 plots. I used colSums to count the number of occurrences > 0 for each column, but I cannot work out how to use that result to filter the data frame. This is what I tried:

colSums(df != 0)
df2 <- df[,which(apply(df,2,colSums)> 4)]

Any suggestions?

Upvotes: 1

Views: 1420

Answers (1)

Andrew Gustar

Reputation: 18425

How about this...

df2 <- df[,colSums(df>0)>4]

df2
  plot speciesA
1    1        5
2    2        0
3    3       10
4    4        0
5    5        8
6    6       45
7    7        0
8    8        0
9    9       17
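
To unpack why that works: df > 0 gives a logical matrix, colSums counts the TRUE values in each column, and the comparison with 4 yields a logical vector that selects the columns. The plot column only survives because its nine values all happen to be non-zero. A slightly more explicit sketch, assuming the plot column should always be kept and that the fifth column is named speciesD (occurrences and keep are just illustrative names):

```r
# Example data from the question (speciesD written without the space)
df <- data.frame(
  plot     = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  speciesA = c(5, 0, 10, 0, 8, 45, 0, 0, 17),
  speciesB = c(0, 0, 0, 0, 0, 0, 0, 0, 0),
  speciesC = c(0.7, 0, 17, 0, 0, 8, 0, 9, 0),
  speciesD = c(1, 0, 0, 3, 0, 0, 0, 9, 1)
)

# Number of plots in which each species occurs (plot column excluded)
occurrences <- colSums(df[, -1] > 0)

# Names of the species present in more than 4 plots
keep <- names(occurrences)[occurrences > 4]

# Always retain the plot column; drop = FALSE guarantees a data frame
df2 <- df[, c("plot", keep), drop = FALSE]
```

With this data, speciesC and speciesD each occur in exactly 4 plots, so only speciesA passes the "more than 4" test. If "at least 4" was intended, change > 4 to >= 4.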

Upvotes: 2
