Reputation: 11
I have searched for similar questions, but none of them is quite like this one.
I'm trying to filter my data table using colSums. That is, depending on whether a column's sum is above or below a certain amount (let's say under 5000), I want to include or exclude that column, and I want to repeat this with a loop or apply so that it is done for the whole data table. This shouldn't be that hard, but I'm not sure what I'm doing wrong. Maybe someone here can help.
Below is a representation of my data, followed by my code. I used the dput function to represent the data.
I have tried many different versions of the code, but none of them has worked. I think this one is the closest, but when I run the line of code below, it gives me this error message: "Error: expecting a one sided formula, a function, or a function name."
I have been using the dplyr package, but everything else should be base functions.
> dput(data999[1:2, ])
structure(list(KER000_349094 = c(0.1806,
0.1806), KER000_349085 = c(0.1832, 0.1832), KER000_351771 = c(0.1858,
0.1858), KER000_60103549 = c(0.1034, 0.1034), KER000_391452 = c(0.0016,
0.0016), KER000_345696 = c(0.1718, 0.1718), KER000_342793 = c(0.189230769230769,
0.189230769230769), KER000_345615 = c(0.0165384615384615,
0.0165384615384615), KER000_344065 = c(0.0592307692307692,
0.0592307692307692), KER000_353687 = c(0.188076923076923,
0.188076923076923), KER000_340589 = c(2.44, 2.44), KER000_346489 = c(0,
0), KER000_348357 = c(0.16, 0.16), KER000_363845 = c(3.135,
3.135), KER000_60029018 = c(0.115, 0.115), KER000_341255 = c(0,
0)), row.names = 1:2, class = "data.frame")
jeejee = apply(data999, 2, function(x) select_if(colSums(x <= 5000)))
Upvotes: 0
Views: 1319
Reputation: 10375
Copying my comment, since it seems to be the answer.
data999[,colSums(data999)<=5000]
to select all columns whose sum is <= 5000.
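A minimal sketch of why this works, using a made-up data frame `toy` (its column names and values are illustrative assumptions, not the asker's data). `colSums()` already works across all columns at once, so no loop or `apply` is needed. Inside `apply(data999, 2, ...)` each `x` is a single column passed as a plain vector, which is not something `colSums()` or `select_if()` can operate on. Note also where the comparison sits: `colSums(x <= 5000)` counts how many values are at most 5000, while `colSums(x) <= 5000` compares the column sums to the threshold.

```r
# Hypothetical stand-in for data999; the values are made up.
toy <- data.frame(a = c(1, 2), b = c(4000, 3000), c = c(0, 0))

# One sum per column, compared against the threshold:
# a named logical vector with one entry per column.
keep <- colSums(toy) <= 5000

# Index the columns with that logical vector.
# drop = FALSE keeps the result a data frame even if one column remains.
low_sum  <- toy[, keep, drop = FALSE]   # columns whose sum is <= 5000
high_sum <- toy[, !keep, drop = FALSE]  # columns whose sum is > 5000

# For contrast: with the comparison inside colSums(), the result is a
# count of values <= 5000 per column, not a test on the column sums.
counts <- colSums(toy <= 5000)
```

Negating the logical vector with `!keep` gives the excluded columns instead, which covers the "include or exclude" part of the question.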
Upvotes: 1