Reputation: 443
Here is my question: I have data with 3000 observations and 5000 features. The 3000 observations have numeric names such as 100.1, 100.3, 100.5, 100.7. I converted the names to integers with segs <- as.integer(names), and now I want to use segs as a grouping factor and sum each of the 5000 features within each group. segs has 300 distinct values, so the final data frame should be 300 by 5000. I know tapply can compute the sum by factor for one variable, but I have to use a for loop to sum all 5000 features. That is really time-consuming, so I would like to know whether there is a cleaner way to do this in R, or a package that handles this kind of problem.
Here is my rough code; df0 is the data and df is the result I want:
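df <- NULL  # cbind() onto an empty data.frame() fails, so start from NULL
# 2:ncol(df0)-1 parses as (2:ncol(df0)) - 1, i.e. columns 1 to ncol(df0) - 1
for (i in seq_len(ncol(df0) - 1)) {
  # sum column i within each level of segs (assumed to be a column of df0)
  temp <- tapply(df0[, i], df0$segs, sum)
  df <- cbind(df, temp)
}
df <- as.data.frame(df)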
Thanks!
=====
Thanks, Roland. Some demo data is shown below:
set.seed(42)
df0 <- data.frame(
  X = rnorm(100, 10, 10),
  Y = rnorm(100),
  Z = rnorm(100))
df0$seq <- as.integer(df0$X)
Upvotes: 1
Views: 151
Reputation: 4126
Try this...
set.seed(42)
df0 <- data.frame(
  X = rnorm(100, 10, 10),
  Y = rnorm(100),
  Z = rnorm(100))
df0$seq <- as.integer(df0$X)
library(data.table)
dt <- data.table(df0)
# .SD holds every column except the grouping column, so this sums X, Y and Z per seq
dt[, lapply(.SD, sum), by = seq]
    seq           X            Y           Z
 1:  23 164.8144774  1.293768670 -3.74807730
 2:   4   8.9247301  1.909529066 -0.06277254
 3:  13  40.2090180 -2.036599633  0.88836392
 4:  16 147.8571697 -2.571487358 -1.35542918
 5:  14  72.1640142  0.432493959 -1.49983832
 6:   8  42.8498355 -0.582031919 -1.35989852
 7:  25  75.9995653  0.896369560 -1.08024329
 8:   9  27.5244048  0.833429855 -1.19363017
 9:  30  30.1842371  0.188193035 -0.64574372
10:  32  32.8664539  0.108072728  2.03697217
11:  -3  -7.5714175 -0.899304085 -1.27286230
12:   7  29.6254908 -0.929790177  2.75906514
27:  12  50.2535374 -0.620793351 -3.80900436
28:  24  24.4410126 -0.433169033 -0.02671746
29: -19 -19.9309008 -0.533492330 -1.01759612
30:  11  11.8523056 -1.071782384  0.96954501
31:  19  38.5407490 -0.751408534 -4.81312992
32:   0  -0.9642319  1.453325156  2.20977601
33:  -1  -4.3685646 -0.834654913 -0.24624546
34:  18  18.2177311 -1.594588162  0.27369527
35:  -4  -4.5921400  0.586487537  0.86256338
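If you would rather stay in base R, rowsum() may also be worth a try. This is a different technique from the data.table call above, not a replacement for it. Here is a minimal sketch, assuming the grouping column is called seq as in the demo data; feature_cols is just an illustrative name, not something from the question. It sums all feature columns per group in one call and spot-checks one column against the tapply approach from the question.

```r
# Same demo data as in the question
set.seed(42)
df0 <- data.frame(
  X = rnorm(100, 10, 10),
  Y = rnorm(100),
  Z = rnorm(100))
df0$seq <- as.integer(df0$X)

# Every column except the grouping column (feature_cols is a hypothetical name)
feature_cols <- setdiff(names(df0), "seq")

# rowsum() sums each column within each group; row names are the group values,
# sorted in increasing order by default (reorder = TRUE)
sums <- rowsum(df0[, feature_cols], group = df0$seq)

# Spot-check column X against tapply(), which also orders groups increasingly
check_X <- tapply(df0$X, df0$seq, sum)
stopifnot(isTRUE(all.equal(sums$X, as.vector(check_X))))
```

The result has one row per distinct value of seq and one column per feature, so with the real data it should come out as 300 by 5000. I have not timed it on data of that size, so treat any speed advantage over the for loop as something to measure rather than assume.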
Upvotes: 2