Reputation: 359
I have a data frame that looks like this:
structure(list(A = c(70, 70, 70, 70, 70, 70), T = c(0.1, 0.2,
0.3, 0.4, 0.5, 0.6), X = c(434.01, 434.01, 434.75, 434.75, 434.75,
434.01), Y = c(454.92, 454.92, 454.92, 454.92, 454.18, 454.92
), V = c(0, 0, 21.128, 0, 14.94, 14.94), thetarad = c(0.151841552716899,
0.151841552716899, 0.150990672182432, 0.150990672182432, 0.150177486839524,
0.151841552716899), thetadeg = c(8.69988012340509, 8.69988012340509,
8.6511282599214, 8.6511282599214, 8.6045361718215, 8.69988012340509
)), .Names = c("A", "T", "X", "Y", "V", "thetarad", "thetadeg"
), row.names = 1423:1428, class = "data.frame")
I want to subset specific time points in R with intervals of 30 sec. I can do this by manually subsetting each time point that I want:
a1=subset(binA, T==0.1)
a2=subset(binA, T==30)
a3=subset(binA, T==60)
a4=subset(binA, T==90)
a5=subset(binA, T==120)
a6=subset(binA, T==150)
a7=subset(binA, T==180)
a8=subset(binA, T==210)
a9=subset(binA, T==240)
a10=subset(binA, T==270)
a11=subset(binA, T==300)
a12=subset(binA, T==330)
a13=subset(binA, T==360)
a14=subset(binA, T==390)
a15=subset(binA, T==420)
a16=subset(binA, T==450)
a17=subset(binA, T==480)
a18=subset(binA, T==510)
a19=subset(binA, T==540)
a20=subset(binA, T==570)
a21=subset(binA, T==599.5)
I tried subsetting using sapply and the seq function but got confusing results. I also want to count the unique A in each subset of data. I also know I can do this using the count function in the plyr package.
a1=count(unique(subset(binA, T==0.1)))
but count will work with one data frame and not multiple ones (correct me if I am wrong). I also want to take the means of thetadeg for each subset (this should be easy for sapply in one data frame only). So I need help on how to write a function with specific seq points.
I know this problem is trivial but help would be appreciated.
Thanks
Upvotes: 1
Views: 1093
Reputation: 263411
The function I think you want is split:
subsetted.by.T <- split(dfrm, dfrm$T)
lapply(subsetted.by.T, nrow)
$`0.1`
[1] 1
$`0.2`
[1] 1
$`0.3`
[1] 1
$`0.4`
[1] 1
$`0.5`
[1] 1
$`0.6`
[1] 1
> subsetted.by.T[[1]]
      A   T      X      Y V  thetarad thetadeg
1423 70 0.1 434.01 454.92 0 0.1518416  8.69988
If you want to name these individual items, then the names<- function would be appropriate:
names(subsetted.by.T) <- paste0("a", seq(length(subsetted.by.T) ) )
If the "T" column were somewhat irregular in its values, then perhaps using cut to create categories at regular breaks would be useful for the purpose of splitting. The question might be clarified if "T" were actually a time value. At the moment it's a "numeric" value, but there are cut methods for datetime classes.
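As a rough sketch of the cut-then-split idea, assuming T is numeric seconds and that 30-second bins closed on the left are what is wanted: the toy data frame below is made up for illustration, and the names breaks, by.bin, n.unique.A and mean.theta are not from the question.

```r
# Toy data standing in for the real data frame (values are made up).
dfrm <- data.frame(
  A = c(70, 70, 71, 72, 72, 73),
  T = c(0.1, 12.5, 29.9, 30.0, 45.2, 61.7),
  thetadeg = c(8.7, 8.6, 8.5, 9.0, 9.2, 10.0)
)

# Bin T into 30-second intervals: [0,30), [30,60), [60,90), ...
breaks <- seq(0, 600, by = 30)
dfrm$bin <- cut(dfrm$T, breaks = breaks, right = FALSE)

# split() would keep empty bins as zero-row data frames; drop = TRUE discards them.
by.bin <- split(dfrm, dfrm$bin, drop = TRUE)

# Per-bin summaries: number of distinct A values and mean of thetadeg.
n.unique.A <- sapply(by.bin, function(d) length(unique(d$A)))
mean.theta <- sapply(by.bin, function(d) mean(d$thetadeg))
```

With right = FALSE a value of exactly 600 would fall outside the last bin; adding include.lowest = TRUE to the cut call would capture it if that matters for the real data.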
Upvotes: 0
Reputation: 3711
If the purpose is just to get the average, unique count, etc., you don't need to subset. One more thing: is T a factor, or is it continuous so that you need to make bins? Here I am assuming it is a factor.
Here is one approach with plyr:
library(plyr)
ddply(df, ~T, summarise, l = length(unique(A)))
ddply(df, ~T, summarise, m = mean(thetadeg))
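For readers without plyr, the same two per-T summaries can probably be obtained in base R with tapply. This is only a sketch on made-up data; summary.by.T and the column names in it are illustrative, not part of the answer.

```r
# Toy data (made up): two rows at each of two T values.
df <- data.frame(
  A = c(70, 70, 71, 72),
  T = c(0.1, 0.1, 0.2, 0.2),
  thetadeg = c(8.0, 9.0, 10.0, 12.0)
)

# Group by T without creating any subsets by hand.
n.unique.A <- tapply(df$A, df$T, function(a) length(unique(a)))
mean.theta <- tapply(df$thetadeg, df$T, mean)

# Collect both summaries in one data frame, one row per T value.
summary.by.T <- data.frame(
  T = as.numeric(names(mean.theta)),
  n.unique.A = as.vector(n.unique.A),
  mean.theta = as.vector(mean.theta)
)
```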
Upvotes: 0
Reputation: 12905
You should be able to use the following code to get what you want. This doesn't look for 0.1 and 599.5 but that should be easy to manipulate.
timeintervals <- seq(0, 600, 30)
for (i in seq_along(timeintervals)) {
  # create the subset for each time interval
  assign(
    paste0("a", i),
    df[df$T == timeintervals[i], ]
  )
  # get all unique values of A at that time point
  assign(
    paste0("b", i),
    unique(df[df$T == timeintervals[i], "A"])
  )
}
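A possible variation on the loop above keeps the pieces in named lists rather than creating a1, a2, ... and b1, b2, ... in the workspace, which tends to make later sapply calls easier. The data and the shortened time grid here are invented for the example.

```r
# Toy data (made up) with rows at some of the requested time points.
df <- data.frame(
  A = c(70, 70, 71, 72, 72),
  T = c(0, 0, 30, 30, 45),
  thetadeg = c(8.7, 8.5, 9.0, 9.4, 10.0)
)

# Shortened grid for the example; the real one would be seq(0, 600, 30).
timeintervals <- seq(0, 60, by = 30)

# One list element per time point instead of one variable per time point.
subsets <- lapply(timeintervals, function(t) df[df$T == t, ])
names(subsets) <- paste0("a", seq_along(subsets))

# Unique A values and row counts for every subset in one pass.
unique.A <- lapply(subsets, function(d) unique(d$A))
n.rows <- sapply(subsets, nrow)
```

Individual subsets are then available as subsets$a1 or subsets[["a1"]], and time points with no matching rows show up as zero-row data frames.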
Upvotes: 0
Reputation: 56219
Assuming the data is in the df data frame, try this:
sapply(c(0.1,seq(30,599,30),599.5),
function(x)
length(unique(df[ df$T==x, "A"])))
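The same pattern should extend to the mean of thetadeg at each time point. The sketch below uses invented data and a hypothetical helper, at.time, that compares with a small tolerance instead of ==, on the assumption that T may carry floating-point noise.

```r
# Toy data (made up) with rows at three of the requested time points.
df <- data.frame(
  A = c(70, 71, 71, 72),
  T = c(0.1, 30, 30, 599.5),
  thetadeg = c(8.0, 9.0, 10.0, 12.0)
)

times <- c(0.1, seq(30, 599, 30), 599.5)

# Hypothetical helper: TRUE for rows whose T is within tol of x.
at.time <- function(x, tol = 1e-8) abs(df$T - x) < tol

# Mean of thetadeg at each requested time; NaN where no row matches.
mean.theta <- sapply(times, function(x) mean(df$thetadeg[at.time(x)]))
names(mean.theta) <- times
```

Time points with no rows give NaN, which can be filtered afterwards with is.nan.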
Upvotes: 1