Reputation: 309
I have a dataframe with 6 columns and many rows that includes positions for an individual tagged fish. The structure is as follows:
head(tag.29912)
Date.and.Time..UTC. Receiver Transmitter Latitude Longitude ndiffs29912
1 07/10/2010 15:53 VR2W-107619 A69-1303-29912 48.56225 -53.89144 NA
2 07/10/2010 15:56 VR2W-107619 A69-1303-29912 48.56225 -53.89144 180
3 07/10/2010 16:00 VR2W-107619 A69-1303-29912 48.56225 -53.89144 240
4 07/10/2010 16:24 VR2W-107619 A69-1303-29912 48.56225 -53.89144 1440
5 07/10/2010 16:45 VR2W-104556 A69-1303-29912 48.56460 -53.88956 1260
6 07/10/2010 16:47 VR2W-107619 A69-1303-29912 48.56225 -53.89144 120
The ndiffs29912 refers to the difference in time between detections - hence the first row has an NA because there is nothing previous to calculate a time difference with.
I would like to filter out any single detections that are separated from the others by more than 24 hours (86400 sec), because these are likely spurious. I have tried the following code to remove them:
for (i in 1:length(tag.29912)) {
if (tag.29912[i,6]>=86400 & tag.29912[i+1,6]>=86400)
{rm(i)}
This has not worked. I have also tried:
for (i in 1:length(tag.29912)) {
if (tag.29912[i,6]>=86400 & tag.29912[i+1,6]>=86400)
{new<-tag.29912[i,]}
else{filteredtag.29912<-as.data.frame(tag.29912[-new])}
}
to no avail. Ultimately, I would like a new dataframe with all single detections removed. Any tips would be GREATLY appreciated!!
Upvotes: 0
Views: 176
Reputation: 173547
A couple of things:
A data frame is a list with some special requirements (i.e. each element of the list must be of the same length). One consequence of this is that length(tag.29912) returns the length of the list, i.e. the number of columns, whereas in your loop you probably intended to loop over the number of rows.
You can pull out all these rows using vectorization, which is very very important to learn in R.
rm() removes objects from your workspace, which is not what you're trying to do.
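To make the first and last points concrete, here is a small sketch on a made-up data frame. The names `df` and `tmp_obj` and all the values are purely illustrative, not your data.

```r
# Toy data frame: 3 rows, 2 columns (values invented for illustration)
df <- data.frame(id = 1:3, ndiffs = c(NA, 180, 240))

n_cols <- length(df)  # a data frame is a list of columns, so this counts columns
n_rows <- nrow(df)    # this is the count a row-wise loop needs

# Row indices for a loop over rows, as opposed to 1:length(df)
row_idx <- seq_len(nrow(df))

# rm() deletes a whole object from the workspace; it does not drop rows
tmp_obj <- 1
rm(tmp_obj)
still_there <- exists("tmp_obj", inherits = FALSE)
```

Here `n_cols` is 2 and `n_rows` is 3, so a loop over `1:length(df)` would only ever visit the first two rows.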
In your particular case you want to identify rows where the ndiffs29912 column has consecutive values of 86400 or more, and remove them.
So something like
tag.29912$flag <- FALSE
for (i in 2:(nrow(tag.29912) - 1)) {
if (tag.29912[i,6]>=86400 & tag.29912[i+1,6]>=86400){
tag.29912$flag[i] <- tag.29912$flag[i+1] <- TRUE
}
}
tag.29912 <- tag.29912[!tag.29912$flag,]
should give you what you want.
Judging by the code in your question, though, I strongly recommend that you spend a few hours working carefully through a basic manual for beginners.
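For reference, the vectorized approach mentioned earlier might look something like the sketch below. It assumes that a "single detection" is a row whose gap to the previous detection and gap to the next detection are both at least 86400 seconds. Note that, unlike the loop above, it flags only that row and not the one after it, which is my reading of the goal rather than something stated in the question. The data frame `tag` and its values are a made-up stand-in for tag.29912.

```r
# Toy stand-in for tag.29912; only the time-difference column matters here.
# All values are invented for illustration.
tag <- data.frame(
  id     = 1:7,
  ndiffs = c(NA, 180, 90000, 100000, 240, 120, 95000)
)

gap_before <- tag$ndiffs             # seconds since the previous detection
gap_after  <- c(tag$ndiffs[-1], NA)  # seconds until the next detection

# A row is isolated when both gaps are at least 24 hours.
# Rows with a missing gap (first and last) are treated as not isolated.
isolated <- !is.na(gap_before) & !is.na(gap_after) &
  gap_before >= 86400 & gap_after >= 86400

# Keep everything that is not isolated
filtered <- tag[!isolated, ]
```

In this toy data only row 3 is dropped. The time differences in `filtered` are then stale for the row that followed the removed one, so they would need recomputing if you use them afterwards.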
Upvotes: 3