I am trying to subset duplicate observations that differ in a single column. In this example I'm trying to subset observations with the same id number but different tag numbers. I'm planning on writing my own function and then using lapply to go through my data set.
As of now my code looks like:
test.function <- (i) {
  if(test.data[i, "id"] == test.data[i-1, "id"] &
     test.data[i, "tag.num"] != test.data[i-1, "tag.num"]){
    id.tag <- subset(i)
  }
}
lapply (test.data, test.function)
I have a few questions regarding the above statement. Most importantly, I keep receiving:
Error: unexpected '{' in "test.data <- (i) {"
I'm really not sure why this keeps happening and any guidance would be appreciated.
Current data set looks like (999 is just a missing value indicator):
id tag.num
1000 999
1000 A49038483
1100 999
1100 A49294883
1200 999
1200 999
Once again, I am just trying to subset rows that share an id but have different tag numbers. In this example I am trying to subset the 4 observations with id 1000 and 1100.
Also, I am wondering about the syntax inside my if statement and whether it is necessary to specify my data set name. I'm hoping to apply this function to several different columns within my original data set. If there is a more general way to set this up so that I can run lapply over all applicable columns, that would be great to know. Any and all help is appreciated.
Upvotes: 0
To define a function in R, the syntax is
fun.name <- function(args) {...}
so you need function(i) where you have just (i) above.
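As a minimal illustration of that syntax (add.one is a made-up name, not something from your code), a complete definition and call might look like this:

```r
# Hypothetical function, for illustration only
add.one <- function(x) {
  x + 1
}

# Call it like any built-in function
result <- add.one(2)
```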
I'd also suggest that if you're trying to lapply across the rows of your dataset, you probably don't need to do that.
It's not entirely clear to me what you're trying to do. Could you post a data sample and what you're hoping to get back?
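If the goal is what the sample data in the question suggests, that is, keeping every row whose id occurs with more than one distinct tag.num, then a vectorized sketch along these lines may be enough. It assumes tag.num is stored as character and that the 999 missing-value indicator should count as a tag of its own; tag.code and n.tags are names introduced here for illustration.

```r
# Sample data copied from the question; 999 marks a missing tag
test.data <- data.frame(
  id = c(1000, 1000, 1100, 1100, 1200, 1200),
  tag.num = c("999", "A49038483", "999", "A49294883", "999", "999"),
  stringsAsFactors = FALSE
)

# Turn the tags into integer codes so ave() returns integer counts
tag.code <- as.integer(factor(test.data$tag.num))

# For every row, count the distinct tags that occur within that row's id
n.tags <- ave(tag.code, test.data$id,
              FUN = function(x) length(unique(x)))

# Keep only the rows whose id has more than one distinct tag
id.tag <- test.data[n.tags > 1, ]
```

On the sample data this keeps the four rows with id 1000 and 1100 and drops id 1200. No loop over rows is needed, because ave() does the per-id grouping.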
Upvotes: 0
As for the error, you're missing the function keyword and a ):
test.function <- function(i) {
  if(test.data[i, "id"] == test.data[i-1, "id"] &
     test.data[i, "tag.num"] != test.data[i-1, "tag.num"]){
    id.tag <- subset(i)
  }
}
The corrected definition above runs without any errors.
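That only covers the syntax error. Keep in mind that lapply on a data frame iterates over its columns, not its rows, so lapply(test.data, test.function) passes each whole column in as i. On the question of not hard-coding the data set name and reusing the logic for several columns, one possible sketch is to pass the data frame and the column names in as arguments. Here differing.rows, its parameters, and the weight column are all hypothetical, introduced only for illustration.

```r
# Hypothetical helper: keep the rows whose id occurs with more than one
# distinct value in value.col. Nothing inside refers to test.data directly.
differing.rows <- function(data, id.col, value.col) {
  # Integer codes for the values, so ave() returns integer counts
  value.code <- as.integer(factor(data[[value.col]]))
  n.values <- ave(value.code, data[[id.col]],
                  FUN = function(x) length(unique(x)))
  data[n.values > 1, , drop = FALSE]
}

# Sample data from the question plus a made-up second column
test.data <- data.frame(
  id = c(1000, 1000, 1100, 1100, 1200, 1200),
  tag.num = c("999", "A49038483", "999", "A49294883", "999", "999"),
  weight = c(10, 10, 12, 15, 9, 9),  # made up for illustration
  stringsAsFactors = FALSE
)

# Run the same check over several columns; lapply iterates over the names
cols <- c("tag.num", "weight")
results <- lapply(cols, function(col) differing.rows(test.data, "id", col))
names(results) <- cols
```

Here results$tag.num holds the four rows for id 1000 and 1100, and results$weight holds the two rows for id 1100, the only id whose made-up weights differ.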
Upvotes: 1