DPek

Reputation: 300

function syntax // Generalizing the function

I am trying to subset duplicate observations that differ in a single column. In this example I'm trying to subset observations with the same id number but different tag numbers. I'm planning to write my own function and then use lapply to go through my data set.

As of now my code looks like:

test.function <- (i) {
  if(test.data[i, "id"] == test.data[i-1, "id"] &
     test.data[i, "tag.num"] != test.data[i-1, "tag.num"]){
   id.tag <- subset(i)
   }
}

lapply (test.data, test.function)

I have a few questions regarding the above statement. Most importantly, I keep receiving:

Error: unexpected '{' in "test.data <- (i) {"

I'm really not sure why this keeps happening and any guidance would be appreciated.

Current data set looks like (999 is just a missing value indicator):

id     tag.num
1000   999
1000   A49038483
1100   999
1100   A49294883
1200   999
1200   999

Once again, I am just trying to subset rows that share an id but have different tag numbers. In this example that means the 4 observations with ids 1000 and 1100.

Also, I am wondering about the syntax inside my if statement and whether it is necessary to specify my data set name. I'm hoping to apply this function to several different columns within my original data set. If there is a more general way to set this up so that I can run lapply over all applicable columns, that would be great to know. Any and all help is appreciated.

Upvotes: 0

Views: 35

Answers (2)

Alex Gold

Reputation: 355

To define a function in R, the syntax is

fun.name <- function(args) {...}

so you need function(i) where you have just (i) above.

I'd also suggest that if you're trying to lapply across the rows of your dataset, you probably don't need to do that.
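As a rough sketch of what skipping the row-wise lapply could look like, here is a vectorized version. It assumes the goal is to keep every row whose id appears with more than one distinct tag.num, and it rebuilds the sample data from the question. The names pairs and multi.ids are made up for illustration.

```r
# Sample data from the question (999 is the missing-value indicator)
test.data <- data.frame(
  id      = c(1000, 1000, 1100, 1100, 1200, 1200),
  tag.num = c("999", "A49038483", "999", "A49294883", "999", "999"),
  stringsAsFactors = FALSE
)

# One row per distinct (id, tag.num) combination
pairs <- unique(test.data[, c("id", "tag.num")])

# An id that still occurs more than once has at least two different tags
multi.ids <- pairs$id[duplicated(pairs$id)]

# Keep every original row belonging to one of those ids
id.tag <- test.data[test.data$id %in% multi.ids, ]
```

Because this works on whole columns at once, no index i and no comparison with row i-1 are needed, so it does not depend on the rows being sorted by id.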

It's not entirely clear to me what you're trying to do. Could you post a data sample and what you're hoping to get back?

Upvotes: 0

BLT

Reputation: 2552

As for the error, your definition is missing the function keyword:

test.function <- function(i) {
  if(test.data[i, "id"] == test.data[i-1, "id"] &
     test.data[i, "tag.num"] != test.data[i-1, "tag.num"]){
       id.tag <- subset(i)
     }
}

With that change, the function definition runs without any errors.
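On the question of generalizing: lapply over a data frame iterates over its columns, not its rows, so the call in the question would hand each whole column to the function. One possible way to avoid hard-coding the data set name is to pass the data frame and the column names as arguments. The sketch below assumes "same id, more than one distinct value" is the rule you want; varying.rows and its arguments are hypothetical names, not an existing R function.

```r
# Hypothetical helper: keep the rows whose id.col value occurs with more
# than one distinct value in value.col. Column names are passed as strings.
varying.rows <- function(data, id.col, value.col) {
  # For each row, count the distinct values of value.col within its id group
  n.distinct <- ave(
    seq_len(nrow(data)), data[[id.col]],
    FUN = function(rows) length(unique(data[[value.col]][rows]))
  )
  data[n.distinct > 1, , drop = FALSE]
}

# Sample data from the question (999 is the missing-value indicator)
test.data <- data.frame(
  id      = c(1000, 1000, 1100, 1100, 1200, 1200),
  tag.num = c("999", "A49038483", "999", "A49294883", "999", "999"),
  stringsAsFactors = FALSE
)

id.tag <- varying.rows(test.data, "id", "tag.num")

# To check several columns, iterate over their names rather than over rows;
# add further column names to cols as needed
cols <- c("tag.num")
results <- lapply(setNames(cols, cols),
                  function(col) varying.rows(test.data, "id", col))
```

Here results is a named list with one subset per column, so results[["tag.num"]] holds the same four rows as id.tag.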

Upvotes: 1
