RHelp

Reputation: 835

First and Last in SAS and R

I am working on converting some SAS code to R, but I am having trouble replicating the IF First. and Last. statements in R. The SAS code is:

Data A;
Set B;
BY CompID, Id, Date;
IF First.Date;
run;

My understanding is that only the earliest date for a CompID, ID and Date combination is chosen and output into data A. Am I right?

I am aware of the duplicated function in R, but if I use the following code:

A <- B[!duplicated(B$Date),]

I get lesser observations than my SAS output. Am I missing something here?

Thanks in advance.

Upvotes: 0

Views: 1473

Answers (2)

IRTFM

Reputation: 263301

Since there is also a duplicated.data.frame method, the construction in R could be:

A <- B[!duplicated(B[c('CompID', 'Id', 'Date')]), ]

To replicate a Last. operation, look at the help page for duplicated; it has a fromLast argument, though I always need to check its spelling.
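As a rough sketch of both cases, assuming a small made-up data frame B (the column names follow the question, but the values and the Value column are invented for illustration) whose rows are already in the order you want:

```r
# Hypothetical toy data; values are invented, not taken from the question.
B <- data.frame(
  CompID = c(1, 1, 1, 2, 2),
  Id     = c(10, 10, 10, 20, 20),
  Date   = as.Date(c("2012-01-01", "2012-01-01", "2012-01-02",
                     "2012-01-01", "2012-01-01")),
  Value  = c(5, 6, 7, 8, 9)
)

keys <- c("CompID", "Id", "Date")

# Analogue of IF First.Date: keep the first row of each key combination.
A_first <- B[!duplicated(B[keys]), ]

# Analogue of IF Last.Date: keep the last row of each key combination.
A_last <- B[!duplicated(B[keys], fromLast = TRUE), ]
```

Note that duplicated simply works through the rows in their current order, whereas SAS expects B to be sorted by the BY variables, so you may want to sort B first (for example with order) if it is not already sorted.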

The construction "I get lesser observations than ..." sounds wrong to me, but I have not traveled in all the English-speaking countries. At least in the US, I think "fewer" or "a lower count" would read a bit more easily.

Upvotes: 2

user1509107

Reputation:

First of all, the statement BY CompID, Id, Date; should not have any commas in it. SAS expects the BY variables to be separated by spaces: BY CompID Id Date;

Secondly, A <- B[!duplicated(B$Date),] is not the equivalent of the SAS code you posted, because it looks at Date alone.

The SAS code that your R statement actually corresponds to would be:

Data A;
Set B;
BY Date;
IF First.Date;
run;
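To see why deduplicating on Date alone can return fewer rows, here is a small sketch with invented data (the values are purely illustrative) in which two companies share a date:

```r
# Hypothetical toy data: two companies sharing the same date.
B <- data.frame(
  CompID = c(1, 2, 2),
  Id     = c(10, 20, 20),
  Date   = as.Date(c("2012-01-01", "2012-01-01", "2012-01-02"))
)

# Date alone: the second company's row for 2012-01-01 counts as a duplicate.
by_date <- B[!duplicated(B$Date), ]

# All three keys: each CompID/Id/Date combination keeps its first row.
by_keys <- B[!duplicated(B[c("CompID", "Id", "Date")]), ]

nrow(by_date)  # 2
nrow(by_keys)  # 3
```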

"My understanding is that only the earliest date for a CompID, ID and Date combination is chosen and output into data A. Am I right?"

Your understanding is correct.

Upvotes: 1
