alex

Reputation: 109

R: retrieve only the strings that match a whole word

I need to retrieve only the strings that contain "ARINA" as a whole word. For instance, there is a dataset light with the following columns:

CASEID
TIME_ELAPSED

Most of the strings (light$CASEID) look like:

CASEID 1 ARINA LIVES IN PARIS
CASEID 2 FRANCO LIVES IN SYDNEY
CASEID 3 ARINA WORKS FOR XXX COMPANY
CASEID 4 CARINA LIVES IN LIVERPOOL

etc

I tried the following expression to find only the strings where 'ARINA' appears:

light[grep("ARINA", light$CASEID),]

I would like to get CASEID 1 and 3, but in reality I get CASEID 1, 3, and 4 (CARINA).

Upvotes: 0

Views: 44

Answers (2)

Reuben L.

Reputation: 2859

light[grep("[^A-Za-z]ARINA[^A-Za-z]", light$CASEID), ] would work, as long as ARINA always has a non-letter character (such as a space) on both sides.
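One caveat with a character-class pattern like this: it needs an actual character on each side of ARINA, so a string that begins or ends with ARINA is not matched. A minimal sketch of a variant that also accepts the start or end of the string is below; the vector x and its contents are hypothetical test strings, not data from the question.

```r
# Hypothetical test strings (not from the question's dataset)
x <- c("ARINA LIVES IN PARIS",       # ARINA at the start of the string
       "CARINA LIVES IN LIVERPOOL",  # ARINA preceded by a letter
       "I MET ARINA",                # ARINA at the end of the string
       "CASEID 3 ARINA WORKS")       # ARINA surrounded by spaces

# Original pattern: requires a non-letter character on both sides
strict <- grepl("[^A-Za-z]ARINA[^A-Za-z]", x)

# Variant: a non-letter OR the start/end of the string on each side
anchored <- grepl("(^|[^A-Za-z])ARINA([^A-Za-z]|$)", x)

print(data.frame(x, strict, anchored))
```

With the strings in the question, every CASEID value starts with "CASEID n ", so the difference only matters if ARINA can be the first or last word.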

Upvotes: 0

Avinash Raj

Reputation: 174776

Use word boundaries (\\b) to match ARINA only as a whole word.

light[grep("\\bARINA\\b", light$CASEID),]

Or, if you need the actual values of CASEID, you can skip the subsetting step by specifying value = TRUE in grep:

grep("\\bARINA\\b", light$CASEID, value = TRUE)
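For anyone who wants to try this, here is a small reproducible sketch built from the four example strings in the question. The TIME_ELAPSED values are made up purely so the data frame has a second column.

```r
# Rebuild a small version of the question's data frame.
# TIME_ELAPSED values are placeholders, not real data.
light <- data.frame(
  CASEID = c("CASEID 1 ARINA LIVES IN PARIS",
             "CASEID 2 FRANCO LIVES IN SYDNEY",
             "CASEID 3 ARINA WORKS FOR XXX COMPANY",
             "CASEID 4 CARINA LIVES IN LIVERPOOL"),
  TIME_ELAPSED = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Plain substring match: also picks up CARINA (row 4)
plain_hits <- grep("ARINA", light$CASEID)

# Word-boundary match: only rows where ARINA is a whole word
word_hits <- grep("\\bARINA\\b", light$CASEID)

# Subset the rows, or return the matching strings directly
matched_rows   <- light[word_hits, ]
matched_values <- grep("\\bARINA\\b", light$CASEID, value = TRUE)

print(plain_hits)
print(word_hits)
print(matched_rows)
```

The backslash is doubled because R string literals treat a single backslash as an escape character, so "\\b" reaches the regex engine as \b.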

Upvotes: 3
