bzzbzzRzzbzz

Reputation: 111

grepl matching strictly only certain parts of words

I need to know if there exists a solution.

Let's say that we have a data frame that contains the following:

id Item
1  "CRANBERRY 10PKTS CARTON, BLUEBERRY 20PKTS CARTON"
2  "CRANBERRY 10PKTS CARTON,BLUEBERRY 20PKTS CARTON"
3  "CRANBERRY 10PKTS CARTON"
4  "CRANBERRY 30PKTS CARTON"

What I would like is to match only "CRANBERRY" and its associated names. The crux is that for something like id 1, grepl should return FALSE, since the string contains not only cranberry but blueberry as well.

Is there a way for grepl to return FALSE for id 1 and id 2, but TRUE for id 3 and id 4? Preferably, a single grepl call should be all that's needed.

Thanks in advance.

Upvotes: 1

Views: 493

Answers (1)

akrun

Reputation: 887118

Based on the example, the pattern seems to be that each of the words 'CRANBERRY', 'BLUEBERRY', etc. occurs once in each set of words separated by a ,. If that is the case, we can match the word 'CRANBERRY' (\\bCRANBERRY), preceded by any characters from the start of the string (^.*), and require that it is followed only by characters that are not a , ([^,]+) until the end of the string ($). A string in which a , follows 'CRANBERRY' therefore does not match.

grepl("^.*\\bCRANBERRY[^,]+$", df1$Item)
#[1] FALSE FALSE  TRUE  TRUE
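
For a reproducible check, here is a minimal sketch that rebuilds the data from the question. The name df1 comes from the call above; the column layout and the extra "reversed" string are assumptions added for illustration. It also shows a stricter variant, on the assumption that the string must start with 'CRANBERRY', because the leading ^.* can itself span a , when 'CRANBERRY' is the last item.

```r
# Rebuild the example data (column layout assumed from the question)
df1 <- data.frame(
  id = 1:4,
  Item = c("CRANBERRY 10PKTS CARTON, BLUEBERRY 20PKTS CARTON",
           "CRANBERRY 10PKTS CARTON,BLUEBERRY 20PKTS CARTON",
           "CRANBERRY 10PKTS CARTON",
           "CRANBERRY 30PKTS CARTON"),
  stringsAsFactors = FALSE
)

# Pattern from the answer: no comma may follow the word CRANBERRY
res <- grepl("^.*\\bCRANBERRY[^,]+$", df1$Item)

# Hypothetical extra case: CRANBERRY listed last, after another item
reversed <- "BLUEBERRY 20PKTS CARTON, CRANBERRY 10PKTS CARTON"
res_reversed <- grepl("^.*\\bCRANBERRY[^,]+$", reversed)

# Stricter variant (assumption: the string must begin with CRANBERRY
# and contain no comma at all)
strict <- grepl("^CRANBERRY[^,]*$", df1$Item)
strict_reversed <- grepl("^CRANBERRY[^,]*$", reversed)
```

Here res and strict agree on the four rows from the question. They differ on the reversed string: the original pattern accepts it, because ^.* consumes the part before the comma, while the stricter pattern rejects it. Which behaviour is wanted depends on whether mixed entries can list 'CRANBERRY' last.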

Upvotes: 1
