sn4ke

Reputation: 619

python pandas row startswith one letter one number wildcard

I'm trying to filter rows in my data. I need to keep rows where the value starts with the letter N followed by a digit, and drop the rows that don't match this criterion.

I've tried multiple regex combinations from Stack Overflow, but none of them seem to work properly:

new = new.loc[new['call_x'].str.startswith("^[N]{1}[0-9]+")]

Example data
N902AG #keep
N917GA #keep
N918PD #keep
N919PD #keep
N930EN #keep
N940CL #keep
N976TR #keep
N98AW #keep
NAX6700 #drop
NAX7019 #drop
NKS1028 #drop
NKS171 #drop
NKS174 #drop
NKS197 #drop

Upvotes: 2

Views: 3553

Answers (2)

Mohammad Yusuf

Reputation: 17074

Pandas str.startswith doesn't accept regex. You want str.match. Try this:

df[df.Example.str.match(r'^N\d+')]

str.contains is similar, but it searches for a match anywhere in the string rather than only at the start, so it needs the ^ anchor to behave the same way.
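As a rough, self-contained sketch of the str.match approach, here is a small frame built from a few of the call signs in the question. The column name call_x is borrowed from the asker's snippet, and the variable names mask and kept are only illustrative.

```python
import pandas as pd

# A few call signs from the question's example data.
df = pd.DataFrame({"call_x": ["N902AG", "N98AW", "NAX6700", "NKS171"]})

# str.match uses re.match, so the pattern is tested at the start of each value.
# A raw string keeps \d from being treated as a Python string escape.
mask = df["call_x"].str.match(r"N\d")

# Keep only rows where N is immediately followed by a digit.
kept = df[mask]
print(kept["call_x"].tolist())  # ['N902AG', 'N98AW']
```

Because the match is already anchored at the start, a single \d after N is enough to separate N902AG from NAX6700.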

Upvotes: 2

gzc

Reputation: 8629

Use pandas.Series.str.contains, which accepts a regular expression. The ^ anchor restricts the match to the start of the string.

df = df.loc[df['a'].str.contains('^N[0-9]+')]
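One caveat worth sketching: if the column may hold missing values, passing na=False to str.contains treats them as non-matches so the result stays a plain boolean mask. The sample values below (including the missing entry and XN123) are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical data: one missing value, and one value with "N1" that is not at the start.
df = pd.DataFrame({"a": ["N917GA", None, "NAX7019", "XN123"]})

# na=False marks missing values as False instead of leaving NaN in the mask.
# The ^ anchor is what stops "XN123" from matching.
mask = df["a"].str.contains(r"^N[0-9]+", na=False)

df = df.loc[mask]
print(df["a"].tolist())  # ['N917GA']
```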

Upvotes: 3
