Ammar Kamran

Reputation: 139

Extract values from a column in a DataFrame based on starting letters

I hope everyone is doing well. I am using pandas and NumPy, and I would like to extract the values of a column that start with the three characters "ap." (in any letter case) from a DataFrame. Below is an example of my DataFrame.

Name Number
Orange 2
APple 6
Ap.ricot 1
AP.19 1
Juap.rte 3

I've tried df[df['Name'].str.lower().str.contains('ap.', na=False)].Name.unique() but it does not fully do the trick: it also returns names such as APple and Juap.rte.

Desired output:

['AP.19','Ap.ricot']

The output should ideally be a list that I can then assign to a variable. Additionally, the 3 characters need to be at the start of the name and in this order.

I am very new to Python so please explain as clearly as possible. Thank you.

Upvotes: 2

Views: 1397

Answers (3)

Pygirl

Reputation: 13349

Try str.match, which only matches at the start of each string:

df[df['Name'].str.match('^(ap[.])', case=False)].Name.unique()

This returns:

array(['Ap.ricot', 'AP.19'], dtype=object)
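
To see why the attempt in the question over-matches, here is a minimal sketch that rebuilds the example DataFrame from the question and compares the two approaches. The variable names `loose` and `strict` are only illustrative, and `na=False` is added on the assumption that missing names should simply be skipped.

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    "Name": ["Orange", "APple", "Ap.ricot", "AP.19", "Juap.rte"],
    "Number": [2, 6, 1, 1, 3],
})

# str.contains treats 'ap.' as a regular expression: '.' matches ANY
# character, and the pattern may occur anywhere in the string.
loose = df.loc[
    df["Name"].str.lower().str.contains("ap.", na=False), "Name"
].tolist()

# str.match only matches at the start of the string, and '[.]' matches
# a literal dot. case=False makes the comparison case-insensitive.
mask = df["Name"].str.match(r"ap[.]", case=False, na=False)
strict = df.loc[mask, "Name"].unique().tolist()

print(loose)   # ['APple', 'Ap.ricot', 'AP.19', 'Juap.rte']
print(strict)  # ['Ap.ricot', 'AP.19']
```

Calling `.tolist()` on the result of `.unique()` turns the NumPy array into a plain Python list, which is what the question asks for.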

Upvotes: 1

Celius Stingher

Reputation: 18377

Given the comments in the post, I believe you can get it done with:

ap = [x for x in df['Name'] if x.lower().startswith('ap.')]

If you don't want duplicates, use:

ap = [x for x in df['Name'].unique() if x.lower().startswith('ap.')]
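
One caveat: the list comprehensions above call `.lower()` on every value, so they raise an `AttributeError` if the column contains missing values (NaN is a float). A possible vectorized alternative, sketched here with pandas' own `str.startswith`, skips those rows. The NaN row and the duplicated 'AP.19' row are assumptions added for illustration and are not in the question's data.

```python
import numpy as np
import pandas as pd

# Question's names plus a NaN and a duplicate (both hypothetical additions)
df = pd.DataFrame({
    "Name": ["Orange", "APple", "Ap.ricot", "AP.19", "Juap.rte", np.nan, "AP.19"],
})

# Lowercase first for a case-insensitive check; str.startswith compares
# literally (no regex), and na=False marks missing values as non-matching.
mask = df["Name"].str.lower().str.startswith("ap.", na=False)

# unique() drops the duplicate; tolist() gives a plain Python list
ap = df.loc[mask, "Name"].unique().tolist()

print(ap)  # ['Ap.ricot', 'AP.19']
```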

Upvotes: 2

Prateek Jain

Reputation: 196

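This may help you:

final = []

# Lowercased copy of the names for a case-insensitive comparison
df['NameCopy'] = df['Name'].str.lower()

for index, row in df.iterrows():
    # startswith (rather than find) keeps only names that begin with 'ap.'
    if row['NameCopy'].startswith('ap.'):
        final.append(row['Name'])

print(final)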

Upvotes: 1
