Reputation: 139
Hope everyone is doing well. I am using pandas and numpy and I would like to extract the values of a column that start with the 3 characters ap. (in any letter case) from a DataFrame. Below is an example of my DataFrame.
Name | Number |
---|---|
Orange | 2 |
APple | 6 |
Ap.ricot | 1 |
AP.19 | 1 |
Juap.rte | 3 |
I've tried df[df['Name'].str.lower().str.contains('ap.', na=False)].Name.unique() but it does not fully do the trick.
Desired output:
['AP.19','Ap.ricot']
The output should ideally be a list that I can then save onto a variable. Additionally, the 3 letters need to be at the start and in this order.
I am very new to Python so please explain as clearly as possible. Thank you.
Upvotes: 2
Views: 1397
Reputation: 13349
Try str.match, which only matches at the start of each string:
df[df['Name'].str.match('^(ap[.])', case=False)].Name.unique()
array(['Ap.ricot', 'AP.19'], dtype=object)
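Since the question asks for a plain list saved in a variable, here is a minimal runnable sketch of the same match approach. The DataFrame is rebuilt from the example table in the question, and the names mask and ap_names are only illustrative. It adds na=False in case the column contains missing values, and tolist() to turn the array into a list.

```python
import pandas as pd

# Example data copied from the table in the question
df = pd.DataFrame({
    "Name": ["Orange", "APple", "Ap.ricot", "AP.19", "Juap.rte"],
    "Number": [2, 6, 1, 1, 3],
})

# str.match only matches at the start of each string.
# The backslash makes the dot literal; an unescaped '.' means "any character",
# which is why 'ap.' also matches 'APple'.
mask = df["Name"].str.match(r"ap\.", case=False, na=False)

# unique() drops duplicates, tolist() converts the result to a plain Python list
ap_names = df.loc[mask, "Name"].unique().tolist()
print(ap_names)  # ['Ap.ricot', 'AP.19']
```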
Upvotes: 1
Reputation: 18377
Given the comments in the post, I believe you can get it done with:
ap = [x for x in df['Name'] if x.lower().startswith('ap.')]
If you don't want duplicates, you can use:
ap = [x for x in df['Name'].unique() if x.lower().startswith('ap.')]
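One caveat, sketched under the assumption that the column may contain missing values (the question passes na=False, which hints at that): NaN is a float and has no lower() method, so the comprehension would raise an AttributeError. The sample Series below is made up for illustration; an isinstance check skips the non-string entries.

```python
import numpy as np
import pandas as pd

# Hypothetical column with a missing value and a duplicate
names = pd.Series(
    ["Orange", "APple", "Ap.ricot", np.nan, "AP.19", "Ap.ricot", "Juap.rte"]
)

# isinstance(x, str) filters out NaN before calling .lower()
ap = [
    x for x in names.unique()
    if isinstance(x, str) and x.lower().startswith("ap.")
]
print(ap)  # ['Ap.ricot', 'AP.19']
```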
Upvotes: 2
Reputation: 196
This may help you:
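final = []
# Lowercase copy so the comparison ignores case
df['NameCopy'] = df['Name'].str.lower()
for index, row in df.iterrows():
    # startswith keeps only names that begin with 'ap.';
    # find() would also accept 'Juap.rte', where 'ap.' is in the middle
    if row['NameCopy'].startswith('ap.'):
        final.append(row['Name'])
print(final)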
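For larger frames, a loop-free variant may be preferable. This is only a sketch, assuming the example data from the question; it uses pandas' vectorized str.startswith so that only names beginning with ap. are kept, and it does not add a helper column to df. The name starts_with_ap is illustrative.

```python
import pandas as pd

# Example data copied from the table in the question
df = pd.DataFrame({
    "Name": ["Orange", "APple", "Ap.ricot", "AP.19", "Juap.rte"],
    "Number": [2, 6, 1, 1, 3],
})

# Lowercase first, then test the prefix; na=False treats missing values as no match
starts_with_ap = df["Name"].str.lower().str.startswith("ap.", na=False)

# Select the original (not lowercased) names and convert them to a list
final = df.loc[starts_with_ap, "Name"].tolist()
print(final)  # ['Ap.ricot', 'AP.19']
```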
Upvotes: 1