Nakeuh

Reputation: 1919

Pandas : Get row from an ID which is a split of a column value

I have a pandas DataFrame df that contains a list of filenames.

Here is an example:

print(df)

>>
+---------+---------+
|       ID|    Field|
+---------+---------+
|  AAA.png|        X|
|  BBB.jpg|        Y|
|  CCC.png|        Z|
+---------+---------+

From a given ID, which is the filename without the extension, I want to retrieve the value of the column Field.

For example, for my_id = "BBB", I want to get the value Y.

To do so, I tried the following:

my_id = "BBB"
field_value = df[df["ID"].str.split('.')[0] == my_id]["Field"]

But I get the error KeyError: False. I understand why I get this error, but I don't know how to do this another way.

Upvotes: 3

Views: 1252

Answers (3)

jezrael

Reputation: 862771

First, filter by boolean indexing with DataFrame.loc; the output is a Series:

field_value = df.loc[df["ID"].str.split('.').str[0] == my_id, "Field"]

Then, to get the first value, use next with iter:

first_val = next(iter(field_value), 'no match')

If you need all matched values in a list:

L = field_value.tolist()
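
For reference, here is a minimal self-contained sketch of the steps above, run against the DataFrame from the question. The names stems, missing_val, and all_vals, and the ID "ZZZ" used to show the fallback, are only illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["AAA.png", "BBB.jpg", "CCC.png"],
                   "Field": ["X", "Y", "Z"]})
my_id = "BBB"

# .str.split(".") yields one list per row; .str[0] takes the first item of each list.
# (A plain [0] would instead return the list stored at row label 0.)
stems = df["ID"].str.split(".").str[0]

# The boolean mask selects rows and "Field" selects the column: the result is a Series.
field_value = df.loc[stems == my_id, "Field"]

# First match, or a default when nothing matches.
first_val = next(iter(field_value), "no match")
missing_val = next(iter(df.loc[stems == "ZZZ", "Field"]), "no match")  # "ZZZ" is a made-up ID

# All matches as a plain list.
all_vals = field_value.tolist()

print(first_val, missing_val, all_vals)
```

The default passed to next is what keeps the lookup from raising StopIteration when no row matches.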

Upvotes: 3

prosti

Reputation: 46351

I tested with str.contains:

my_id="BBB"
field_values = df.loc[df["ID"].str.contains(my_id), "Field"]
print(field_values)

It can return multiple values, as the output below shows. It is also robust to file names that start with a dot, such as .AAA.png.

With an extra row (BBB.png) added to the DataFrame:

        ID Field
0  AAA.png     X
1  BBB.jpg     Y
2  CCC.png     Z
3  BBB.png     K

the output is:

1    Y
3    K
Name: Field, dtype: object
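
One thing to keep in mind: str.contains matches substrings (and treats the pattern as a regular expression by default), so it can also select IDs that merely contain my_id. The sketch below shows this with a made-up row, XBBB.png, and compares it with a stricter str.startswith check. That check is an alternative swapped in for illustration, not part of the answer above, and it gives up the leading-dot behaviour: .BBB.png would no longer match.

```python
import pandas as pd

# "XBBB.png" is a hypothetical row added to show the substring behaviour.
df = pd.DataFrame({"ID": ["AAA.png", "BBB.jpg", "XBBB.png", "BBB.png"],
                   "Field": ["X", "Y", "Z", "K"]})
my_id = "BBB"

# Literal substring match (regex=False): also picks up "XBBB.png".
loose = df.loc[df["ID"].str.contains(my_id, regex=False), "Field"].tolist()

# Stricter alternative: the ID must start with "<my_id>.".
strict = df.loc[df["ID"].str.startswith(my_id + "."), "Field"].tolist()

print(loose)
print(strict)
```

Which of the two is appropriate depends on how the real filenames are named.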

Upvotes: 1

Rakesh

Reputation: 82765

You can use os.path.splitext:

Example:

import os
import pandas as pd

df = pd.DataFrame({"ID": ["AAA.png", "BBB.png", "CCC.png"],
                   "Field": ["X", "Y", "Z"]})

my_id = "BBB"
mask = df["ID"].apply(os.path.splitext).str[0] == my_id
print(df.loc[mask, "Field"])

Output:

1    Y
Name: Field, dtype: object
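
Note that os.path.splitext and splitting on the first dot only agree for names with a single extension. The sketch below compares the two on a few made-up filenames (archive.tar.gz and .hidden are assumptions for illustration, not from the question).

```python
import os
import pandas as pd

# Hypothetical filenames chosen to show where the two approaches differ.
ids = pd.Series(["BBB.jpg", "archive.tar.gz", ".hidden"])

# Everything before the first dot.
first_part = ids.str.split(".").str[0].tolist()

# os.path.splitext strips only the last extension and ignores leading dots.
stem = ids.apply(lambda name: os.path.splitext(name)[0]).tolist()

print(first_part)
print(stem)
```

For IDs like those in the question (one dot, no leading dot) both approaches give the same result.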

Upvotes: 0
