Adrian
Adrian

Reputation: 55

Return the unique list of values based on values in another column

I have a table with two columns that are alpha-numeric and would like to return all unique values from column "Terms Name" whenever in the column "Terms" the string "Error" appears:

enter image description here

The code I have come up with is as follows, but it only works when there is just one instance of value "Error" in column "Terms":

import pandas as pd
import re
df = pd.read_excel('Test.xlsx')
print(df.loc[df['Terms'].str.contains('Error', flags = re.IGNORECASE, na=False), 'Terms Name'].item())

When there are more values "Error" than one (like in my df above) I am getting the error: "ValueError: can only convert an array of size 1 to a Python scalar" and I just can't make it return the correct values: "20th NM" and "NET 45".

Thank you in advance!

Upvotes: 0

Views: 320

Answers (1)

SeaBean
SeaBean

Reputation: 23217

Use Series.unique together with your filtering condition:

print(list(df['Terms Name'].loc[df['Terms'].str.contains('Error', flags = re.IGNORECASE, na=False)].unique()))

Output:

['20th NM', 'Net 45']

Details:

  • df['Terms Name'].loc[...] helps filter the condition
  • df['Terms'].str.contains('Error', flags = re.IGNORECASE, na=False) retains your filtering condition
  • ....unique() the filtered subset of df['Terms Name'] is called with Series.unique() to get ndarray of the unique values
  • list(...) convert the resulting ndarray to Python list

If you just need to print the result rather than want to get a Python list of the unique values, you can skip the last step using list(...)

Upvotes: 1

Related Questions