Return the unique list of values based on values in another column

Question

I have a table with two columns that are alpha-numeric and would like to return all unique values from column "Terms Name" whenever in the column "Terms" the string "Error" appears:

The code I have come up with is as follows, but it only works when there is just one instance of value "Error" in column "Terms":

import pandas as pd
import re
df = pd.read_excel('Test.xlsx')
print(df.loc[df['Terms'].str.contains('Error', flags = re.IGNORECASE, na=False), 'Terms Name'].item())

When there are more values "Error" than one (like in my df above) I am getting the error: "ValueError: can only convert an array of size 1 to a Python scalar" and I just can't make it return the correct values: "20th NM" and "NET 45".

Thank you in advance!

SeaBean · Accepted Answer

Use Series.unique together with your filtering condition:

print(list(df['Terms Name'].loc[df['Terms'].str.contains('Error', flags = re.IGNORECASE, na=False)].unique()))

Output:

['20th NM', 'Net 45']

Details:

df['Terms Name'].loc[...] helps filter the condition
df['Terms'].str.contains('Error', flags = re.IGNORECASE, na=False) retains your filtering condition
....unique() the filtered subset of df['Terms Name'] is called with Series.unique() to get ndarray of the unique values
list(...) convert the resulting ndarray to Python list

If you just need to print the result rather than want to get a Python list of the unique values, you can skip the last step using list(...)

Return the unique list of values based on values in another column

Answers (1)

Related Questions