Muhammad

Reputation: 129

Unique values from multiple columns in pandas

distinct_values = df.col_name.unique().compute()

But what if I don't know the column names?

Upvotes: 2

Views: 134

Answers (2)

E. Zeytinci

Reputation: 2643

You can try this to collect the unique values across all columns into one array:

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 5]})
>>> d = dict()
>>> d['any_column_name'] = pd.unique(df.values.ravel('K'))
>>> d
{'any_column_name': array([1, 2, 3, 5])}

or, for just one column:

>>> d = dict()
>>> d['a'] = df['a'].unique()
>>> d
{'a': array([1, 2, 3])}

or individually for every column:

>>> d = dict()
>>> for col in df.columns:
...     d[col] = df[col].unique()
...
>>> d
{'a': array([1, 2, 3]), 'b': array([2, 3, 5])}
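
The per-column loop above can also be written as a dict comprehension. This is only a minimal sketch: the helper name `unique_per_column` is hypothetical (not from the answer), and it assumes a plain pandas DataFrame rather than a Dask one.

```python
import pandas as pd


def unique_per_column(df):
    """Map each column label to the unique values found in that column.

    Hypothetical helper for illustration; it needs no prior knowledge of
    the column names because it iterates over df.columns.
    """
    return {col: df[col].unique() for col in df.columns}


df = pd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 5]})
result = unique_per_column(df)

for col, values in result.items():
    print(col, values)
```

Values are kept in order of first appearance, the same as in the loop version.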

Upvotes: 1

Sociopath

Reputation: 13401

I think you need:

import pandas as pd

df = pd.DataFrame({"colA": ['a', 'b', 'b', 'd', 'e'], "colB": [1, 2, 1, 2, 1]})

unique_dict = {}

# df.columns holds the column labels, so the names need not be known in advance
for col in df.columns:
    unique_dict[col] = list(df[col].unique())

Output:

{'colA': ['a', 'b', 'd', 'e'], 'colB': [1, 2]}
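
One caveat: `list(...)` over a NumPy array keeps NumPy scalar types (for example `numpy.int64`), so the printed form may differ between NumPy versions. If plain Python values are wanted, for instance for JSON serialization, a possible variant is sketched below. It swaps in `Series.drop_duplicates().tolist()` instead of `unique()`, which is a substitution of mine and not part of the answer; the name `unique_lists` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"colA": ['a', 'b', 'b', 'd', 'e'], "colB": [1, 2, 1, 2, 1]})

# drop_duplicates() keeps the first occurrence of each value, in order;
# Series.tolist() then converts the values to plain Python scalars.
unique_lists = {col: df[col].drop_duplicates().tolist() for col in df.columns}

print(unique_lists)
```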

Upvotes: 1
