sds

Reputation: 60014

Get non-null elements in a pandas DataFrame

I have a DataFrame and I want to get some non-null elements as a list.

Specifically, given df:

import pandas as pd

df = pd.DataFrame({"a": ["A", None, "B"], "b": [None, "C", "D"], "c": ["E", "F", None]})
      a     b     c
0     A  None     E
1  None     C     F
2     B     D  None

and the list of interesting columns ["a","c"], I want to extract the non-None elements of the specified columns as a list, i.e.,

["A","B","E","F"]

I guess I can do

[value for colname in interesting_columns
 for value in df.loc[df[colname].notnull(),colname]]

but I was wondering if there is some non-iterative magic trick.

Upvotes: 2

Views: 3336

Answers (1)

akuiper

Reputation: 215037

You can stack the selected columns into long format and retrieve the data with the .values accessor. With its default dropna=True, stack() drops missing values:

df[['a', 'c']].T.stack().values
# array(['A', 'B', 'E', 'F'], dtype=object)

Or if you want a list:

df[['a', 'c']].T.stack().tolist()
# ['A', 'B', 'E', 'F']

The transpose (.T) is needed so that the values come out column by column, which is the order you requested.
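
If you would rather not depend on stack()'s dropna default, here is a minimal alternative sketch that drops the missing values explicitly. It uses melt() instead of the transpose-and-stack approach above, so treat it as a variation rather than the same method; the interesting_columns name is taken from the question.

```python
import pandas as pd

df = pd.DataFrame({"a": ["A", None, "B"], "b": [None, "C", "D"], "c": ["E", "F", None]})
interesting_columns = ["a", "c"]

# melt() unpivots the frame column by column, so every value of "a"
# comes before every value of "c"; no transpose is needed.
# dropna() then removes the missing entries explicitly.
result = df[interesting_columns].melt()["value"].dropna().tolist()
print(result)
# ['A', 'B', 'E', 'F']
```

The order of the output follows the order of interesting_columns, so ["c", "a"] would list the values of "c" first.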

Upvotes: 3
