JPWilson

Reputation: 759

How can I drop a column if the last row is NaN?

I have found examples of how to remove a column when all of its values, or more than some threshold of them, are NaN, but I have not been able to find a solution to my particular problem: dropping a column if its last row is NaN. I'm using time series data in which collection doesn't start at the same time for every column. That is fine, but using one of the previous solutions would remove 95% of the dataset. I do not, however, want columns whose most recent row is NaN, as that means the series is defunct.

A B C
nan t x 
1 2 3
x y z
4 nan 6

Returns

A C
nan x
1 3
x z
4 6

Upvotes: 9

Views: 2380

Answers (5)

Knl_Kolhe

Reputation: 199

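import pandas as pd

for i in reversed(range(temp_df.shape[1])):  # go right to left so positions stay valid after a drop
    if pd.isna(temp_df.iloc[-1, i]):  # NaN never equals the string 'nan', so test with pd.isna
        temp_df = temp_df.drop(columns=temp_df.columns[i])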

This will work for you. It loops over all columns, checks whether the last entry is NaN, and drops the column if it is. temp_df.shape[1] is the number of columns.

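DataFrame.drop takes the label of the column to drop; axis=1 (or the columns= keyword) tells pandas to drop a column rather than a row. Recent pandas versions no longer accept the axis as a second positional argument, as in drop(i, 1).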

EDIT: Having read the other answers on this post, I think notna would be the best choice (I would use it), but the advantage of this method is that you can compare against anything you wish. Another option is isnull(), a pandas function, which works like this:

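# assumes `import pandas as pd`; a scalar has no .isnull() method, so call pd.isnull on it
for i in reversed(range(temp_df.shape[1])):
    if pd.isnull(temp_df.iloc[-1, i]):
        temp_df = temp_df.drop(columns=temp_df.columns[i])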
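
A note on the comparison itself: a missing value in pandas is normally the float NaN, not the string 'nan', and NaN never compares equal to anything, including itself. A minimal sketch, assuming the usual np/pd aliases (the variable name last_value is just for illustration):

```python
import numpy as np
import pandas as pd

last_value = np.nan  # what a missing cell usually holds

# Equality checks against NaN are always False, so they cannot detect it.
equals_string = last_value == "nan"
equals_itself = last_value == last_value

# pd.isna (alias: pd.isnull) is the reliable test for a scalar.
detected = pd.isna(last_value)

print(equals_string, equals_itself, detected)
```

If the data really does contain the literal string 'nan', a string comparison is still the right tool for that case; pd.isna only catches genuine missing values.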

Upvotes: 1

Nuno B. Brandao

Reputation: 76

You can use .iloc, .loc and .notna() to sort out your problem.

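import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 1, "x", 4],
                   "B": ["t", 2, "y", np.nan],
                   "C": ["x", 3, "z", 6]})

# Keep only the columns whose last row is not NaN.
df = df.loc[:, df.iloc[-1, :].notna()]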
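
To check that this does what the question asks, here is the same approach wrapped in a small helper and run on the sample data. The function name drop_if_last_nan is made up for this sketch; it is not part of pandas.

```python
import numpy as np
import pandas as pd


def drop_if_last_nan(frame: pd.DataFrame) -> pd.DataFrame:
    """Keep only the columns whose final row is not NaN."""
    keep = frame.iloc[-1].notna()  # boolean Series indexed by column label
    return frame.loc[:, keep]


df = pd.DataFrame({"A": [np.nan, 1, "x", 4],
                   "B": ["t", 2, "y", np.nan],
                   "C": ["x", 3, "z", 6]})

result = drop_if_last_nan(df)
print(list(result.columns))  # ['A', 'C']
```

Column A survives even though its first row is NaN, because only the last row is inspected. An empty frame would raise on iloc[-1], so guard for that if it can happen in your data.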

Upvotes: 3

Agaz Wani

Reputation: 5684

You can also do something like this

df.loc[:, ~df.iloc[-1].isna()]
    A   C
0   NaN x
1   1   3
2   x   z
3   4   6

Upvotes: 5

BENY

Reputation: 323306

Try with dropna

df = df.dropna(axis=1, subset=[df.index[-1]], how='any')
Out[8]: 
     A  C
0  NaN  x
1    1  3
2    x  z
3    4  6
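
With axis=1, subset takes row labels rather than column labels, which is why df.index[-1] is passed. A small sketch on a time-indexed frame, closer to the question's use case; the dates and column names below are invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical daily series: one starts late, one has stopped reporting.
idx = pd.date_range("2021-01-01", periods=4, freq="D")
ts = pd.DataFrame({"started_late": [np.nan, np.nan, 1.0, 2.0],
                   "defunct":      [1.0, 2.0, 3.0, np.nan],
                   "complete":     [1.0, 2.0, 3.0, 4.0]},
                  index=idx)

# Only the last row is inspected, so early gaps do not cause a drop.
active = ts.dropna(axis=1, subset=[ts.index[-1]])
print(list(active.columns))  # ['started_late', 'complete']
```

Passing more row labels in subset (for example the last few index values) would drop a column if any of those rows is NaN, since how defaults to 'any'.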

Upvotes: 4

Michael Szczesny

Reputation: 5036

You can use a boolean Series to select the columns to drop:

df.drop(df.loc[:,df.iloc[-1].isna()], axis=1)

Out:

     A  C
0  NaN  x
1    1  3
2    x  z
3    4  6

Upvotes: 2
