Polar
Polar

Reputation: 147

Pandas columns by given value in last row

Below my dataframe "df" made of 34 columns (pairs of stocks) and 530 rows (their respective cumulative returns). 'Date' is the index

enter image description here

Now, my target is to consider last row (Date=3 Febraury 2021). I want to plot ONLY those columns (pair stocks) that have a positive return on last Date.

I started with:

n=list()
for i in range(len(df.columns)):
    if df.iloc[-1,i] >0:
        n.append(i)

Output: [3, 11, 12, 22, 23, 25, 27, 28, 30]

Now, final step is to create a subset dataframe of 'df' containing only columns belonging to those numbers in this list. This is where I have problems. Have you any idea? Thanks

Upvotes: 0

Views: 410

Answers (3)

Pygirl
Pygirl

Reputation: 13349

sample df:

             a    b      c
date            
2017-04-01  0.5   -0.7  -0.6
2017-04-02  1.0   1.0    1.3

df1.loc[df1.index.astype(str) == '2017-04-02', df1.ge(1.2).any()]

            c
date    
2017-04-02  1.3

the logic will be same for your case also.

Upvotes: 1

Enrico Gandini
Enrico Gandini

Reputation: 1015

If I understand correctly, you want columns with IDs [3, 11, 12, 22, 23, 25, 27, 28, 30], am I right? You should use DataFrame.iloc:

column_ids = [3, 11, 12, 22, 23, 25, 27, 28, 30]
df_subset = df.iloc[:, column_ids].copy()

The ":" on the left side of df.iloc means "all rows". I suggest using copy method in case you want to perform additional operations on df_subset without the risk to affect the original df, or raising Warnings.

If instead of a list of column IDs, you have a list of column names, you should just replace .iloc with .loc.

Upvotes: 0

KaRaOkEy
KaRaOkEy

Reputation: 328

Does this solve your problem?

n = []
for i, col in enumerate(df.columns):
    if df.iloc[-1,i] > 0:
        n.append(col)
        
df[n]

Here you are ;)

Upvotes: 1

Related Questions