SerKo

Reputation: 95

Change dataframes in dict

Please help me understand how to modify DataFrames stored in a dictionary.

Let's consider the simplest case: create two DataFrames and build a dict from them.

import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
df1 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
df2 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
DICTOR = {}
DICTOR['d1'] = df1
DICTOR['d2'] = df2
m = DICTOR

Now I want to exclude rows from the DataFrames inside dict m, for example rows where the value in column B is zero or negative.

I tried the following code:

for name,df in m.items():
     for index, row in df.iterrows():
         if df.at[index,'B']<0:
             df.drop(index,axis=0)

or:

for name,df in m.items():
    df=df[df.B>0]

but neither attempt works.

I guess my problem is related to mutable/immutable objects, but I'm not sure.

Upvotes: 4

Views: 1320

Answers (3)

piRSquared

Reputation: 294586

If all of your dataframes have consistent indices, you should keep them together in a single dataframe with a MultiIndex:

df = pd.concat(m)

df

                      A         B         C         D
d1 2013-01-01 -0.701856  1.804441 -1.224499 -0.997452
   2013-01-02 -1.122829 -0.375963  1.476828  1.254910
   2013-01-03 -0.330781 -0.692166  1.352655 -1.296063
   2013-01-04 -0.352034  0.200128  0.411482  1.058941
   2013-01-05 -0.103345  0.119615  0.251884 -0.108792
   2013-01-06  0.690312 -1.115858 -0.271362 -0.872862
d2 2013-01-01  1.449789  0.144008 -0.445732 -0.356491
   2013-01-02  0.254142  0.102233 -0.456786  1.505599
   2013-01-03 -1.636609  0.141300 -1.458500  0.088640
   2013-01-04  0.015575  1.170128  0.229888 -0.273040
   2013-01-05  0.995011 -1.476076 -0.345353 -0.343009
   2013-01-06  0.060094  0.610622  0.192916 -1.411557

At that point you can use any of the usual filtering methods, for example:

df.query('B > 0')

                      A         B         C         D
d1 2013-01-01 -0.701856  1.804441 -1.224499 -0.997452
   2013-01-04 -0.352034  0.200128  0.411482  1.058941
   2013-01-05 -0.103345  0.119615  0.251884 -0.108792
d2 2013-01-01  1.449789  0.144008 -0.445732 -0.356491
   2013-01-02  0.254142  0.102233 -0.456786  1.505599
   2013-01-03 -1.636609  0.141300 -1.458500  0.088640
   2013-01-04  0.015575  1.170128  0.229888 -0.273040
   2013-01-06  0.060094  0.610622  0.192916 -1.411557
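If the result is still needed as a dict of separate dataframes, the filtered MultiIndex frame can be split back apart. The following is only a sketch under a few assumptions: the names stacked, filtered and m_filtered are made up for illustration, and the data is generated with a seeded generator so the example is repeatable. Note that a key whose rows are all filtered out will not appear in the rebuilt dict.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range('20130101', periods=6)
m = {
    'd1': pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list('ABCD')),
    'd2': pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list('ABCD')),
}

# Stack the frames; the dict keys become the outer level of the MultiIndex.
stacked = pd.concat(m)
filtered = stacked.query('B > 0')

# Split back into a dict, dropping the outer level so that each frame
# is indexed by date again.
m_filtered = {key: part.droplevel(0) for key, part in filtered.groupby(level=0)}
```

Whether to keep the stacked frame or go back to a dict depends on what happens next; the stacked form tends to be more convenient when the same operation is applied to every frame.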

Upvotes: 1

ignoring_gravity

Reputation: 10593

Change your loop to this:

for name,df in m.items():
     for index, row in df.iterrows():
         if df.at[index,'B']<0:
             df.drop(index,axis=0, inplace=True)
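The key point is that drop returns a new dataframe unless inplace=True is passed, so the original loop discarded its result. A possible variant of the same idea, sketched here with small made-up data and a hypothetical name unwanted, collects all row labels first and drops them in a single call instead of row by row. It uses <= 0 so that zeros are removed as well, as the question asks.

```python
import pandas as pd

# Small illustrative frames; the values are invented for this example.
m = {
    'd1': pd.DataFrame({'B': [1.0, -2.0, 0.0, 3.0]}),
    'd2': pd.DataFrame({'B': [-1.0, 5.0, 0.5, 0.0]}),
}

for name, df in m.items():
    # Labels of every row where B is zero or negative.
    unwanted = df.index[(df['B'] <= 0).to_numpy()]
    # inplace=True mutates the object the dict refers to,
    # so the change is visible through m[name].
    df.drop(unwanted, inplace=True)
```

After the loop, m['d1'] and m['d2'] contain only the rows with positive B, because the dataframes were modified in place rather than replaced.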

Upvotes: 1

jpp

Reputation: 164843

You need to assign values to dictionary keys as you iterate:

for name, df in m.items():
    m[name] = df[df['B'] > 0]

Otherwise, you're just rebinding the local variable df on each iteration and never storing the filtered result anywhere, so the dictionary keeps pointing at the original dataframes.
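To make the difference concrete, here is a small sketch with invented data. The names lengths_after_rebinding and lengths_after_rebuild are only there to record the row counts; the dict comprehension is an equivalent alternative to assigning m[name] inside the loop.

```python
import pandas as pd

# Small illustrative frames; the values are invented for this example.
m = {
    'd1': pd.DataFrame({'B': [1.0, -2.0, 3.0]}),
    'd2': pd.DataFrame({'B': [-1.0, 5.0, 0.0]}),
}

# Rebinding the loop variable: df points at a new filtered frame,
# but the dict still holds the unfiltered originals.
for name, df in m.items():
    df = df[df['B'] > 0]
lengths_after_rebinding = {name: len(df) for name, df in m.items()}

# Building a new dict stores the filtered frames under the same keys.
m = {name: df[df['B'] > 0] for name, df in m.items()}
lengths_after_rebuild = {name: len(df) for name, df in m.items()}
```

Here lengths_after_rebinding still reports three rows per frame, while lengths_after_rebuild reflects the filtered frames.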

Upvotes: 2
