Reputation: 95
Please help me understand how to modify DataFrames that are stored in a dictionary.
Let's consider the simplest case: create two DataFrames and build a dict from them.
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
df1 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
df2 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
DICTOR = {}
DICTOR['d1'] = df1
DICTOR['d2'] = df2
m = DICTOR
Now I want to exclude rows from the DataFrames inside dict m, for example rows where the values in column B are zero or negative.
I tried the following code:
for name, df in m.items():
    for index, row in df.iterrows():
        if df.at[index, 'B'] < 0:
            df.drop(index, axis=0)
or:
for name, df in m.items():
    df = df[df.B > 0]
but neither version works.
I guess my problem has to do with mutable/immutable objects, but I'm not sure.
Upvotes: 4
Views: 1320
Reputation: 294586
If all of your DataFrames have consistent indices, you should keep them together in a single DataFrame with a MultiIndex:
df = pd.concat(m)
df
                      A         B         C         D
d1 2013-01-01 -0.701856  1.804441 -1.224499 -0.997452
   2013-01-02 -1.122829 -0.375963  1.476828  1.254910
   2013-01-03 -0.330781 -0.692166  1.352655 -1.296063
   2013-01-04 -0.352034  0.200128  0.411482  1.058941
   2013-01-05 -0.103345  0.119615  0.251884 -0.108792
   2013-01-06  0.690312 -1.115858 -0.271362 -0.872862
d2 2013-01-01  1.449789  0.144008 -0.445732 -0.356491
   2013-01-02  0.254142  0.102233 -0.456786  1.505599
   2013-01-03 -1.636609  0.141300 -1.458500  0.088640
   2013-01-04  0.015575  1.170128  0.229888 -0.273040
   2013-01-05  0.995011 -1.476076 -0.345353 -0.343009
   2013-01-06  0.060094  0.610622  0.192916 -1.411557
At that point you can use any of numerous filtering methods, for example:
df.query('B > 0')
                      A         B         C         D
d1 2013-01-01 -0.701856  1.804441 -1.224499 -0.997452
   2013-01-04 -0.352034  0.200128  0.411482  1.058941
   2013-01-05 -0.103345  0.119615  0.251884 -0.108792
d2 2013-01-01  1.449789  0.144008 -0.445732 -0.356491
   2013-01-02  0.254142  0.102233 -0.456786  1.505599
   2013-01-03 -1.636609  0.141300 -1.458500  0.088640
   2013-01-04  0.015575  1.170128  0.229888 -0.273040
   2013-01-06  0.060094  0.610622  0.192916 -1.411557
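If the dictionary layout is still needed after filtering, the combined frame can probably be split back by its outer index level. The following is only a sketch under a few assumptions: the names `combined`, `filtered` and `result` are made up for illustration, the seed is arbitrary, and `droplevel` requires a reasonably recent pandas.

```python
import numpy as np
import pandas as pd

# Rebuild a dict like the one in the question (seed chosen arbitrarily)
rng = np.random.default_rng(0)
dates = pd.date_range("20130101", periods=6)
m = {
    "d1": pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list("ABCD")),
    "d2": pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list("ABCD")),
}

# The dict keys become the outer level of the MultiIndex
combined = pd.concat(m)
filtered = combined.query("B > 0")

# Split back into one DataFrame per key, dropping the outer index level
result = {key: group.droplevel(0) for key, group in filtered.groupby(level=0)}
```

One caveat: a key whose rows are all filtered out will not appear in `result` at all, so code that expects every original key should handle that case.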
Upvotes: 1
Reputation: 10593
Change your loop to pass inplace=True, so that drop modifies the DataFrame itself instead of returning a new one:
for name, df in m.items():
    for index, row in df.iterrows():
        if df.at[index, 'B'] < 0:
            df.drop(index, axis=0, inplace=True)
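Dropping rows one at a time while iterating over the same DataFrame can be slow and fragile. A possible alternative, sketched here on small made-up data, is to collect all offending index labels with a boolean mask and drop them in a single call. The condition `<= 0` is an assumption based on the question's wording ("zero or negative"); the loop above uses `< 0`.

```python
import pandas as pd

# Toy data for illustration only, not the data from the question
m = {
    "d1": pd.DataFrame({"B": [1.0, -2.0, 0.0, 3.0]}),
    "d2": pd.DataFrame({"B": [-1.0, 4.0]}),
}

for name, df in m.items():
    # Select the labels of all rows to remove, then drop them in one call.
    # inplace=True mutates the object the dict already refers to.
    df.drop(df.index[df["B"] <= 0], inplace=True)

print(m["d1"]["B"].tolist())  # [1.0, 3.0]
print(m["d2"]["B"].tolist())  # [4.0]
```

Because the DataFrame objects themselves are modified, the dictionary needs no reassignment here.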
Upvotes: 1
Reputation: 164843
You need to assign values to dictionary keys as you iterate:
for name, df in m.items():
    m[name] = df[df['B'] > 0]
Otherwise, you only rebind the loop variable df on each iteration, and the filtered result is never stored anywhere.
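To make the difference concrete, here is a small sketch on made-up data. It first shows that rebinding `df` inside the loop leaves the dictionary untouched, then builds a new dictionary with a dict comprehension, which is an equivalent way of storing the filtered frames. The variable `unchanged_len` is introduced only for this illustration.

```python
import pandas as pd

# Toy data for illustration only
m = {"d1": pd.DataFrame({"B": [1.0, -2.0, 3.0]})}

for name, df in m.items():
    # Rebinds the local name df; the dict still refers to the original frame
    df = df[df["B"] > 0]

unchanged_len = len(m["d1"])
print(unchanged_len)  # 3

# Store the filtered frames by building a new dict
m = {name: df[df["B"] > 0] for name, df in m.items()}
print(len(m["d1"]))  # 2
```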
Upvotes: 2