paranormaldist

Reputation: 508

Extracting column from dictionary of dataframes

I have a dictionary d whose values are pandas DataFrames. For example, print(d[0]['col1']) returns col1 for key 0.

If I print d, I get something like this for key 0:

{0:                               col1       type         Weather  \
0                                    id    varchar                  
1                                    id    varchar                  
2                                    id    varchar                  
3                                    id    varchar                  
4                                    id    varchar                  
..                               ...        ...            ...   

The expected outcome is:

{0:                     col1
0                       id
1                       id
2                       id
3                       id
4                       id
..                               ...

But how can I apply this to all keys, so that my dictionary only includes col1? I've been stuck on this for a while; any guidance is appreciated.

Upvotes: 1

Views: 1413

Answers (2)

vht981230

Reputation: 4490

Have you tried using a dictionary comprehension?

import pandas as pd

df0 = pd.DataFrame({'col1': [1,2,3], 'col2': [3,4,5], 'col3': [5,6,7]})
df1 = pd.DataFrame({'col1': [7,8,9], 'col2': [10,11,12], 'col3': [13,14,15]})

d = {}
d[0] = df0
d[1] = df1


print({x:d[x]['col1'] for x in d})

If you want the output without the index, you can replace the last line with either

print({x:d[x]['col1'].values for x in d})

or

print({x:d[x]['col1'].tolist() for x in d})
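
The comprehensions above return a Series, an array, or a list per key, while the expected output in the question looks like a one-column DataFrame. A minimal sketch of that variant, assuming every DataFrame in d has a col1 column (the name only_col1 is made up for illustration):

```python
import pandas as pd

d = {
    0: pd.DataFrame({'col1': [1, 2, 3], 'col2': [3, 4, 5]}),
    1: pd.DataFrame({'col1': [7, 8, 9], 'col2': [10, 11, 12]}),
}

# Double brackets select a list of columns, so each value stays a DataFrame
# rather than becoming a Series.
only_col1 = {key: df[['col1']] for key, df in d.items()}

print(only_col1[0])
```

This builds a new dictionary and leaves d unchanged; assign the result back to d if you want to replace the original.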

Upvotes: 2

Warlax56

Reputation: 1202

I would do this:

import pandas as pd

d = {0:pd.DataFrame({'col0':[0,1,2], 'col1':[0,0,0]}),
    1:pd.DataFrame({'col0':[1,2,3], 'col1':[1,1,1]})}

def get_cols(d,cols):
    for key, value in d.items():
        d[key] = value[cols]
    return d

This gives you a Series for the chosen column:

print(get_cols(d,'col0'))

This gives you a DataFrame for the chosen column:

print(get_cols(d,['col0']))

This gives you a DataFrame for the chosen columns:

print(get_cols(d,['col0','col1']))

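Note: this strategy modifies d in place, because the function receives a reference to the same dictionary object. As a consequence, each of the three calls above needs a fresh d: after the first call the values are Series, so the list-based calls that follow would raise a KeyError. You could also return None and just use d, or deep copy d before modifying it in the function.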
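
As a rough sketch of the non-mutating alternative mentioned in the note, the function can build and return a new dictionary instead of overwriting the entries of d. The name select_cols is hypothetical, and the .copy() call is an optional precaution so the results do not share data with the original frames:

```python
import pandas as pd

def select_cols(frames, cols):
    """Return a new dict holding only `cols` from each DataFrame in `frames`."""
    # A string gives a Series per key; a list of names gives a DataFrame per key.
    return {key: df[cols].copy() for key, df in frames.items()}

d = {0: pd.DataFrame({'col0': [0, 1, 2], 'col1': [0, 0, 0]}),
     1: pd.DataFrame({'col0': [1, 2, 3], 'col1': [1, 1, 1]})}

as_series = select_cols(d, 'col1')           # dict of Series
as_frames = select_cols(d, ['col1'])         # dict of one-column DataFrames
both_cols = select_cols(d, ['col0', 'col1']) # dict of two-column DataFrames
```

Because d is never modified, the three calls can be run one after another on the same dictionary.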

Upvotes: 2
