BanditoKubito
BanditoKubito

Reputation: 101

Automatic creation of a DataFrames in Pandas

I am trying to create an accounting tool for myself. I have DataFrame that look like this:

d = {'a': [1000, 2000, 3000], 'x': [999, 888, 555], 'y': [555, 999, 888]}
df = pd.DataFrame(data=d)

     a       x       y
0   1000    999     555
1   2000    888     999
2   3000    555     888

Where x and y is corresponding number of account (i.e. 999 means bank account, etc.) and a is the price in dollars. And I would like to create new DataFrame for each account, that contains value from column a in a corresponding row instead of the account number.

For 999 it would look like this

    x    y
0  1000  0
1   0   2000

For 555 it would look like this

    x    y
0   0  1000
1  3000  0

and so on.

I performed this code for the first account, and it works, but it seems too complicated.

df2 = df.copy(deep=True)
df2 = df[(df2.x == 999) | (df2.y == 999)]
def fx(p):
    if p == 999:
        return 1
    else:
        return 0
df2.x = df2.x.apply(fx)
df2.y = df2.y.apply(fx)
df2.x = df2.x.replace(1, df2.a)
df2.y = df2.y.replace(1, df2.a)
del df2['a']

Is there a way to simplify it and perform this action for every account? I don't want to copy the code and paste the code for every account.

Thank you in advance, I am stuck with this for a couple of days now.

I am using python 2.7.12 on Ubuntu 16.04.4 Xenial

Upvotes: 1

Views: 1466

Answers (1)

jezrael
jezrael

Reputation: 863166

You can create dictionary of DataFrames with keys by unique values of x and y columns:

#convert columns to numpy array
arr = df[['x','y']].values
a = df['a'].values

#empty dictionary
dfs = {}
#loop by all unique values
for i in np.unique(arr.ravel()):
    #create 2d boolean mask
    mask = (arr == i)
    #convert mask to integers - Trues are 1 anf False are 0 and multiple by a 
    out = a[:, None] * mask.astype(int)
    #filter out only 0 rows and create DataFrame
    df = pd.DataFrame(out[mask.any(axis=1)], columns=['x','y'])
    #print (df)
    #add df to dict
    dfs[i] = df

Select by lookup:

print (dfs[999])   
      x     y
0  1000     0
1     0  2000

print (dfs[555])
      x     y
0     0  1000
1  3000     0

Upvotes: 2

Related Questions