kitkat007

Reputation: 1

Merging datasets into one, or are two separate DataFrames better?

I'd like to connect each account with its clients and transactions. An account can have more than one client.

However, the transaction df links only to the account df, not to the client df.

Since there is no way to know which of the clients on an account made a given transaction, is there even a way to solve this problem with a single dataset?

I'm thinking:

DATA:
(PKDD'99 Challenge; description at the bottom left here: Financial data description.)

ERD of the data

df.head() for these datasets:

ACCOUNT df:
   account_id  district_id           frequency    date
0         576           55    POPLATEK MESICNE  930101
1        3818           74    POPLATEK MESICNE  930101
2         704           55    POPLATEK MESICNE  930101
3        2378           16    POPLATEK MESICNE  930101
4        2632           24    POPLATEK MESICNE  930102

DISPONENT df:
   disp_id  client_id  account_id       type
0        1          1           1      OWNER
1        2          2           2      OWNER
2        3          3           2  DISPONENT
3        4          4           3      OWNER
4        5          5           3  DISPONENT

CLIENT df:
   client_id  birth_number  district_id
0          1        706213           18
1          2        450204            1
2          3        406009            1
3          4        561201            5
4          5        605703            5

TRANSACTION df:
   trans_id  account_id    date    type operation  amount  balance 
0    695247        2378  930101  PRIJEM     VKLAD   700.0    700.0    
1    171812         576  930101  PRIJEM     VKLAD   900.0    900.0  
2    207264         704  930101  PRIJEM     VKLAD  1000.0   1000.0    
3   1117247        3818  930101  PRIJEM     VKLAD   600.0    600.0   
4    579373        1972  930102  PRIJEM     VKLAD   400.0    400.0 
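
One possible way to frame the problem, as a minimal sketch rather than a definitive answer: keep an account-to-client link table and the transaction table as two separate frames, and only attach a client to a transaction under an explicit assumption. The frames below are tiny made-up stand-ins that mimic the column layout above (the account and transaction rows are hypothetical), and the variable names `disp`, `account_clients` and `trans_with_owner` are illustrative. The sketch assumes every account has exactly one row of type OWNER, which should be checked against the real data.

```python
import pandas as pd

# Toy frames mimicking the column layout above; account/trans rows are made up.
account = pd.DataFrame({
    "account_id": [1, 2],
    "district_id": [18, 1],
    "frequency": ["POPLATEK MESICNE", "POPLATEK MESICNE"],
    "date": [950324, 930226],
})
disp = pd.DataFrame({
    "disp_id": [1, 2, 3],
    "client_id": [1, 2, 3],
    "account_id": [1, 2, 2],
    "type": ["OWNER", "OWNER", "DISPONENT"],
})
client = pd.DataFrame({
    "client_id": [1, 2, 3],
    "birth_number": [706213, 450204, 406009],
    "district_id": [18, 1, 1],
})
trans = pd.DataFrame({
    "trans_id": [10, 11, 12],
    "account_id": [1, 2, 2],
    "amount": [700.0, 900.0, 1000.0],
})

# Link table: one row per (account, client) pair.
# district_id exists in both account and client, so suffixes keep them apart.
account_clients = (
    account.merge(disp, on="account_id", how="left")
           .merge(client, on="client_id", how="left",
                  suffixes=("_account", "_client"))
)

# Caution: joining transactions to ALL clients of an account repeats each
# transaction once per client (here 5 rows for 3 transactions), so sums of
# `amount` over this frame would be inflated.
naive = trans.merge(disp, on="account_id")

# Assumption: attribute every transaction to the account OWNER.
# validate="many_to_one" raises if an account has more than one owner row.
owners = account_clients.loc[account_clients["type"] == "OWNER",
                             ["account_id", "client_id"]]
trans_with_owner = trans.merge(owners, on="account_id",
                               how="left", validate="many_to_one")
```

With this split, account-level questions use `trans` joined on `account_id`, client-level questions use `account_clients`, and `trans_with_owner` is only as valid as the owner assumption.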

Answers (0)