Rafiq Shaikh
Rafiq Shaikh

Reputation: 113

Merge Dataframe alonside and rename column

c:/somepath contains below files

file1 file2 file3

from "c:/somepath/", I am capturing all the file names in a list called users

users=[d for d in os.listdir("c:/somepath/") if os.path.isdir(os.path.join("c:/somepath/", d))]

Note: the number of files in the directory can differ, but the data in the file remains in the same format.

file1 contains below data

index   user name   %used
 1          a        25
 2          a        40
 3          a        20
 4          a        25

file2 contains below data

index   user name   %used
 1          b        33
 2          b        66
 3          b        20
 4          b        88

file3 contains below data

index   user name   %used
 1          c        25
 2          c        69
 3          c        20
 4          c        73

I need the result as shown below

index     a     b    c
  1       25    33   25
  2       40    66   69
  3       20    20   20
  4       25    88   73

if you see the result, the %used column was renamed to respective "user name"

Could someone please tell me the easiest way to achieve this result.

Upvotes: 0

Views: 103

Answers (2)

piRSquared
piRSquared

Reputation: 294488

IIUC:

pd.concat(pd.read_fwf(f, index_col=[0, 1]).squeeze() for f in users).unstack()

MCVE

I dropped those files into my directory

print(*(p.read_text() for p in Path('.').glob('file*')), sep='\n\n')

index   user name   %used
 1          a        25
 2          a        40
 3          a        20
 4          a        25

index   user name   %used
 1          b        33
 2          b        66
 3          b        20
 4          b        88

index   user name   %used
 1          c        25
 2          c        69
 3          c        20
 4          c        73

Then I ran the above code

from pathlib import Path

pd.concat(
    pd.read_fwf(f, index_col=[0, 1]).squeeze() for f in Path('.').glob('file*')
).unstack()

user name   a   b   c
index                
1          25  33  25
2          40  66  69
3          20  20  20
4          25  88  73

Explanation

I'm using pandas read_fwf to read a fixed width file. I set the first two columns as indices and squeeze the result. This produces a Series. Then I use pd.concat to put them all on top of one another. After that I use unstack to unwind the second level of the index (which is user) into the columns.

Upvotes: 1

harpan
harpan

Reputation: 8631

You need:

pd.concat([df1.set_index(['index', 'user_name']), df2.set_index(['index', 'user_name']), df3.set_index(['index', 'user_name'])], axis=1).unstack().dropna(1)

Upvotes: 0

Related Questions