Reputation: 113
c:/somepath contains below files
file1 file2 file3
from "c:/somepath/", I am capturing all the file names in a list called users
users=[d for d in os.listdir("c:/somepath/") if os.path.isdir(os.path.join("c:/somepath/", d))]
Note: the number of files in the directory can differ, but the data in the file remains in the same format.
file1 contains below data
index user name %used
1 a 25
2 a 40
3 a 20
4 a 25
file2 contains below data
index user name %used
1 b 33
2 b 66
3 b 20
4 b 88
file3 contains below data
index user name %used
1 c 25
2 c 69
3 c 20
4 c 73
I need the result as shown below
index a b c
1 25 33 25
2 40 66 69
3 20 20 20
4 25 88 73
if you see the result, the %used column was renamed to respective "user name"
Could someone please tell me the easiest way to achieve this result.
Upvotes: 0
Views: 103
Reputation: 294488
IIUC:
pd.concat(pd.read_fwf(f, index_col=[0, 1]).squeeze() for f in users).unstack()
I dropped those files into my directory
print(*(p.read_text() for p in Path('.').glob('file*')), sep='\n\n')
index user name %used
1 a 25
2 a 40
3 a 20
4 a 25
index user name %used
1 b 33
2 b 66
3 b 20
4 b 88
index user name %used
1 c 25
2 c 69
3 c 20
4 c 73
Then I ran the above code
from pathlib import Path
pd.concat(
pd.read_fwf(f, index_col=[0, 1]).squeeze() for f in Path('.').glob('file*')
).unstack()
user name a b c
index
1 25 33 25
2 40 66 69
3 20 20 20
4 25 88 73
I'm using pandas read_fwf
to read a fixed width file. I set the first two columns as indices and squeeze the result. This produces a Series
. Then I use pd.concat
to put them all on top of one another. After that I use unstack
to unwind the second level of the index (which is user) into the columns.
Upvotes: 1
Reputation: 8631
You need:
pd.concat([df1.set_index(['index', 'user_name']), df2.set_index(['index', 'user_name']), df3.set_index(['index', 'user_name'])], axis=1).unstack().dropna(1)
Upvotes: 0