Reputation: 25
Could someone tell me how to convert this table to an other state?
Just like this:
frequency
event day enrollment_id
access 01 14333 2
14343 14
14391 1
14469 1
35220 9
35307 2
35442 2
61263 1
140486 1
02 14320 9
14321 4
14322 30
14324 3
14325 2
wiki 01 35517 1
76933 1
111900 3
02 47872 1
47875 1
148426 1
03 60416 1
96487 2
96502 2
96505 1
96513 2
2014-07-24 200888 1
Upvotes: 1
Views: 44
Reputation: 862471
I think if need change second level of MultiIndex
to days:
a = df.index.get_level_values(0)
b = df.index.get_level_values(1).day
c = df.index.get_level_values(2)
df.index = pd.MultiIndex.from_arrays([a,b,c], names=['event', 'day', 'enrollment_id'])
But if need count second level for b
use:
b = df.groupby(level=1).cumcount().add(1).astype(str).str.zfill(2)
If need create MultiIndex DataFrame
:
df = pd.read_csv(file).set_index(['event', 'day', 'enrollment_id'])
Upvotes: 1
Reputation: 210832
You can use this function.
Demo:
# first copy the file contents into clipboard
In [184]: df = read_clipboard_mi(index_names_row=1)
In [185]: df
Out[185]:
frequency
event day enrollment_id
access 01 14333 2
14343 14
14391 1
14469 1
35220 9
35307 2
35442 2
... ...
wiki 02 148426 1
03 60416 1
96487 2
96502 2
96505 1
96513 2
2014-07-24 200888 1
[26 rows x 1 columns]
In [186]: df.index
Out[186]:
MultiIndex(levels=[['access', 'wiki'], ['01', '02', '03', '2014-07-24'], [14320, 14321, 14322, 14324, 14325, 14333, 14343, 14391, 14469, 352
20, 35307, 35442, 35517, 47872, 47875, 60416, 61263, 76933, 96487, 96502, 96505, 96513, 111900, 140486, 148426, 200888]],
labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1
, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3], [5, 6, 7, 8, 9, 10, 11, 16, 23, 0, 1, 2, 3, 4, 12, 17, 22, 13, 14, 24, 15, 18, 19, 20, 21, 25]],
names=['event', 'day', 'enrollment_id'])
PS you can easily adopt this function to read from a file instead of the clipboard...
Upvotes: 2