Reputation: 47
I am trying to do a kind of reverse transpose (wide to long): each ID (ISIN) should be repeated once per period, a new 'Period' column should identify the time period, and the three value columns (Std3y, Std5y, Std8y) should collapse into a single 'Std' column. How do I get from dfs to dfs2 in Python?
import pandas as pd

dfs = pd.DataFrame({
    'ISIN': ['A', 'B', 'C'],
    'Std3y': [0.10, 0.11, 0.15],
    'Std5y': [0.14, 0.10, 0.18],
    'Std8y': [0.17, 0.19, 0.11],
})
dfs
dfs2 = pd.DataFrame({
    'ISIN': ['A', 'A', 'A',
             'B', 'B', 'B',
             'C', 'C', 'C'],
    'Period': ['3y', '5y', '8y',
               '3y', '5y', '8y',
               '3y', '5y', '8y'],
    'Std': [0.10, 0.14, 0.17,
            0.11, 0.10, 0.19,
            0.15, 0.18, 0.11],
})
dfs2
Upvotes: 1
Views: 64
Reputation: 862511
Use set_index with unstack, followed by some data cleaning with swaplevel, sort_index and reset_index:
df = dfs.set_index('ISIN')
df.columns = df.columns.str[3:]
df = (df.unstack()
        .swaplevel(0, 1)
        .sort_index()
        .rename_axis(['ISIN', 'Period'])
        .reset_index(name='Std'))
print(df)
  ISIN Period   Std
0    A     3y  0.10
1    A     5y  0.14
2    A     8y  0.17
3    B     3y  0.11
4    B     5y  0.10
5    B     8y  0.19
6    C     3y  0.15
7    C     5y  0.18
8    C     8y  0.11
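A possible alternative, not part of the answer above, is pd.wide_to_long, which is meant for columns that share a stub name ('Std' here) followed by a suffix. This is only a sketch: the regex passed to suffix assumes every value column looks like Std<digits>y, and the variable name long_df is just illustrative.

```python
import pandas as pd

dfs = pd.DataFrame({
    'ISIN': ['A', 'B', 'C'],
    'Std3y': [0.10, 0.11, 0.15],
    'Std5y': [0.14, 0.10, 0.18],
    'Std8y': [0.17, 0.19, 0.11],
})

# 'Std' is the shared column prefix; whatever follows it ('3y', '5y', '8y')
# becomes the value of the new 'Period' column.
# Assumption: every value column matches Std<digits>y.
long_df = (pd.wide_to_long(dfs, stubnames='Std', i='ISIN', j='Period',
                           sep='', suffix=r'\d+y')
             .sort_index()      # order rows by ISIN, then Period
             .reset_index())    # turn the (ISIN, Period) index back into columns
print(long_df)
```

Because the suffix is not purely numeric, 'Period' should stay a string column rather than being converted to integers.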
Upvotes: 0
Reputation: 164623
You can use pd.melt to "unpivot" your dataframe and then use string slicing:
res = pd.melt(dfs, id_vars='ISIN', value_vars=dfs.columns[1:].tolist())
res['variable'] = res['variable'].str[3:]
print(res)
  ISIN variable  value
0    A       3y   0.10
1    B       3y   0.11
2    C       3y   0.15
3    A       5y   0.14
4    B       5y   0.10
5    C       5y   0.18
6    A       8y   0.17
7    B       8y   0.19
8    C       8y   0.11
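The melt output above is ordered by period and uses the default column names variable and value, so it does not quite match dfs2. A minimal sketch of one way to close that gap, assuming you want the exact layout from the question: pass var_name and value_name to melt, then sort by ISIN and Period.

```python
import pandas as pd

dfs = pd.DataFrame({
    'ISIN': ['A', 'B', 'C'],
    'Std3y': [0.10, 0.11, 0.15],
    'Std5y': [0.14, 0.10, 0.18],
    'Std8y': [0.17, 0.19, 0.11],
})

# Name the melted columns directly instead of keeping 'variable'/'value'.
res = dfs.melt(id_vars='ISIN', var_name='Period', value_name='Std')

# Drop the 'Std' prefix so only '3y', '5y', '8y' remain.
res['Period'] = res['Period'].str[3:]

# Group the rows by ISIN, then Period, and rebuild a clean 0..n-1 index.
res = res.sort_values(['ISIN', 'Period']).reset_index(drop=True)
print(res)
```

Note that Period is sorted as a string here, which works for '3y', '5y', '8y' but would put '10y' before '3y'; a numeric sort key would be needed in that case.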
Upvotes: 1