Reputation: 65
I have 3 columns namely Models(should be taken as index), Accuracy without normalization, Accuracy with normalization (zscore, minmax, maxabs, robust) and these are required to be created as:
------------------------------------------------------------------------------------
| Models | Accuracy without normalization | Accuracy with normalization |
| | |-----------------------------------|
| | | zscore | minmax | maxabs | robust |
------------------------------------------------------------------------------------
dfmod-> Models column
dfacc-> Accuracy without normalization
dfacc1-> Accuracy with normalization - zscore
dfacc2-> Accuracy with normalization - minmax
dfacc3-> Accuracy with normalization - maxabs
dfacc4-> Accuracy with normalization - robust
dfout=pd.DataFrame({('Accuracy without Normalization'):{dfacc},
('Accuracy using Normalization','zscore'):{dfacc1},
('Accuracy using Normalization','minmax'):{dfacc2},
('Accuracy using Normalization','maxabs'):{dfacc3},
('Accuracy using Normalization','robust'):{dfacc4},
},index=dfmod
)
I was trying to do something like this but i can't figure out any further
Test data:
qda 0.6333 0.6917 0.5917 0.6417 0.5833
svm 0.5333 0.6917 0.5333 0.575 0.575
lda 0.5333 0.6583 0.5333 0.5667 0.5667
lr 0.5333 0.65 0.4917 0.5667 0.5667
dt 0.5333 0.65 0.4917 0.5667 0.5667
rc 0.5083 0.6333 0.4917 0.525 0.525
nb 0.5 0.625 0.475 0.5 0.4833
rfc 0.5 0.625 0.4417 0.4917 0.4583
knn 0.3917 0.6 0.4417 0.4833 0.45
et 0.375 0.5333 0.4333 0.4667 0.45
dc 0.375 0.5333 0.4333 0.4667 0.425
qds 0.3417 0.5333 0.4 0.4583 0.3667
lgt 0.3417 0.525 0.3917 0.45 0.3583
lt 0.2333 0.45 0.3917 0.4167 0.3417
These are values for respective subcolumns in order specified in the table above
Upvotes: 6
Views: 3379
Reputation: 6295
There's a dirty way to do this, I'll write about it till someone answers with a better idea. Here we go:
import pandas as pd
# I assume that you can read raw data named test.csv by pandas and
# set header = None cause you mentioned the Test data without any headers, so:
df = pd.read_csv("test.csv", header = None)
# Then define preferred Columns!
MyColumns = pd.MultiIndex.from_tuples([("Models" , ""),
("Accuracy without normalization" , ""),
("Accuracy with normalization" , "zscore"),
("Accuracy with normalization" , "minmax"),
("Accuracy with normalization" , "maxabs"),
("Accuracy with normalization" , "robust")])
# Create new DataFrame with specified Columns, after this you should pass values
New_DataFrame = pd.DataFrame(df , columns = MyColumns)
# a loop for passing values
for item in range(len(MyColumns)):
New_DataFrame.loc[: , MyColumns[item]] = df.iloc[: , item]
This gives me:
after all, if you want to set Models
as the index of New_DataFrame
, You can continue with:
New_DataFrame.set_index(New_DataFrame.columns[0][0] , inplace=True)
New_DataFrame
This gives me:
Upvotes: 3