Reputation: 39
I have text file data.txt like this:
Name male female bayu 1 0 jonson 1 0 anna 0 1
i have try to convert this file using pandas python
import pandas as pd
df = pd.read_fwf('data.txt')
df.to_csv('data.csv')
I want to get results data.csv like this:
Gender name bayu male jonson male Anna Female
what should i do?
Upvotes: 1
Views: 63
Reputation: 8826
Simply can be done with replace and rename ..
>>> df
Name female male
0 bayu 0 1
1 jason 0 1
2 anna 1 0
>>> df.replace({ 0: "male", 1: "Female"}).drop(columns={'male'}).rename(columns={'female': 'Gender'})
Name Gender
0 bayu male
1 jason male
2 anna Female
saving dataframe to csv..
>>> df = df.replace({ 0: "male", 1: "Female"}).drop(columns {'male'}).rename(columns={'female': 'Gender'})
>>> df.to_csv('data.csv', index=False)
Your CSV will be looks like below with comma delimited ..
$ cat data.csv
Name,Gender
bayu,male
jason,male
anna,Female
Or, Just with idxmax
and df.assign
to assign back the values to nw columns altogether.
>>> df.assign(Gender = df[['male','female']].idxmax(axis=1)).drop(columns={'male', 'female'})
Name Gender
0 bayu male
1 jason male
2 anna female
2) Another way around creating a function and then using apply
method.
def setval(col):
col = str(col)
if col.startswith('0'):
return 'Male'
if col.startswith('1'):
return 'Female'
>>> df.assign(Gender = df[ 'female'].apply(setval)).drop({"female", "male"}, axis=1)
Name Gender
0 bayu Male
1 jason Male
2 anna Female
You can create a dict and then map the dict against one of the column assign the new values to a new column with df.assign
further dropping the un-desired columns.
Here, yo can use either map
or replace
method at your ease ..
>>> newVal = {0:'Male', 1: 'Female'} # dictionary paired with key, values
a) in the below example axis=1
represents the column which is newer method.
>>> df.assign(Gender=df.female.map(newVal)).drop({'male', 'female'},axis=1)
Name Gender
0 bayu Male
1 jason Male
2 anna Female
b) while in the below example we are explicitly mentioning it to be applied on the column which is old method.
>>> df.assign(Gender=df.female.map(newVal)).drop(columns={'male', 'female'})
Name Gender
0 bayu Male
1 jason Male
2 anna Female
With .replace
>>> df.assign(Gender=df.female.replace(newVal)).drop({'male', 'female'},axis=1)
Name Gender
0 bayu Male
1 jason Male
2 anna Female
Upvotes: 1
Reputation: 12417
You can use idxmax()
before to save to csv
:
df['Gender'] = df[['male', 'female']].idxmax(axis=1)
df = df.drop(columns=['male', 'female'])
Upvotes: 1
Reputation: 3910
import numpy as np
df['Gender'] = np.where(df['male'].eq(1), 'male', 'female')
df = df.drop(columns={'male', 'female'})
df.to_csv('data.csv', index=False)
Name Gender
0 bayu male
1 jonson male
2 anna female
Upvotes: 1