Reputation: 251
I have a CSV file whose values I want to convert to integers, but I am not sure how to do it. I have looked at other posts, but the approaches there don't seem to work for me.
Here is my csv:
X1,X2,X3,X4,X5,X6,X7,X8,X9,PosNeg
x,x,x,x,o,o,x,o,o,positive
x,x,x,x,o,o,o,x,o,positive
x,x,x,x,o,o,o,o,x,positive
x,x,x,x,o,o,o,b,b,positive
x,x,x,x,o,o,b,o,b,positive
x,x,x,x,o,o,b,b,o,positive
x,x,x,x,o,b,o,o,b,positive
x,x,x,x,o,b,o,b,o,positive
x,x,x,x,o,b,b,o,o,positive
x,x,x,x,b,o,o,o,b,positive
x,x,x,x,b,o,o,b,o,positive
x,x,x,x,b,o,b,o,o,positive
x,x,x,o,x,o,x,o,o,positive
x,x,x,o,x,o,o,x,o,positive
x,x,x,o,x,o,o,o,x,positive
x,x,x,o,x,o,o,b,b,positive
x,x,x,o,x,o,b,o,b,positive
x,x,x,o,x,o,b,b,o,positive
x,x,x,o,x,b,o,o,b,positive
x,x,x,o,x,b,o,b,o,positive
x,x,x,o,x,b,b,o,o,positive
I would like to transform it into something like this:
1,1,1,1,1,1,0,0,0,1
Thank you.
Upvotes: 1
Views: 143
Reputation: 21572
The pandas replace function accepts a dictionary that maps from one value to another. However, since you are starting with a DataFrame of strings and you want integers, you should also convert the columns to a numeric datatype with the method astype.
import numpy as np

df.replace({'x': 1,
            'o': 0,
            'b': -1,
            'positive': 2,
            'negative': -2}).astype(np.int16)
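As a rough end-to-end sketch of the above, here is the same mapping applied to a few rows copied from the question's CSV. The `io.StringIO` buffer stands in for the real file, and `dtype=object` is an assumption added so the columns are read as plain Python strings regardless of pandas version; neither is part of the original answer.

```python
import io

import numpy as np
import pandas as pd

# Three data rows taken from the CSV in the question (stand-in for the real file).
csv_text = """X1,X2,X3,X4,X5,X6,X7,X8,X9,PosNeg
x,x,x,x,o,o,x,o,o,positive
x,x,x,x,o,o,o,x,o,positive
x,x,x,x,o,o,o,b,b,positive
"""

# Assumption: dtype=object keeps every column as ordinary Python strings.
df = pd.read_csv(io.StringIO(csv_text), dtype=object)

# Same mapping as in the answer: every string that occurs gets an integer code.
mapping = {"x": 1, "o": 0, "b": -1, "positive": 2, "negative": -2}

# replace() swaps the values; astype() then forces a real integer dtype.
encoded = df.replace(mapping).astype(np.int16)

print(encoded.iloc[0].tolist())  # [1, 1, 1, 1, 0, 0, 1, 0, 0, 2]
```

If any string is missing from the mapping, astype should fail rather than silently keep the string, which is a useful check that the dictionary covers everything in the file.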
Upvotes: 0
Reputation: 11
replace
For the whole DataFrame with chained replace
df = df.replace('x', 1).replace('o', 0)
Or you can pass a dict to replace
df.replace({'x': 1, 'o': 0})
apply
For a single column
df['X1'].apply(lambda l: 0 if l == 'o' else 1)
applymap
For all columns (if the results should be binary [0 or 1])
df.applymap(lambda l: 0 if l == 'o' else 1)
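One caveat worth sketching: in recent pandas releases (2.1 and later) DataFrame.applymap is deprecated in favour of DataFrame.map. The snippet below picks whichever is available; the tiny two-column frame is made up for illustration and is not the asker's data. Note that this lambda also turns 'b' into 1, since anything that is not 'o' maps to 1.

```python
import pandas as pd

# Hypothetical miniature frame using the same cell values as the question.
df = pd.DataFrame({"X1": ["x", "o", "b"], "X2": ["o", "x", "x"]})

# DataFrame.map exists from pandas 2.1 on; older versions only have applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

# Element-wise binary encoding: 'o' becomes 0, everything else becomes 1.
binary = elementwise(lambda cell: 0 if cell == "o" else 1)

print(binary.values.tolist())  # [[1, 0], [0, 1], [1, 1]]
```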
Upvotes: 1
Reputation: 226
You can use the map function with a simple lambda function. A lambda body has to be a single expression, so use a conditional expression instead of if/return statements:
df['X1'].map(lambda x: 0 if x == 'o' else 1 if x == 'x' else None)
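Series.map also accepts a dictionary directly, which may read more simply than a lambda. A minimal sketch, with a made-up three-row frame: any value that has no key in the dictionary (here 'b') comes back as NaN, so it is easy to spot cells the mapping missed.

```python
import pandas as pd

# Hypothetical miniature frame; 'b' is deliberately left out of the mapping.
df = pd.DataFrame({"X1": ["x", "o", "b"]})

codes = {"x": 1, "o": 0}

# Values without a matching key become NaN instead of raising an error.
mapped = df["X1"].map(codes)

print(mapped.isna().tolist())  # [False, False, True]
```

Because of the NaN, the resulting column is floating point; extend the dictionary to cover every value before casting to an integer type.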
Upvotes: 1
Reputation: 922
For this you can use the replace() method of your pandas DataFrame, first replacing all "x" values with 1 and then all "o" values with 0:
>>> df = pd.read_csv(r"<PATH>")
>>> df
   1  2  3
0  x  x  o
1  x  x  o
2  o  o  x
3  x  o  x
>>> df = df.replace("x", 1)
>>> df
   1  2  3
0  1  1  o
1  1  1  o
2  o  o  1
3  1  o  1
>>> df = df.replace("o", 0)
>>> df
   1  2  3
0  1  1  0
1  1  1  0
2  0  0  1
3  1  0  1
Upvotes: 1