Paul McBurney

Reputation: 251

How to change pandas dataframe strings into integers?

I have a CSV file whose values I am trying to convert to integers, but I am not sure how to do it. I have looked at other posts, but their solutions don't seem to work for me.

Here is my csv:

X1,X2,X3,X4,X5,X6,X7,X8,X9,PosNeg
x,x,x,x,o,o,x,o,o,positive
x,x,x,x,o,o,o,x,o,positive
x,x,x,x,o,o,o,o,x,positive
x,x,x,x,o,o,o,b,b,positive
x,x,x,x,o,o,b,o,b,positive
x,x,x,x,o,o,b,b,o,positive
x,x,x,x,o,b,o,o,b,positive
x,x,x,x,o,b,o,b,o,positive
x,x,x,x,o,b,b,o,o,positive
x,x,x,x,b,o,o,o,b,positive
x,x,x,x,b,o,o,b,o,positive
x,x,x,x,b,o,b,o,o,positive
x,x,x,o,x,o,x,o,o,positive
x,x,x,o,x,o,o,x,o,positive
x,x,x,o,x,o,o,o,x,positive
x,x,x,o,x,o,o,b,b,positive
x,x,x,o,x,o,b,o,b,positive
x,x,x,o,x,o,b,b,o,positive
x,x,x,o,x,b,o,o,b,positive
x,x,x,o,x,b,o,b,o,positive
x,x,x,o,x,b,b,o,o,positive

I would like to transform it into something like this:

1,1,1,1,1,1,0,0,0,1

Thank you.

Upvotes: 1

Views: 143

Answers (4)

David Nehme

Reputation: 21572

The pandas replace method accepts a dictionary that maps each old value to a new one. However, since you are starting with a DataFrame of strings and you want integers, you should also convert the columns to a numeric dtype with the astype method.

import numpy as np

df = df.replace({'x': 1,
                 'o': 0,
                 'b': -1,
                 'positive': 2,
                 'negative': -2}).astype(np.int16)
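
As a rough end-to-end sketch of the approach above, the snippet below loads two rows copied from the question's CSV from an in-memory string (the `io.StringIO` wrapper and the names `CSV_TEXT`, `mapping`, and `encoded` are assumptions made for illustration) and applies the same dictionary followed by `astype`. Depending on the pandas version, `replace` may emit a `FutureWarning` about downcasting; the explicit `astype` call is what fixes the final dtype.

```python
import io

import numpy as np
import pandas as pd

# Two rows taken from the question's CSV, read from a string instead of a file.
CSV_TEXT = """X1,X2,X3,X4,X5,X6,X7,X8,X9,PosNeg
x,x,x,x,o,o,x,o,o,positive
x,x,x,x,o,o,o,b,b,positive
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Same mapping as in the answer: every string that can appear gets an integer code.
mapping = {'x': 1, 'o': 0, 'b': -1, 'positive': 2, 'negative': -2}

# replace() swaps the strings for integers; astype() then forces an integer dtype.
encoded = df.replace(mapping).astype(np.int16)
print(encoded)
```

If a value is missing from the dictionary, it stays a string and `astype` raises an error, which can be a useful check that the mapping is complete.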

Upvotes: 0

97-108-101-120

Reputation: 11

replace

For the whole DataFrame with chained replace

df = df.replace('x', 1).replace('o', 0)

Or you can pass a dict to replace

df.replace({'x': 1, 'o': 0})

apply

For a single column

df['X1'].apply(lambda l: 0 if l == 'o' else 1)

applymap

For all columns (if the results should be binary [0 or 1]; note that every value other than 'o', including 'b', becomes 1)

df.applymap(lambda l: 0 if l == 'o' else 1)
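
One caveat worth illustrating: in pandas 2.1 and later, `DataFrame.applymap` is deprecated in favour of `DataFrame.map`. The sketch below picks whichever method is available; the small frame and the names `encode`, `elementwise`, and `binary` are made up for the example.

```python
import pandas as pd

# Hypothetical two-column frame using the same symbols as the question.
df = pd.DataFrame({"X1": ["x", "o"], "X2": ["o", "b"]})


def encode(cell):
    # Binary encoding: 'o' becomes 0, anything else (including 'b') becomes 1.
    return 0 if cell == "o" else 1


# DataFrame.map exists from pandas 2.1 onward; older versions only have applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap
binary = elementwise(encode)
print(binary)
```

Because the function treats everything that is not 'o' as 1, this only fits cases where a two-valued result is really what is wanted.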

Upvotes: 1

Philip Z.

Reputation: 226

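You can use the map function with a simple lambda function. A lambda must be a single expression, so the checks are written as a conditional expression:

# Map 'o' to 0 and 'x' to 1; any other value (such as 'b') becomes None.
df['X1'].map(lambda x: 0 if x == 'o' else 1 if x == 'x' else None)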
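
As a possible alternative to a lambda, `Series.map` also accepts a dictionary. The sketch below uses a made-up Series named `s` to show the behaviour: values found in the dictionary are translated, and values that are missing from it become NaN.

```python
import pandas as pd

# Hypothetical column containing all three symbols from the question.
s = pd.Series(["x", "o", "b"], name="X1")

# Partial mapping: 'b' has no entry, so it turns into NaN.
partial = s.map({"x": 1, "o": 0})

# Complete mapping: every symbol has an entry, so all results are integers.
# The code -1 for 'b' is an arbitrary choice for illustration.
complete = s.map({"x": 1, "o": 0, "b": -1})

print(partial)
print(complete)
```

Checking the result for NaN (for example with `isna().any()`) is a quick way to spot symbols the dictionary does not cover.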

Upvotes: 1

Max Kaha

Reputation: 922

You can use the replace() method of your pandas DataFrame to first replace all "x" values with 1 and then all "o" values with 0:

>>> df = pd.read_csv(r"<PATH>")
>>> df
   1  2  3
0  x  x  o
1  x  x  o
2  o  o  x
3  x  o  x

>>> df = df.replace("x", 1)
>>> df
   1  2  3
0  1  1  o
1  1  1  o
2  o  o  1
3  1  o  1

>>> df = df.replace("o", 0)
>>> df
   1  2  3
0  1  1  0
1  1  1  0
2  0  0  1
3  1  0  1

Pandas Documentation

Upvotes: 1
