pragmatic learner

Reputation: 457

Convert data type of multiple columns with for loop

I have a 21840x39 data frame. A few of my columns are numerically valued and I want to make sure they are all in the same data type (which I want to be a float).

Instead of naming all the columns out and converting them:

df[['A', 'B', 'C', '...']] = df[['A', 'B', 'C', '...']].astype(float)

Can I write a for loop that lets me say something like "convert to float from column 18 to column 35"?

I know how to do one column: df['A'] = df['A'].astype(float)

But how can I do multiple columns? I tried with list slicing within a loop but couldn't get it right.

Upvotes: 3

Views: 3220

Answers (2)

MDR

Reputation: 2670

I tweaked @jezrael's code, since typing in column names is (I feel) a good option.

import pandas as pd
import numpy as np

np.random.seed(2020)
df = pd.DataFrame(np.random.randint(10, size=(3, 18)),
                  columns=list('abcdefghijklmnopqr')).astype(str)

print(df)

columns = list(df.columns)

#change the first and last column names below as required
df = df.astype(dict.fromkeys(
    df.columns[columns.index('h'):(columns.index('o')+1)], float))

print (df)

I'm leaving the original answer below, but note: never loop in pandas if a vectorized alternative exists.

If I had a dataframe and wanted to change columns 'col3' to 'col5' (human readable names) to floats I could...

import pandas as pd

df = pd.read_csv('dummy_data.csv')

df


columns = list(df.columns)

#change the first and last column names below as required
start_column = columns.index('col3')
end_column   = columns.index('col5')

for index, col in enumerate(columns):
    # convert only the columns whose position lies in the chosen range (inclusive)
    if start_column <= index <= end_column:
        df[col] = df[col].astype(float)
df


...by changing only the two column names. It may be easier to work with column names, 'from this one' to 'that one' (inclusive), than with positions.
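The same 'from this name to that name' idea can also be sketched without looking up positions, because label slices with .loc include both endpoints. This is only an illustration on the toy frame from the first snippet, and it assumes the column names are unique:

```python
import numpy as np
import pandas as pd

# Same toy frame as in the first snippet: 18 string columns named 'a'..'r'.
np.random.seed(2020)
df = pd.DataFrame(np.random.randint(10, size=(3, 18)),
                  columns=list('abcdefghijklmnopqr')).astype(str)

# Label slices with .loc include both endpoints, so 'h':'o' covers h through o.
cols = df.loc[:, 'h':'o'].columns

# Cast only the selected columns; the others keep their current dtype.
df = df.astype({col: float for col in cols})
print(df.dtypes)
```

Only the two labels in the slice need to change to target a different range.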

Upvotes: 1

jezrael

Reputation: 863411

The first idea is to convert the selected columns by position. Python counts from 0 and the end of a slice is exclusive, so for the 18th to 35th columns use:

df.iloc[:, 17:35] = df.iloc[:, 17:35].astype(float)

If that does not work (possibly a bug, or because assigning through iloc can keep the existing column dtype in some pandas versions), use another solution:

df = df.astype(dict.fromkeys(df.columns[17:35], float))

Sample: converting the 8th to 15th columns:

import numpy as np
import pandas as pd

np.random.seed(2020)
df = pd.DataFrame(np.random.randint(10, size=(3, 18)),
                  columns=list('abcdefghijklmnopqr')).astype(str)
print (df)
   a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r
0  0  8  3  6  3  3  7  8  0  0  8  9  3  7  2  3  6  5
1  0  4  8  6  4  1  1  5  9  5  6  6  6  5  4  6  4  2
2  3  4  7  1  4  9  3  2  0  9  1  2  7  1  0  2  8  8

df = df.astype(dict.fromkeys(df.columns[7:15], float))
print (df)
   a  b  c  d  e  f  g    h    i    j    k    l    m    n    o  p  q  r
0  0  8  3  6  3  3  7  8.0  0.0  0.0  8.0  9.0  3.0  7.0  2.0  3  6  5
1  0  4  8  6  4  1  1  5.0  9.0  5.0  6.0  6.0  6.0  5.0  4.0  6  4  2
2  3  4  7  1  4  9  3  2.0  0.0  9.0  1.0  2.0  7.0  1.0  0.0  2  8  8
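Since the question is phrased in 1-based column numbers ("from column 18 to column 35"), the off-by-one arithmetic could be wrapped in a small helper. This is a minimal sketch: convert_columns is a hypothetical name, not part of pandas, and it assumes the column names are unique.

```python
import numpy as np
import pandas as pd


def convert_columns(df, first, last, dtype=float):
    """Return a copy of df with columns first..last (1-based, inclusive) cast to dtype.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Column number `first` sits at position first - 1. The slice end is
    # exclusive, so using `last` as the stop includes column number `last`.
    selected = df.columns[first - 1:last]
    return df.astype(dict.fromkeys(selected, dtype))


np.random.seed(2020)
df = pd.DataFrame(np.random.randint(10, size=(3, 18)),
                  columns=list('abcdefghijklmnopqr')).astype(str)

# Convert the 8th to 15th columns, as in the sample above.
df = convert_columns(df, 8, 15)
print(df.dtypes)
```

For the frame in the question, the call would be convert_columns(df, 18, 35).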

Upvotes: 5
