Jai

Reputation: 13

Change multiple columns' dtypes to different dtypes using a loop in pandas

I have two lists: one contains the current dtypes of the columns in my DataFrame, and the other contains the dtypes I want to change them to. Which approach should I use to handle this? Suppose the column names are ['NameID','Age','Address','DOB'].

default_dty=['int64','int64','object','object']

Checking the current dtypes gives:

NameId  int64
Age     int64
Address object
DOB     object

required_dty=['object','object','object','date']

The output I need:

NameId  object
Age     object
Address object
DOB     date

I want to make these changes in a loop because my DataFrame has 30 columns, so I don't want to do it manually. My code is:

for col,rdt in zip(cname.columns,req_dty):
    # data[colu]=data[colu].astype(rdt)
    if 'date' in rdt:    
        a=redt.index('date')
        data[c[a]]=pd.to_datetime(data[c[a]],unit='ns')
    else:
        data[colu]=data[colu].astype(rdt)

But it's not working. Please help!

Upvotes: 1

Views: 666

Answers (2)

jezrael

Reputation: 863431

You can loop over the columns and the required dtypes together, converting date columns with pd.to_datetime and falling back to pd.to_numeric when a numeric cast fails:

import pandas as pd

df = pd.DataFrame({'NameId':list('abc'),
                   'Age':[20,'33','ND'],
                   'Address':list('erd'),
                   'DOB':[1349720105] * 3})
print (df.dtypes)
NameId     object
Age        object
Address    object
DOB         int64
dtype: object

required_dty=['object','int','object','date']

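for col,rdt in zip(df.columns,required_dty):
    if 'date' in rdt:
        # 'date' is not a pandas dtype, so parse with to_datetime instead of astype
        df[col]=pd.to_datetime(df[col],unit='ns')
    elif 'int' in rdt:
        try:
            df[col]=df[col].astype(rdt)
        except ValueError:
            # non-numeric values such as 'ND' become NaN, then 0, before the integer cast
            df[col]=pd.to_numeric(df[col], errors='coerce').fillna(0).astype(rdt)
    elif 'float' in rdt:
        try:
            df[col]=df[col].astype(rdt)
        except ValueError:
            # non-numeric values become NaN, which a float column can hold
            df[col]=pd.to_numeric(df[col], errors='coerce')
    else:
        df[col]=df[col].astype(rdt)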

print(df.dtypes)
NameId             object
Age                 int32
Address            object
DOB        datetime64[ns]
dtype: object
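
The same split between date columns and everything else can also be written with a dictionary passed to astype, which avoids branching inside the loop. This is only a minimal sketch: the sample data below is hypothetical, the DOB values are assumed to be date strings, and the names date_cols and other_dtypes are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "NameId": [1, 2, 3],
    "Age": [20, 33, 41],
    "Address": ["e", "r", "d"],
    "DOB": ["1990-01-15", "1985-06-30", "2000-12-01"],  # assumed to be date strings
})
required_dty = ["object", "object", "object", "date"]

# 'date' is not a real pandas dtype, so separate those columns from the rest.
date_cols = [c for c, t in zip(df.columns, required_dty) if t == "date"]
other_dtypes = {c: t for c, t in zip(df.columns, required_dty) if t != "date"}

# astype accepts a {column: dtype} mapping and converts all listed columns at once.
df = df.astype(other_dtypes)

# Date columns are parsed separately with to_datetime.
for c in date_cols:
    df[c] = pd.to_datetime(df[c])

print(df.dtypes)
```

With 30 columns, only required_dty needs to be maintained; both the dictionary and the list of date columns are derived from it.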

Upvotes: 1

YOLO

Reputation: 21749

You can do this without an explicit loop:

# get all column names except the date column (named 'date' here)
no_date_cols = df.columns.difference(['date'])

# set those cols as object type
df[no_date_cols] = df[no_date_cols].astype(object)

df['date'] = pd.to_datetime(df['date'], unit='ns')
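
One thing worth checking before copying unit='ns': the unit argument only applies to numeric input and tells pandas how to interpret the numbers. A small check, using the epoch value 1349720105 from the sample data in the other answer (whether your own column holds seconds or nanoseconds is something you would need to confirm):

```python
import pandas as pd

# Same integer value as the DOB column in the other answer's sample data.
epoch = pd.Series([1349720105] * 3)

as_ns = pd.to_datetime(epoch, unit="ns")  # read as nanoseconds since 1970-01-01
as_s = pd.to_datetime(epoch, unit="s")    # read as seconds since 1970-01-01

print(as_ns.iloc[0])  # 1970-01-01 00:00:01.349720105
print(as_s.iloc[0])   # 2012-10-08 18:15:05
```

If the column already contains date strings rather than numbers, unit can simply be left out.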

Upvotes: 0
