Reputation: 366
How can I apply this condition to each cell of my dataframe (check whether the value is a string and, if so, replace it with 0) in an easier way than this?
for col in data.columns:
    for day in data.index:
        if isinstance(data.loc[day, col], str):
            data.loc[day, col] = 0
The data dataframe is:
Log 1233.de ... asdad.w.1 fff.op
Tagname ...
01/06/2008 00:00 ON 343.04 ... 73.75 79.03
02/06/2008 00:00 ON 332.31 ... 73.71 79.21
03/06/2008 00:00 ON 339.25 ... 75.77 80.11
04/06/2008 00:00 ON 353.25 ... 76.47 79.75
05/06/2008 00:00 ON 353.44 ... 77.35 82.48
Which is loaded as:
import pandas as pd

dat = pd.read_csv('data.csv', header=0, index_col="Tagname")
data = pd.DataFrame(dat)
Upvotes: 1
Views: 286
Reputation: 3010
You could coerce the dataframe to a numeric dtype and then fill the resulting NaN values with zeroes. Since pd.to_numeric works on a Series rather than a whole DataFrame, apply it column-wise:
df = df.apply(pd.to_numeric, errors='coerce').fillna(0)
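A minimal sketch of the effect, built from a couple of rows of the sample data in the question (the frame construction here is only for illustration):
import pandas as pd

df = pd.DataFrame({'Log': ['ON', 'ON'], '1233.de': [343.04, 332.31]},
                  index=['01/06/2008 00:00', '02/06/2008 00:00'])

# strings become NaN under errors='coerce', then NaN becomes 0
out = df.apply(pd.to_numeric, errors='coerce').fillna(0)
print(out)
#                   Log  1233.de
# 01/06/2008 00:00  0.0   343.04
# 02/06/2008 00:00  0.0   332.31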
Upvotes: 0
Reputation: 863031
First select all columns with object dtype (here those are the string columns, because non-numeric data read by read_csv ends up as object), create a dictionary mapping them to 0, and pass it to DataFrame.assign:
d = dict.fromkeys(df.select_dtypes(object).columns, 0)
df = df.assign(**d)
print (df)
Log 1233.de asdad.w.1 fff.op
Tagname
01/06/2008 00:00 0 343.04 73.75 79.03
02/06/2008 00:00 0 332.31 73.71 79.21
03/06/2008 00:00 0 339.25 75.77 80.11
04/06/2008 00:00 0 353.25 76.47 79.75
05/06/2008 00:00 0 353.44 77.35 82.48
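For clarity, the intermediate dictionary built by dict.fromkeys has one entry per object-dtype column, each mapped to 0; for the sample data only Log holds strings, so (as a sketch):
print(d)   # {'Log': 0}
# DataFrame.assign(**d) then overwrites each listed column with the scalar 0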
What if all columns end up with object dtype because a string appears somewhere in later rows?
Then, if the original data has no missing values, use to_numeric with errors='coerce' to convert all columns to numeric (non-numeric values become NaN), and finally replace all the resulting missing values with 0:
df = df.apply(pd.to_numeric, errors='coerce').fillna(0)
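A short sketch of that case, assuming a made-up column where one stray string forces the whole column to object dtype:
import pandas as pd

df = pd.DataFrame({'fff.op': [79.03, 'bad', 80.11]})
print(df.dtypes)   # fff.op    object

# the stray string becomes NaN and is then replaced by 0
out = df.apply(pd.to_numeric, errors='coerce').fillna(0)
print(out)
#    fff.op
# 0   79.03
# 1    0.00
# 2   80.11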
If the data may already contain missing values that have to stay missing, fillna cannot be used (it would replace those NaNs as well). Instead use DataFrame.mask with a condition that compares the converted data with the original, so 0 is set only where the conversion turned a non-missing (string) value into NaN:
df = df.apply(pd.to_numeric, errors='coerce').mask(lambda x: x.isna() & df.notna(), 0)
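A minimal sketch of the difference, assuming a made-up column that contains both a stray string and a genuine missing value:
import numpy as np
import pandas as pd

df = pd.DataFrame({'fff.op': [79.03, 'bad', np.nan]})

# 0 only where coercion created the NaN; the original NaN stays NaN
out = df.apply(pd.to_numeric, errors='coerce').mask(lambda x: x.isna() & df.notna(), 0)
print(out)
#    fff.op
# 0   79.03
# 1    0.00   <- string replaced by 0
# 2     NaN   <- original missing value preserved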
Upvotes: 3