Reputation: 366
How can I apply this condition to each cell of my dataframe (check whether the value is a string and, if so, replace it with 0) in an easier way than this?
for col in data.columns:
    for day in data.index:
        if isinstance(data.loc[day, col], str):
            data.loc[day, col] = 0
The data dataframe is:
Log 1233.de ... asdad.w.1 fff.op
Tagname ...
01/06/2008 00:00 ON 343.04 ... 73.75 79.03
02/06/2008 00:00 ON 332.31 ... 73.71 79.21
03/06/2008 00:00 ON 339.25 ... 75.77 80.11
04/06/2008 00:00 ON 353.25 ... 76.47 79.75
05/06/2008 00:00 ON 353.44 ... 77.35 82.48
Which is loaded as:
import pandas as pd

dat = pd.read_csv('data.csv', header=0, index_col="Tagname")
data = pd.DataFrame(dat)
Upvotes: 1
Views: 286
Reputation: 3010
You could coerce the dataframe to a numeric dtype and then fill the resulting NaN values with zeroes. Since pd.to_numeric works on a Series rather than a whole DataFrame, apply it column-wise:
df = df.apply(pd.to_numeric, errors='coerce').fillna(0)
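A minimal sketch of the effect, built from a couple of rows of the sample data in the question (the frame construction here is only for illustration):
import pandas as pd

df = pd.DataFrame({'Log': ['ON', 'ON'], '1233.de': [343.04, 332.31]},
                  index=['01/06/2008 00:00', '02/06/2008 00:00'])

# strings become NaN under errors='coerce', then NaN becomes 0
out = df.apply(pd.to_numeric, errors='coerce').fillna(0)
print(out)
#                   Log  1233.de
# 01/06/2008 00:00  0.0   343.04
# 02/06/2008 00:00  0.0   332.31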
Upvotes: 0
Reputation: 863031
First select all columns with object dtype (here those are the string columns, because non-numeric data read by read_csv ends up as object), create a dictionary mapping them to 0, and pass it to DataFrame.assign:
d = dict.fromkeys(df.select_dtypes(object).columns, 0)
df = df.assign(**d)
print (df)
Log 1233.de asdad.w.1 fff.op
Tagname
01/06/2008 00:00 0 343.04 73.75 79.03
02/06/2008 00:00 0 332.31 73.71 79.21
03/06/2008 00:00 0 339.25 75.77 80.11
04/06/2008 00:00 0 353.25 76.47 79.75
05/06/2008 00:00 0 353.44 77.35 82.48
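For clarity, the intermediate dictionary built by dict.fromkeys has one entry per object-dtype column, each mapped to 0; for the sample data only Log holds strings, so (as a sketch):
print(d)   # {'Log': 0}
# DataFrame.assign(**d) then overwrites each listed column with the scalar 0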
What if all columns end up with object dtype because a string appears somewhere in later rows?
Then, if the original data has no missing values, use to_numeric with errors='coerce' to convert all columns to numeric (non-numeric values become NaN), and finally replace all the resulting missing values with 0:
df = df.apply(pd.to_numeric, errors='coerce').fillna(0)
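A short sketch of that case, assuming a made-up column where one stray string forces the whole column to object dtype:
import pandas as pd

df = pd.DataFrame({'fff.op': [79.03, 'bad', 80.11]})
print(df.dtypes)   # fff.op    object

# the stray string becomes NaN and is then replaced by 0
out = df.apply(pd.to_numeric, errors='coerce').fillna(0)
print(out)
#    fff.op
# 0   79.03
# 1    0.00
# 2   80.11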
If the data may already contain missing values that have to stay missing, fillna cannot be used (it would replace those NaNs as well). Instead use DataFrame.mask with a condition that compares the converted data with the original, so 0 is set only where the conversion turned a non-missing (string) value into NaN:
df = df.apply(pd.to_numeric, errors='coerce').mask(lambda x: x.isna() & df.notna(), 0)
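A minimal sketch of the difference, assuming a made-up column that contains both a stray string and a genuine missing value:
import numpy as np
import pandas as pd

df = pd.DataFrame({'fff.op': [79.03, 'bad', np.nan]})

# 0 only where coercion created the NaN; the original NaN stays NaN
out = df.apply(pd.to_numeric, errors='coerce').mask(lambda x: x.isna() & df.notna(), 0)
print(out)
#    fff.op
# 0   79.03
# 1    0.00   <- string replaced by 0
# 2     NaN   <- original missing value preserved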
Upvotes: 3