How to replace a specific value in an entire CSV file with pandas?

Question

I have a large csv file that I import to pandas. The first row is the header of columns (even if it is values). My goal is to replace all values of -999.999 to 0.

I have already tried to make my first row as header with df.colums = df.iloc[0] but without success. To replace the values I tried to use df.replace but it does not seem to find the -999.99 values.

import pandas as pd

df = pd.read_csv('Test.csv') # importer le dossier CSV 

df.columns = df.iloc[0]
df.replace(-999.999, 0, inplace=True)

df.to_csv('test2.csv')

The CSV file looks like this where the first column (0-1-2...) was automatically created and not part of the original data.

  -120;-119.7;-119.4;-119.1;-118.8;-118.5;-118.2;-117.9
0  -999.999;-999.999;-999.999;20.1;1.2;89.5;-999....   
1  -999.999;-999.999;4.6;25.6;8.9;-999.999;-999.9...   
2  -999.999;-999.999;8.66;12.6;7.6;-999.999;13.98...   
3  -999.999;-999.999;9.12;6.98;0.6;45.3;12.4;-999...   
4  -999.999;-999.999;13.36;4.56;3.9;2.36;-999.999...   
5  -999.999;-999.999;-999.999;21;45.2;-999.999;-9...

It_is_Chris · Accepted Answer

First problem is that you need to specify the separator the second problem with replace is that it will be converted to float64 so using replace(-999.999, 0) will not work.

from io import StringIO

s = '''-120;-119.7;-119.4;-119.1;-118.8;-118.5;-118.2;-117.9
-999.999;-999.999;-999.999;20.1;1.2;89.5;-999
-999.999;-999.999;4.6;25.6;8.9;-999.999;-999.9
-999.999;-999.999;8.66;12.6;7.6;-999.999;13.98
-999.999;-999.999;9.12;6.98;0.6;45.3;12.4;-999
-999.999;-999.999;13.36;4.56;3.9;2.36;-999.999
-999.999;-999.999;-999.999;21;45.2;-999.999;-9'''

df = pd.read_csv(StringIO(s), sep=';')

df = df.replace(df.iloc[0,0], 0) # specify a cell with the value you want to replace

or you can use a float64 number:

df.replace(-999.9989999999999, 0)

   -120  -119.7  -119.4  -119.1  -118.8  -118.5  -118.2  -117.9
0   0.0     0.0    0.00   20.10     1.2   89.50 -999.00     NaN
1   0.0     0.0    4.60   25.60     8.9    0.00 -999.90     NaN
2   0.0     0.0    8.66   12.60     7.6    0.00   13.98     NaN
3   0.0     0.0    9.12    6.98     0.6   45.30   12.40  -999.0
4   0.0     0.0   13.36    4.56     3.9    2.36    0.00     NaN
5   0.0     0.0    0.00   21.00    45.2    0.00   -9.00     NaN

For an actual file StringIO will be replaced with the file path.

The first column, as you call it, is the index if you do not want it in the output just df.to_csv('file.csv', index=False)

How to replace a specific value in an entire CSV file with pandas?

Answers (2)

Related Questions