Joey

Reputation: 934

Editing data in CSV files using Pandas

I have a CSV file with the following data:

Time    Pressure
 0  2.9852.988
 10 2.9882.988
 20 2.9902.990
 30 2.9882.988
 40 2.9852.985
 50 2.9842.984
 60 2.9852.985.....

For some reason, each value in the second column contains two decimal points. I'm trying to create a DataFrame with pandas but cannot proceed without removing the second decimal point. I cannot do this manually, as there are thousands of data points in my file. Any ideas?

Upvotes: 1

Views: 1239

Answers (1)

EdChum

Reputation: 394051

You can use the vectorised str methods: split each string on the decimal point, discard the last element of the resulting list, and join what remains with a decimal point. For the first row, the split gives ['2', '9852', '988'], dropping the last element leaves ['2', '9852'], and the join produces '2.9852':

In [28]:

df['Pressure'].str.split('.').str[:-1].str.join('.')
Out[28]:
0    2.9852
1    2.9882
2    2.9902
3    2.9882
4    2.9852
5    2.9842
6    2.9852
Name: Pressure, dtype: object
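
To make the individual steps of that chain easier to follow, here is a minimal sketch that breaks it into separate assignments. The sample values are copied from the question; the variable names (pressure, parts, kept, cleaned) are only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample values copied from the question; the variable names are illustrative.
pressure = pd.Series(["2.9852.988", "2.9882.988", "2.9902.990"], name="Pressure")

# Step 1: split every string on '.', giving a list per row, e.g. ['2', '9852', '988'].
parts = pressure.str.split(".")

# Step 2: slice each list to drop its last element, e.g. ['2', '9852'].
kept = parts.str[:-1]

# Step 3: join the remaining pieces with a single '.', e.g. '2.9852'.
cleaned = kept.str.join(".")

print(cleaned.tolist())
```

Note that this keeps everything before the second decimal point, so any digits after it are discarded.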

If you want to convert the strings to floats, call astype (np here is NumPy, i.e. import numpy as np):

In [29]:

df['Pressure'].str.split('.').str[:-1].str.join('.').astype(np.float64)
Out[29]:
0    2.9852
1    2.9882
2    2.9902
3    2.9882
4    2.9852
5    2.9842
6    2.9852
Name: Pressure, dtype: float64

Just remember to assign the conversion back to the original df:

df['Pressure'] = df['Pressure'].str.split('.').str[:-1].str.join('.').astype(np.float64)
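
For completeness, a rough end-to-end sketch of how this could look when loading the file. It assumes the file is whitespace-separated, as the sample in the question suggests, and uses an in-memory string as a stand-in for the real file. Reading Pressure with dtype str is a precaution so that the .str accessor is available; the built-in float is used in astype here, which yields the same float64 result without needing a NumPy import.

```python
import io

import pandas as pd

# Stand-in for the real file; the whitespace-separated layout is an assumption.
raw = """Time Pressure
0 2.9852.988
10 2.9882.988
20 2.9902.990
"""

# Read Pressure as text so the .str accessor can be used on it.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", dtype={"Pressure": str})

# Drop everything from the second decimal point onward, then convert to float,
# assigning the result back to the column.
df["Pressure"] = (
    df["Pressure"].str.split(".").str[:-1].str.join(".").astype(float)
)

print(df)
```

With a real file, the io.StringIO(raw) argument would be replaced by the file path.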

Upvotes: 2
