Pandas - Importing CSV issue - adding and deleting 0 at the end

Question

I want to import a csv dataset. My problem is when I import the base, pandas kinds of try to convert it into something.

Let me explain with numbers. This is more or less how my csv file is.

> Data, Id, Text

>2018-06-11, 17980873.3391, bla bla bla 

>2018-06-11, 17980874.4560, bla bla bla

>2018-06-11, 17980876.8560, bla bla bla

The trouble is when I import it with pd.read_csv. The Id column should be exactly the way it is in csv file (I want to use it as a filter to do searches) . But pandas is returning something like:

When I import with no changes in the structure (pandas transform the column into float automatically)

> Data, Id, Text

>2018-06-11, 17980873.33910, bla bla bla 

>2018-06-11, 17980874.45600, bla bla bla

>2018-06-11, 17980876.85600, bla bla bla

when I import the dataset and transform the id column as type(str):

> Data, Id, Text

>2018-06-11, 17980873.3391, bla bla bla 

>2018-06-11, 17980874.456, bla bla bla

>2018-06-11, 17980876.856, bla bla bla

It is deleting and adding 0. I really don't know how to make pandas import the real number

>17980876.8560

Hope I've made myself understood. I'm still learning how to ask questions here.

Thanks!

Mark Tolonen · Accepted Answer

Set the dtype for the Id to str for no translation.

Given:

Data,Id,Text
2018-06-11,17980873.3391,bla bla bla
2018-06-11,17980874.4560,bla bla bla
2018-06-11,17980876.8560,bla bla bla

Use:

import pandas as pd
data = pd.read_csv('data.csv',dtype={'Id':str})
print(data)

To get:

         Data             Id         Text
0  2018-06-11  17980873.3391  bla bla bla
1  2018-06-11  17980874.4560  bla bla bla
2  2018-06-11  17980876.8560  bla bla bla

This does assume your ID field is intended to be an 8-digit dot 4-digit string and not a floating point value.

Pandas - Importing CSV issue - adding and deleting 0 at the end

Answers (2)

Related Questions