Reputation: 6940
My code:
import pandas as pd

raw_data = pd.read_csv("C:/my.csv")
After I ran it, the file is loaded, but I am getting:
C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\interactiveshell.py:3051: DtypeWarning: Columns (0,79,237,239,241,243,245,247,248,249,250,251,252,253,254,255,256,258,260,262,264) have mixed types. Specify dtype option on import or set low_memory=False. interactivity=interactivity, compiler=compiler, result=result)
Questions:
Sorry, I cannot share the data.
Upvotes: 1
Views: 510
Reputation: 61
Pandas reads the whole file into memory. If your CSV is large, that can be a problem.
import pandas as pd

chunks = []
# Read the CSV in pieces of 1000 rows instead of all at once
for chunk in pd.read_csv('desired_file...', chunksize=1000):
    chunks.append(chunk)
# Stitch the pieces back together into a single DataFrame
df = pd.concat(chunks, ignore_index=True)
This reads the CSV into memory in chunks instead of in one go.
Upvotes: 1
Reputation: 60
Try using the dtype parameter of pandas.read_csv.
You can find it documented here: pandas.read_csv
In my CSV, I simply read all the columns in as strings, and after loading the DataFrame I convert the columns I need to numbers using
DataFrame[Column] = pandas.to_numeric(DataFrame[Column], errors='coerce')
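A minimal sketch of that workflow, assuming the column name "price" as a placeholder (it is not from the question):

import pandas as pd

# Read every column as a string so pandas does no type inference
raw_data = pd.read_csv("C:/my.csv", dtype=str)
# Convert only the columns that should be numeric; unparseable values become NaN
raw_data["price"] = pd.to_numeric(raw_data["price"], errors='coerce')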
Upvotes: 0
Reputation: 1253
pd.read_csv has a number of parameters that give you control over how the different columns are treated.
Without the data it is hard to be specific, so read up on what the dtype or converters options can do.
See the pandas manual for more details.
A first try could be
raw_data = pd.read_csv("C:/my.csv", dtype=str)
This should allow you to read the data and figure out how to set the data type on the columns that really matter.
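If you go the converters route instead, a hedged sketch could look like the following; the column name "price" and the conversion function are placeholders, not taken from the question:

import math
import pandas as pd

# A converter receives the raw string of each cell in that column
def to_float_or_nan(value):
    try:
        return float(value)
    except ValueError:
        return math.nan

raw_data = pd.read_csv("C:/my.csv", converters={"price": to_float_or_nan})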
Upvotes: 1
Reputation: 183
Try this:
raw_data = pd.read_csv("C:/my.csv", low_memory=False)
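With low_memory=False, pandas reads the whole file before inferring column types instead of processing it in internal chunks, so the chunk-by-chunk inference that triggers the mixed-types warning is avoided, at the cost of higher memory use.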
Upvotes: 2