Reputation: 7894
I have a CSV file that looks like this:
Date,Time,Mood,Tags,Medications,Notes
"Jul 25, 2018",9:41 PM,8,,,"",
"Jul 26, 2018",10:05 AM,4,,,"",
"Jul 26, 2018",12:00 PM,3,,,"",
"Jul 26, 2018",7:00 PM,8,,,"",
"Jul 27, 2018",12:01 PM,8,,,"",
I run the following code:
import pandas as pd

df = pd.read_csv("./data/MoodLog_2018_09_14.csv",
                 dtype={'Date': str, 'Time': str, 'Mood': str, 'Tags': str,
                        'Medications': str, 'Notes': str})
print(df['Time'].head(5))
and it prints the following:
Jul 25, 2018 8
Jul 26, 2018 4
Jul 26, 2018 3
Jul 26, 2018 8
Jul 27, 2018 8
Name: Time, dtype: object
It's including the Mood column in the Time column. Why is that?
Upvotes: 1
Views: 138
Reputation: 59579
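The problem is that every data row ends with a trailing comma while the header does not, so each row has seven fields against six column names. When the rows have exactly one more field than the header, pandas uses the first column as the index, which shifts every remaining value one column to the left: the dates become the index and the Mood values land under Time. Change the header to:
Date,Time,Mood,Tags,Medications,Notes,
and you will get an extra, unnamed column which you can then drop.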
test.csv
Date,Time,Mood,Tags,Medications,Notes,
"Jul 25, 2018",9:41 PM,8,,,"",
"Jul 26, 2018",10:05 AM,4,,,"",
"Jul 26, 2018",12:00 PM,3,,,"",
"Jul 26, 2018",7:00 PM,8,,,"",
"Jul 27, 2018",12:01 PM,8,,,"",
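import pandas as pd

# The header's trailing comma yields an extra unnamed last column; iloc[:, :-1] drops it.
df = pd.read_csv("test.csv",
                 dtype={'Date': str, 'Time': str, 'Mood': str, 'Tags': str,
                        'Medications': str, 'Notes': str}).iloc[:, :-1]
df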
           Date      Time  Mood  Tags  Medications  Notes
0  Jul 25, 2018   9:41 PM     8   NaN          NaN    NaN
1  Jul 26, 2018  10:05 AM     4   NaN          NaN    NaN
2  Jul 26, 2018  12:00 PM     3   NaN          NaN    NaN
3  Jul 26, 2018   7:00 PM     8   NaN          NaN    NaN
4  Jul 27, 2018  12:01 PM     8   NaN          NaN    NaN
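If you would rather not edit the file, a possible alternative is to pass index_col=False, which tells pandas not to use the first column as the index, so the extra trailing field is discarded instead. The sketch below is a minimal illustration under some assumptions: it reads an inline two-row sample via io.StringIO rather than your real file, and uses dtype=str as shorthand for the per-column dict. Recent pandas versions may emit a ParserWarning about the header and data lengths not matching; here the only thing lost is the empty trailing field.

```python
import io
import pandas as pd

# Inline sample shaped like the file in the question:
# six header names, but seven fields per row because of the trailing comma.
csv_text = '''Date,Time,Mood,Tags,Medications,Notes
"Jul 25, 2018",9:41 PM,8,,,"",
"Jul 26, 2018",10:05 AM,4,,,"",
'''

# index_col=False prevents pandas from treating the first column as the index,
# so the six header names line up with the first six fields of each row.
df = pd.read_csv(io.StringIO(csv_text), index_col=False, dtype=str)

print(df['Time'].head())
```

With the real file you would pass its path in place of io.StringIO(csv_text).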
Upvotes: 1