Reputation: 723
I have a strange issue on pandas.read_csv function. I exported my dataframe into a csv, but when I re-imported the same csv, the data that has been imported back does not work when I try to merge(The merge shows all the data on the left and none that I have tried to merge it with). If I use the original data before it was exported to the csv, it works completely fine.(The merge was perfect).
df = df.values_list('id','teacher_id','uniquecount','nonuniquecount','msgcount','ordercount','date','updated','timestamp', flat=False)
#inserting the collected data into a dateframe for manipulation
df = pd.DataFrame(list(df))
#giving the dataframe column names
df.columns = ['id','teacher_id','uniquecount','nonuniquecount','msgcount','ordercount','date','updated','timestamp']
df = df[['id','teacher_id','uniquecount','nonuniquecount','msgcount','ordercount','date']]
#rename required columns
df.rename(columns={'uniquecount':'Unique Views','nonuniquecount':'Views','msgcount':'Messages','ordercount':'Orders'}, inplace=True)
print df
print df.dtypes
# exporting df out to a csv
# df.to_csv('test.csv', header=True)
# importing the df back from a csv
df = pd.read_csv('test.csv', index_col=0)
print df
print df.dtypes
#insert dates
numdays = 14
base = datetime.datetime.today().date()
date_list = [base - datetime.timedelta(days=x) for x in range(0, numdays)]
dates = pd.DataFrame(date_list)
dates.columns = ['date']
#merge the complete dates with the dateframe
df = pd.merge(dates ,df , on=['date'] , how='left')
# print df
I have checked and compared that the dataframes look exactly the same before export and after importing from the csv.(I printed the output twice, once before export and one after) I have also checked and the datetypes are all the same.
I need to export the csv to work with an external environment because I cant attach my local database.
attached a copy of the cmdline print which shows that both dataframes are exactly similar
attached below is a sample of my exported csv
,id,teacher_id,Unique Views,Views,Messages,Orders,date
0,47,31,1,6,0,0,2017-05-09
1,56,31,1,9,0,0,2017-05-10
2,67,31,2,11,0,0,2017-05-14
3,71,31,3,15,0,0,2017-05-15
4,79,31,3,17,0,0,2017-06-12
5,83,31,3,18,0,1,2017-06-18
Does anyone have any idea on this strange issue?
Upvotes: 1
Views: 203
Reputation: 18208
Before calling merge, try converting both dates using to_datetime
first as referred in answer here
df.date = pd.to_datetime(df.date)
dates.date = pd.to_datetime(dates.date)
#merge the complete dates with the dateframe
df = pd.merge(dates ,df , on=['date'] , how='left')
Upvotes: 1