Reputation: 3667
Consider the following:
import pandas as pd
df1 = pd.io.sql.read_frame("select c1,c2,c3 from table1;",mysqldb1)
df2 = pd.io.sql.read_frame("select c1,c4 from table1;",mysqldb2)
df = pd.merge(df1,df2,on='c1',how='inner')
This works most of the time. I believe I am having a problem because in mysqldb1 c1 is a varchar whereas in mysqldb2 c1 is an INTEGER. I can't change this state of affairs. But now, because
df1.c1.dtype != df2.c1.dtype
pd.merge returns an empty dataframe.
I have tried casting, i.e.
df2['c1'] = df2.c1.astype(object)
and then tried the merge, to no avail.
Upvotes: 0
Views: 179
Reputation: 6703
Usually in cases of SQL reads to a dataframe the varchar columns are read as raw strings but pandas lists all string columns as "object."
The quick fix in this situation is to convert the c1
column to string.
df['c1'] = df.c1.astype(string)
This will cast all of your int c1
values to string.
A deeper fix, if you think all c2
values should be cast to integers, would be to write a function that goes row by row casting as int
and explicitly deals with c2
values that have difficulties casting. But it's not entirely needed.
Upvotes: 1