Reputation: 1992
I have to merge two dataframes:
df1
company,standard
tata,A1
cts,A2
dell,A3
df2
company,return
tata,71
dell,78
cts,27
hcl,23
I have to unify both dataframes to one dataframe. I need output like:
company,standard,return
tata,A1,71
cts,A2,27
dell,A3,78
Upvotes: 109
Views: 255297
Reputation: 862
The following solution worked perfect for me on a MSA(microservice) architecture.
pip install mysql-connector-python SQLAlchemy pandas
import pandas as pd
from main.db.sql_alchemy import (
engine_siaw,
engine_sip,
)
altas_periodo_id = 10
df_altas = pd.read_sql(
SQL_ALTAS_EST.format(altas_periodo_id=altas_periodo_id),
con=engine_sip,
)
df_tiendas = pd.read_sql(SQL_ALTAS_TIENDAS, con=engine_siaw)
df_empresas = pd.read_sql(SQL_ALTAS_EMPRESAS, con=engine_siaw)
df_altas_tiendas = df_altas.merge(
df_tiendas, left_on="tienda", right_on="nombre"
)
df_file = df_altas_tiendas.merge(
df_empresas, left_on="empresa", right_on="razon_social"
)
The SQL variables containing the raw SQL:
SQL_ALTAS_EST = """
select
td.codigo as tipo_doc,
ad.numero_documento as numero_doc,
cna.codigo as nacionalidad,
ad.tienda,
ad.empresa
from sip_registroaltadetalle as ad
inner join sip_personal sp on sp.codigo = ad.codigo
left join sip_tipodocumentoidentidad as td on td.pkid = ad.tipo_documento
left join sip_multitabladetalle as cna on cna.pkid = sp.nacionalidad_id
where ad.codigo is not null and ad.parent_id = '{altas_periodo_id}'
""".strip()
SQL_ALTAS_TIENDAS = """
select codigo as codigo_establecimiento, nombre from catalogo_tienda ct
""".strip()
SQL_ALTAS_EMPRESAS = """
select razon_social, ruc from catalogo_empresa ce
""".strip()
the database connections:
from django.conf import settings
from sqlalchemy import create_engine
cn_sip = f"mysql+mysqlconnector://{settings.DB_USER}:{settings.DB_PASSWORD}@{settings.DB_HOST}:{settings.DB_PORT}/{settings.DB_NAME}" # noqa
cn_siaw = f"mysql+mysqlconnector://{settings.SIAW_DB_USER}:{settings.SIAW_DB_PASSWORD}@{settings.SIAW_DB_HOST}:{settings.SIAW_DB_PORT}/{settings.SIAW_DB_NAME}" # noqa
engine_siaw = create_engine(cn_siaw)
engine_sip = create_engine(cn_sip)
I really hope u find it helpful.
Upvotes: -1
Reputation: 1316
In order to successfully merge two data frames based on common column(s), the dtype for common column(s) in both data frames must be the same! dtype for a column can be changed by:
df['commonCol'] = df['commonCol'].astype(int)
Upvotes: -5
Reputation: 863701
Use merge
:
print (pd.merge(df1, df2, on='company'))
Sample:
print (df1)
company standard
0 tata A1
1 cts A2
2 dell A3
print (df2)
company return
0 tata 71
1 dell 78
2 cts 27
3 hcl 23
print (pd.merge(df1, df2, on='company'))
company standard return
0 tata A1 71
1 cts A2 27
2 dell A3 78
Upvotes: 184