Sai Rajesh
Sai Rajesh

Reputation: 1992

how to merge two data frames based on particular column in pandas python?

I have to merge two dataframes:

df1

company,standard
tata,A1
cts,A2
dell,A3

df2

company,return
tata,71
dell,78
cts,27
hcl,23

I have to unify both dataframes to one dataframe. I need output like:

company,standard,return
tata,A1,71
cts,A2,27
dell,A3,78

Upvotes: 109

Views: 255297

Answers (4)

Manuel Lazo
Manuel Lazo

Reputation: 862

The following solution worked perfect for me on a MSA(microservice) architecture.

Required Packages

pip install mysql-connector-python SQLAlchemy pandas

Solution

import pandas as pd
from main.db.sql_alchemy import (
    engine_siaw,
    engine_sip,
)

altas_periodo_id = 10
df_altas = pd.read_sql(
    SQL_ALTAS_EST.format(altas_periodo_id=altas_periodo_id),
    con=engine_sip,
)
df_tiendas = pd.read_sql(SQL_ALTAS_TIENDAS, con=engine_siaw)
df_empresas = pd.read_sql(SQL_ALTAS_EMPRESAS, con=engine_siaw)

df_altas_tiendas = df_altas.merge(
    df_tiendas, left_on="tienda", right_on="nombre"
)
df_file = df_altas_tiendas.merge(
    df_empresas, left_on="empresa", right_on="razon_social"
)

The SQL variables containing the raw SQL:

SQL_ALTAS_EST = """
select
    td.codigo as tipo_doc,
    ad.numero_documento as numero_doc,
    cna.codigo as nacionalidad,
    ad.tienda,
    ad.empresa
from sip_registroaltadetalle as ad
    inner join sip_personal sp on sp.codigo = ad.codigo
    left join sip_tipodocumentoidentidad as td on td.pkid = ad.tipo_documento
    left join sip_multitabladetalle as cna on cna.pkid = sp.nacionalidad_id
where ad.codigo is not null and ad.parent_id = '{altas_periodo_id}'
""".strip()

SQL_ALTAS_TIENDAS = """
select codigo as codigo_establecimiento, nombre from catalogo_tienda ct
""".strip()

SQL_ALTAS_EMPRESAS = """
select razon_social, ruc from catalogo_empresa ce
""".strip()

the database connections:

from django.conf import settings
from sqlalchemy import create_engine

cn_sip = f"mysql+mysqlconnector://{settings.DB_USER}:{settings.DB_PASSWORD}@{settings.DB_HOST}:{settings.DB_PORT}/{settings.DB_NAME}"  # noqa
cn_siaw = f"mysql+mysqlconnector://{settings.SIAW_DB_USER}:{settings.SIAW_DB_PASSWORD}@{settings.SIAW_DB_HOST}:{settings.SIAW_DB_PORT}/{settings.SIAW_DB_NAME}"  # noqa

engine_siaw = create_engine(cn_siaw)
engine_sip = create_engine(cn_sip)

I really hope u find it helpful.

Upvotes: -1

Swetank Nandi
Swetank Nandi

Reputation: 61

I think we can also use

df1.merge(df2,on='Company')

Upvotes: 5

Good Fit
Good Fit

Reputation: 1316

In order to successfully merge two data frames based on common column(s), the dtype for common column(s) in both data frames must be the same! dtype for a column can be changed by:

df['commonCol'] = df['commonCol'].astype(int)

Upvotes: -5

jezrael
jezrael

Reputation: 863701

Use merge:

print (pd.merge(df1, df2, on='company'))

Sample:

print (df1)
  company standard
0    tata       A1
1     cts       A2
2    dell       A3

print (df2)
  company  return
0    tata      71
1    dell      78
2     cts      27
3     hcl      23

print (pd.merge(df1, df2, on='company'))
  company standard  return
0    tata       A1      71
1     cts       A2      27
2    dell       A3      78

Upvotes: 184

Related Questions