AlexW
AlexW

Reputation: 2587

python pandas - Merge and override with logic?

I have two data frames with similar data in as per the below:

sample table 1:

device_id device_type     ip_address     mac_address     port  vlan
       89      Router   10.10.10.13  ad3d.bb39.484a     None   5.0
       89      Router   10.10.10.21  0010.1d1a.7b67     None   5.0
       89      Router   10.10.10.22  0010.4d5a.768d     None   5.0
       89      Router   10.10.10.23  0010.096a.7a8c     None   5.0

sample table 2:

device_id device_type     mac_address      port
      305      Switch  ad3d.bb39.484a   Gi1/0/3
      305      Switch  0010.1d1a.7b67   Gi1/0/4
      305      Switch  0010.4d5a.768d   Gi1/0/2
      305      Switch  9a72.2dad.21f0  Gi1/0/22

I want to merge these two on Mac address, which is currently done with

temp_merged_data = arp_data.merge(mac_data, on='mac_address', how='left')

this gives data

device_id_x device_type_x   ip_address     mac_address     port_x  vlan  device_id_y  device_type_y      port_y 
         89      Router     10.10.10.13  ad3d.bb39.484a      None   5.0          305         Router     Gi1/0/3
         89      Router     10.10.10.21  0010.1d1a.7b67      None   5.0          305         Router     Gi1/0/4
         89      Router     10.10.10.22  0010.4d5a.768d      None   5.0          305         Router     Gi1/0/2
         89      Router     10.10.10.23  0010.096a.7a8c      None   5.0          305         Router    Gi1/0/22

What I would like to do is merge the columns with an overwrite, in preference of switch... i.e. if a Mac duplicate Mac address exists and It has a router and a switch as device_type I would like to merge device_id, port, and device_type from the switch and drop the data from the router. if a duplicate does not exist use the data from the router.

is this possible with simple logic or are complex functions needed?

Upvotes: 0

Views: 53

Answers (1)

w-m
w-m

Reputation: 11232

You could concatenate the two data frames, sort them by device type and then pick the first entry for each mac address:

>>> pd.concat([arp_data, mac_data]) \
      .sort_values("device_type", ascending=False) \
      .groupby("mac_address").first() \
      .reset_index()

      mac_address  device_id device_type   ip_address      port  vlan
0  0010.096a.7a8c         89      Router  10.10.10.23      None   5.0
1  0010.1d1a.7b67        305      Switch  10.10.10.21   Gi1/0/4   5.0
2  0010.4d5a.768d        305      Switch  10.10.10.22   Gi1/0/2   5.0
3  9a72.2dad.21f0        305      Switch          NaN  Gi1/0/22   NaN
4  ad3d.bb39.484a        305      Switch  10.10.10.13   Gi1/0/3   5.0

Upvotes: 1

Related Questions