Reputation: 374
I'm having trouble using groupby
and aggregation
when working w/ multiple dataframes. I'm trying to calculate num_maint_over_$90
from two different dataframes.
cars_dict = {"ABC123": ["Ford", "Compact_Car"], "XYZ148": ["Chevy", "Truck"], "ASX133": ["Ford", "Truck"], "ADS111": ["Porsche", "Sports_Car"], "SSC119": ["Toyota", "Compact_Car"]}
cars = pd.DataFrame.from_dict(cars_dict, orient = 'index')
cars.columns = ["Manufacturer", "Type"]
cars.index.rename("License_Plate", inplace = True)
maintenance_dict = {"License_Plate": ["ABC123", "ABC123", "ABC123", "XYZ148", "ASX133", "ASX133", "ADS111", "ADS111", "SSC119"], "Cost": [60, 100, 200, 150, 40, 199, 33, 99, 0]}
maintenance_records = pd.DataFrame.from_dict(maintenance_dict)
maintenance_records.index.rename("order_num", inplace = True)
* cars: *
Manufacturer Type
License_Plate
XYZ148 Chevy Truck
SSC119 Toyota Compact_Car
ASX133 Ford Truck
ABC123 Ford Compact_Car
ADS111 Porsche Sports_Car
* maintenance_records: *
Cost License_Plate
order_num
0 60 ABC123
1 100 ABC123
2 200 ABC123
3 150 XYZ148
4 40 ASX133
5 199 ASX133
6 33 ADS111
7 99 ADS111
8 0 SSC119
*desired df: *
Type num_maint_over_$90
Compact_Car 2
Sports_Car 1
Truck 2
I tried using groupby
, apply()
, and for loop
.
Upvotes: 1
Views: 4138
Reputation: 13913
merged = pd.merge(maintenance_records, cars, how='left',
left_on='License_Plate', right_index=True)
merged.query('Cost > 90')['Type'].value_counts()
Upvotes: 2
Reputation: 374
Here is the naive for-loop solution:
car_types = {}
for index, row in cars.iterrows():
car_type = row["Type"]
if car_type not in car_types:
car_types[car_type] = 0
for index, row in maintenance_records.iterrows():
if row["Cost"] > 90:
car_license = row["License_Plate"]
car_type = cars.loc[car_license,"Type"]
car_types[car_type] += 1
df = pd.DataFrame.from_dict(car_types, orient = "index")
df.index.rename("Type", inplace = True)
df.columns = ["num_maint_over_$90"]
* df: *
num_maint_over_$90
Type
Sports_Car 1
Compact_Car 2
Truck 2
Upvotes: 0