Reputation: 428
So I have two DataFrames: Historic and Applet. Historic contains a list of all courses my school offered in the past and Applet is all courses that my school currently offers
I want to merge the two DataFrames so that any items in my Applet DataFrame that don't exist in Historic are added and any that do exist overwrite their copies in Historic (Some courses may have updated information and should overwrite their historic entries with that information..)
I'm currently using Historic.combine_first(Applet)
to merge the two by on their Indexes. However, I want the duplicate entries to overwrite their Historic entries not just make a duplicate entry.
Code:
def update2(self):
historic = pd.read_csv('course_history.txt', header=None, sep='"', encoding = 'ISO-8859-1',
names=['Course_ID', 'Course_Title', 'Professor_Name','Meeting_Time','Enrollment','Room','Year','Term','Credit'],index_col=[0,6,7])
winnet = pd.DataFrame(self.data, columns =['Course_ID', 'Course_Title', 'Professor_Name','Meeting_Time','Enrollment','Room','Year','Term','Credit'] )
winnet.set_index(['Course_ID','Year','Term'], inplace=True)
historic3 = historic.combine_first(winnet)
Historic DataFrame:
Course_ID Year Term ...
AC 230 01 2020-21 May Accounting Systems Crouse, Justin D. ... ROOM NULL 1.00
AC 429 01 2020-21 May CPA Review Sommermeyer, Eric ... ROOM NULL 1.00
ART 150 01 2020-21 May 20th-Century Art, Media, & Design Fedeler, Barbara J. ... ROOM NULL 1.00
ART 208 01 2020-21 May Photography I Payne, Thomas R. ... ROOM NULL 1.00
PSY 222 01 2018-19 FA Cognitive Psychology Eslick Watkins, A ... ROOM NULL 1.00
Applet DataFrame:
Course_ID Year Term
PSY 101 01 2018-19 FA Introduction to Psychology Bane, C T H 9:35AM-11:15AM 40/44/0 LH 330 1.00
PSY 101 02 2018-19 FA Introduction to Psychology Eslick Watkins, A T H 1:00PM-2:40PM 40/43/0 SC 134 1.00
PSY 210 10 2018-19 FA Child Development Munir, S T H 9:35AM-11:15AM 30/10/0 LH 327 0.50
PSY 211 20 2018-19 FA Adolescent Development Munir, S T H 1:00PM-2:40PM 30/6/0 LH 330 0.50
PSY 222 01 2018-19 FA Cognitive Psychology Eslick Watkins, A T H 9:35AM-11:15AM 30/24/0 LH 324 1.00
Upvotes: 0
Views: 32
Reputation: 3010
You can use concat
then drop_duplicates
cols = [columns_to_judge_duplicates]
combined = pd.concat([Applet, Historic])
combined = combined.drop_duplicates(subset=cols, method='first')
Upvotes: 1