Reputation: 1
I have two xlsx files as follows:
value1 value2
3900162750 10
3900163003 19
2311009200 22
value1 value2
3900163003 5
3900162750 9
2311009200 88
How do I match value1
from xlsx1
with xlsx2
and compare related value2
?
for instance, match 3900163003
from xlsx1
with xlsx2
and find out value2
is decreasing.
Upvotes: 0
Views: 543
Reputation: 1388
Use pandas
and pandas merge
:
import pandas as pd
df = df1.merge(df2, on='value1', how= 'outer', suffixes=('_df1', '_df2'))
df['decrease'] = df['value2_df2'] < df['value2_df1']
giving:
value1 value2_df1 value2_df2 decrease
0 3900162750 10 9 True
1 3900163003 19 5 True
2 2311009200 22 88 False
Notes:
how
parameter used to provide some output if df1
and df2
are not perfectly matchingsuffixes
parameter used to provide the origin of the data'decrease'
column to remember the way it is computedData from :
data1 = """
value1 value2
3900162750 10
3900163003 19
2311009200 22
"""
data2 = """
value1 value2
3900163003 5
3900162750 9
2311009200 88
"""
df1 = pd.read_csv(io.StringIO(data1), sep=r' +')
df2 = pd.read_csv(io.StringIO(data2), sep=r' +')
In real live read_csv
should be replaced by read_excel
Upvotes: 0
Reputation:
You can use pandas
. First read the excel files as pandas.DataFrame
objects. Then using pd.DataFrame.merge
method, merge the two dataframes on 'value1'
column. Then create another column 'Decreasing'
by comparing the two 'value2'
columns. By default, the column from the left (in this case, df1
) gets suffix _x
and the one from the right (in this case, df2
) gets suffix _y
:
import pandas as pd
df1 = pd.read_excel('first_excel_file.xlsx')
df2 = pd.read_excel('second_excel_file.xlsx')
df = df1.merge(df2, on='value1')
df['Decreasing'] = df['value2_x'] > df['value2_y']
Output:
value1 value2_x value2_y Decreasing
0 3900162750 10 9 True
1 3900163003 19 5 True
2 2311009200 22 88 False
Upvotes: 1