Reputation: 1238
I have two csv file like below.
In 'order.csv'
date orderName orderNumber
1/7 Tom 1
1/7 Jeny 4
1/7 Brown 2
1/7 Tom 3
1/8 Sky 5
1/8 Blue 7
1/8 Red 6
1/8 Wine 8
1/9 Tom 9
1/9 Earth 11
1/9 Earth 10
In 'run.csv'
date runName runNumber
1/7 Tom
1/7 Jeny
1/7 Tom
1/7 Brown
1/8 Sky
1/8 Blue
1/8 Red
1/8 Wine
1/9 Tom
1/9 Earth
1/9 Earth
My goal is that the orderNumber has to go inside of runNumber matching the date and orderName. But the thing is that the order is different between order and run csv. And if the orderName is duplicated on the same date. It doesn't have to considerate.
In 'result.csv'
date runName runNumber
1/7 Tom
1/7 Jeny 4
1/7 Tom
1/7 Brown 2
1/8 Sky 5
1/8 Blue 7
1/8 Red 6
1/8 Wine 8
1/9 Tom 9
1/9 Earth
1/9 Earth
How can I make using methods of python for this case?
I tried to do something but for me it's too hard problem.
So I asked here entirely. I am sorry.
Upvotes: 0
Views: 63
Reputation: 26
From what I have gone through, it seems like you don't have to have the two files since those exactly have the same structure. The first thing you can do is loading the csv file with Pandas like this below.
import pandas as pd
df = pd.read_csv('order.csv')
Then, you can add a new column based on the other columns. And to get rid of duplicates, using Counter which is default Python library might be a good option.
from collections import Counter
name_cnt = Counter(df['orderName'])
def set_run_number(row):
if name_cnt[row['orderName']] > 1:
return
else:
return int(row['orderNumber'])
df['runNumber'] = df.apply(lambda row: set_run_number(row), axis=1)
And it goes like this.
date orderName orderNumber runNumber
0 1/7 Tom 1 NaN
1 1/7 Jeny 4 4.0
2 1/7 Brown 2 2.0
3 1/7 Tom 3 NaN
4 1/8 Sky 5 5.0
5 1/8 Blue 7 7.0
6 1/8 Red 6 6.0
7 1/8 Wine 8 8.0
8 1/9 Tom 9 NaN
9 1/9 Earth 11 NaN
10 1/9 Earth 10 NaN
Upvotes: 1