pythonRcpp
pythonRcpp

Reputation: 2146

map according to key column in different file using python

I have a log file like below

Source log file

OrderID,PathCode
1000,0
2000,1
3000,2
4000,1
5000,2
6000,0
7000,2
8000,1
9000,0

But Destination log file doesnt contain the PathCode it came from, it only knows the sequence as below.

OrderID
3000
9999 #Ignore this since not in source OrderID
1000
2000
5000
9991 #Ignore this since not in source OrderID
4000
6000
9000
8000
7000
9998 #Ignore this since not in source OrderID

How to add PathCode to the dest file. OrderID is always unique

Expected Output

OrderID,PathCode
3000,2
1000,0
2000,1
5000,2
4000,1
6000,0
9000,0
8000,1
7000,2

Basically I have to sort source file according to sequence given in destination file

Upvotes: 1

Views: 57

Answers (1)

Vaishali
Vaishali

Reputation: 38415

You can use map

dest['pathcode'] = dest['OrderID'].map(source.set_index('OrderID')['PathCode'])

You get

    OrderID pathcode
0   3000    2
1   1000    0
2   2000    1
3   5000    2
4   4000    1
5   6000    0
6   9000    0
7   8000    1
8   7000    2

To get the group of 3 rows and find the first pathcode try,

dest.iloc[0::3, 1].value_counts() #df.iloc[0::3] returns row no 0,3,6,9 etc

2    2
0    1

so pathcode 2 came first 2 times and 0, 1 time.

EDIT: I tried the code with your test case dest file. The map works as expected and you need to dropna but you also need to reindex for the slice to work.

dest.dropna(inplace=True)
dest.reset_index(drop=True, inplace=True)
dest.iloc[0::3, 1].value_counts()

You get

2    2
0    1

Upvotes: 1

Related Questions