Demi

Reputation: 234

How can I replace the values in one column with matching values from another table in Python (possibly using regex)?

I have two datasets. The first contains raw values that must be replaced with the accepted values given in the second. If no matching accepted value is found in the second dataset, the raw value should be left unchanged.

The first looks like this:

SOURCE_ID TITLE
1 Emaar Beachfront
2 EmaarBeachfront
3 emaar beachfront
4 dubai hills estate
5 Dubai Hills
6 Nad Al Sheba
7 Nadalsheba
8 dubai hills residences
9 The Cove Ru
10 Homes

The second looks like this:

ID TITLE
1 Emaar Beachfront
2 Dubai Hills
3 Nad Al Sheba
4 The Cove

In the end, my dataset should look like this:

SOURCE_ID TITLE
1 Emaar Beachfront
2 Emaar Beachfront
3 Emaar Beachfront
4 Dubai Hills
5 Dubai Hills
6 Nad Al Sheba
7 Nad Al Sheba
8 Dubai Hills
9 The Cove
10 Homes

I thought this might be possible with regex, but I am not sure.

Upvotes: 0

Views: 54

Answers (1)

user11718531

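One solution is to strip the spaces from both lists and lowercase them, then check whether an accepted value occurs as a substring of each raw value: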

first = ["Emaar Beachfront",
"EmaarBeachfront",
"emaar beachfront",
"dubai hills estate",
"Dubai Hills",
"Nad Al Sheba",
"Nadalsheba",
"dubai hills residences",
"The Cove Ru",
"Homes"]

second = [
"Emaar Beachfront",
"Dubai Hills",
"Nad Al Sheba",
"The Cove"
]

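# Comparison keys for the accepted values: spaces removed, lowercased.
second_transformed = [item.replace(" ", "").lower() for item in second]

out = []

for item in first:
    # Normalize the raw value the same way before comparing.
    item_transformed = item.replace(" ", "").lower()
    item_found = False
    for second_item, second_item_transformed in zip(second, second_transformed):
        # Substring test, so "dubaihillsestate" still matches "dubaihills".
        if second_item_transformed in item_transformed:
            out.append(second_item)
            item_found = True
            break
    # No accepted value matched: keep the raw value as it is.
    if not item_found:
        out.append(item)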

print(out)
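
Since the question asks about regex, here is a minimal sketch of the same idea using the `re` module. The function names `build_patterns` and `normalize_title` are made up for illustration. It assumes that spacing differences only occur between the words of an accepted value, so it is slightly stricter than the loop above, which ignores spaces anywhere.

```python
import re


def build_patterns(canonical_titles):
    # One case-insensitive pattern per accepted title. The words are joined
    # with optional whitespace, so "Nad Al Sheba" also matches "Nadalsheba".
    patterns = []
    for title in canonical_titles:
        words = [re.escape(word) for word in title.split()]
        pattern = re.compile(r"\s*".join(words), re.IGNORECASE)
        patterns.append((pattern, title))
    return patterns


def normalize_title(raw_title, patterns):
    # Return the first accepted title whose pattern occurs in the raw value,
    # or the raw value unchanged if nothing matches.
    for pattern, title in patterns:
        if pattern.search(raw_title):
            return title
    return raw_title


second = ["Emaar Beachfront", "Dubai Hills", "Nad Al Sheba", "The Cove"]
patterns = build_patterns(second)

print(normalize_title("Nadalsheba", patterns))
print(normalize_title("Homes", patterns))
```

If the data lives in pandas DataFrames, something like `df["TITLE"].map(lambda t: normalize_title(t, patterns))` should apply it to the whole column; `df` here is a placeholder for your first dataset. As with the loop above, the order of the accepted values matters if one of them is contained in another.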

Upvotes: 1
