Sayantan Ghosh

Reputation: 338

How to replace particular values in dataframe column from a dictionary?

I have a table like this:

Col1     Col2
ABS      45
CDC      23
POP      15

I also have a dictionary aa = {'A':'AD','P':'PL','C':'LC'}. Every character in Col1 that matches a dictionary key should be replaced by the corresponding value; characters that do not match any key should stay unchanged.

The final table should look like:

Col1     Col2
ADBS     45
LCDLC    23
PLOPL    15

I am trying to use the following code but it is not working.

df['Col1'].str.extract[r'([A-Z]+)'].map(aa)

Upvotes: 1

Views: 791

Answers (2)

CypherX

Reputation: 7353

Solution

import pandas as pd

aa = {'A': 'AD', 'P': 'PL', 'C': 'LC'}  # dictionary from the question
df = pd.DataFrame({'Col1': ['ABS', 'CDC', 'POP'],
                   'Col2': [45, 23, 15],
                  })

keys = aa.keys()
df.Col1 = [''.join([aa.get(e) if (e in keys) else e for e in list(ee)]) for ee in df.Col1.tolist()]
df

Output:

    Col1  Col2
0   ADBS    45
1  LCDLC    23
2  PLOPL    15

Unpacking the Condensed List Comprehension

Let us write the list comprehension in a more readable form. The function do_something captures what happens in the first part of the list comprehension. The second part (for ee in df.Col1.tolist()) iterates over each row of the column 'Col1' of the dataframe df.

def do_something(x):
    # here x is like 'ABS'
    xx = ''.join([aa.get(e) if (e in keys) else e for e in list(x)])
    return xx
df.Col1 = [do_something(ee) for ee in df.Col1.tolist()]

Unpacking do_something(x)

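The function do_something(x) replaces each character of x that is a key in aa with its value and leaves every other character as it is. It is easiest to follow with x = 'ABS': the list comprehension produces a list of pieces, and ''.join(some_list) concatenates them into a single string. The following code block illustrates that.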

x = 'ABS'
print(do_something(x))
print([aa.get(e) if (e in keys) else e for e in list(x)])

Output:

ADBS
['AD', 'B', 'S']
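
The membership test followed by aa.get(e) can be collapsed into a single call, because dict.get accepts a default that is returned when the key is missing. A minimal sketch, assuming the same aa as in the question; the helper name replace_chars is hypothetical and not part of the original answer:

```python
# Mapping from the question: single-character keys, string values.
aa = {'A': 'AD', 'P': 'PL', 'C': 'LC'}

def replace_chars(text, mapping):
    # Hypothetical helper: look up each character and fall back to
    # the character itself when it is not a key in the mapping.
    return ''.join(mapping.get(ch, ch) for ch in text)

# The intermediate list of pieces for 'ABS'.
pieces = [aa.get(ch, ch) for ch in 'ABS']
print(pieces)                    # ['AD', 'B', 'S']
print(replace_chars('ABS', aa))  # ADBS
```

The behaviour is the same as do_something; only the lookup is written more compactly.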

So what is the core logic?

The following code block shows step by step how the logic works. The list comprehension at the beginning of the solution compresses these nested for loops into a single line, so it is the form to prefer in practice.

keys = aa.keys()
packlist = list()
for ee in df.Col1.tolist():
    # Here we iterate over each element of 
    # the dataframe's column (df.Col1)

    # make a temporary list
    templist = list()
    for e in list(ee):
        # here e is a single character of the string ee
        # example: list('ABS') = ['A', 'B', 'S']
        if e in keys:
            # if e is one of the keys in the dict aa
            # append the corresponding value to templist
            templist.append(aa.get(e))
        else:
            # if e is not a key in the dict aa
            # append e itself to templist
            templist.append(e)
    # join the pieces into a single string and append it to packlist
    packlist.append(''.join(templist))

# Finally assign the list: packlist to df.Col1 
# to update the column values
df.Col1 = packlist
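
Because every key in aa is a single character, the same per-character substitution can also be written with a translation table. This is a different technique from the loop above, shown only as a sketch: str.maketrans builds the table from the dictionary and Series.str.translate applies it to each value. It assumes all keys are exactly one character long; str.maketrans raises an error otherwise.

```python
import pandas as pd

# Data and mapping as given in the question.
aa = {'A': 'AD', 'P': 'PL', 'C': 'LC'}
df = pd.DataFrame({'Col1': ['ABS', 'CDC', 'POP'], 'Col2': [45, 23, 15]})

# str.maketrans accepts a dict whose keys are single characters;
# the values may be strings of any length.
table = str.maketrans(aa)

# Assumption: every key in aa is exactly one character long.
df['Col1'] = df['Col1'].str.translate(table)
print(df['Col1'].tolist())  # ['ADBS', 'LCDLC', 'PLOPL']
```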

References

List and dict comprehensions are powerful tools that any Python programmer will find handy. They can compress an otherwise elaborate code block into a line or two. I would suggest that you take a look at the following.

  1. List Comprehensions: python.org
  2. Dict Comprehensions: python.org
  3. List Comprehension in Python: medium.com

Upvotes: 1

Dev Khadka

Reputation: 5451

You can do this using str.replace with a pattern built from the dictionary keys, as below:

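import pandas as pd

df = pd.DataFrame([['ABS', '45'], ['CDC', '23'], ['POP', '15']], columns=('Col1', 'Col2'))
aa = {'A':'AD','P':'PL','C':'LC'}
pat = "|".join(aa.keys())  # alternation of all keys: "A|P|C"
# x is the match object and x[0] the matched key; regex=True is needed because
# pandas >= 2.0 defaults to regex=False, which rejects a callable replacement
df["Col1"] = df["Col1"].str.replace(pat, lambda x: aa.get(x[0], x[0]), regex=True)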
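
Joining the keys with "|" works for the dictionary in the question, but it may misbehave if a key contains a regex metacharacter or if one key is a prefix of another. A hedged sketch of a more defensive pattern, using only the standard re module; the mapping below is hypothetical and chosen to show both cases, and substitute is an illustrative name:

```python
import re

# Hypothetical mapping (not from the question) with a regex
# metacharacter ('.') and overlapping keys ('A' and 'AB').
mapping = {'A': 'AD', 'AB': 'X', '.': '-'}

# Escape each key so it matches literally, and try longer keys
# first so that 'AB' wins over 'A'.
keys = sorted(mapping, key=len, reverse=True)
pattern = re.compile("|".join(re.escape(k) for k in keys))

def substitute(text):
    # m.group(0) is the matched key; look up its replacement.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

print(substitute('AB.A'))  # X-AD
```

The same escaped pattern and lookup function should be usable with Series.str.replace and regex=True in place of the plain join.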

Upvotes: 0
