O.rka

Reputation: 30757

How to use a nested dictionary with .map for a Pandas Series? pd.Series([]).map

I'm trying to map certain values of a series while keeping the others intact. In this case, I want to change dmso --> dmso-2, naoh --> naoh-2, and water --> water-2, but I'm getting a KeyError.

First I check with a boolean test whether the value is one of the ones of interest: if True, look it up in the dictionary; if False, just return x. I could go in and change them manually, but programming is fun and I can't figure out why this logic doesn't work.

import pandas as pd

# A sample of the series
Se_data = pd.Series({
    'DMSO_S43': 'dmso',
    'DMSO_S44': 'dmso',
    'DOXYCYCLINE-HYCLATE_S25': 'doxycycline-hyclate',
    'DOXYCYCLINE-HYCLATE_S26': 'doxycycline-hyclate'
})

# This boolean works
Se_data.map(lambda x: x in {"dmso", "naoh", "water"})
# DMSO_S43                          True
# DMSO_S44                          True
# DOXYCYCLINE-HYCLATE_S25          False
# DOXYCYCLINE-HYCLATE_S26          False

# This dictionary on the boolean works
Se_data.map(lambda x: {True: "control", False: x}[x in {"dmso", "naoh", "water"}])
# DMSO_S43                                           control
# DMSO_S44                                           control
# DOXYCYCLINE-HYCLATE_S25                doxycycline-hyclate
# DOXYCYCLINE-HYCLATE_S26                doxycycline-hyclate

# This nested dictionary isn't working
Se_data.map(lambda x: {
    True: {"dmso": "dmso-2", "naoh": "naoh-2", "water": "water-2"}[x],
    False: x
}[x in {"dmso", "naoh", "water"}])
# KeyError: 'doxycycline-hyclate'

Upvotes: 1

Views: 632

Answers (1)

Igor Raush

Reputation: 15240

If I understood correctly, you can simply do

Se_data.replace({
    'dmso': 'dmso-2',
    'naoh': 'naoh-2',
    'water': 'water-2',
})

which will leave all other values intact.
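
As a rough illustration of why `replace` fits here, the sketch below compares it with passing the same dictionary straight to `map`. The names `se_data` and `mapping` are made up for the example, and the series is a shortened version of the one in the question. As far as I know, `Series.map` with a plain dict turns values that have no key into NaN, whereas `Series.replace` passes them through.

```python
import pandas as pd

# Shortened sample of the series from the question (names are illustrative)
se_data = pd.Series({
    "DMSO_S43": "dmso",
    "DOXYCYCLINE-HYCLATE_S25": "doxycycline-hyclate",
})
mapping = {"dmso": "dmso-2", "naoh": "naoh-2", "water": "water-2"}

# replace: values without a key in the dict are left unchanged
replaced = se_data.replace(mapping)

# map with a plain dict: values without a key become NaN
mapped = se_data.map(mapping)

print(replaced)
print(mapped)
```

So if you want to stay with `map`, you need a fallback for the unmatched values; `replace` gives you that behaviour without one.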


For what it's worth, your code wasn't working because the expression

{"dmso": "dmso-2", "naoh": "naoh-2", "water": "water-2"}[x]

is evaluated for every x, not just the x in {"dmso", "naoh", "water"}. All the values of a Python dict literal are computed when the dict is built, before the [...] lookup picks one of them; they aren't short-circuited or evaluated lazily like you expected. You could have done something like

Se_data.map(lambda x: {
    "dmso": "dmso-2",
    "naoh": "naoh-2",
    "water": "water-2"
}[x] if x in {"dmso", "naoh", "water"} else x)

or

Se_data.map(lambda x: {
    "dmso": "dmso-2",
    "naoh": "naoh-2",
    "water": "water-2"
}.get(x, x))
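
To see the eager evaluation in isolation, here is a small pandas-free sketch. The function names `relabel_eager` and `relabel_lazy` and the variable `controls` are hypothetical, introduced only for this demonstration; the first mirrors the nested-dictionary lambda from the question, the second mirrors the `.get(x, x)` version.

```python
controls = {"dmso": "dmso-2", "naoh": "naoh-2", "water": "water-2"}

def relabel_eager(x):
    # Both values of the outer dict are computed while the literal is built,
    # so controls[x] runs even when x is not a control and raises KeyError.
    return {True: controls[x], False: x}[x in controls]

def relabel_lazy(x):
    # dict.get only falls back to the default when the key is missing,
    # so no lookup can fail here.
    return controls.get(x, x)

# The eager version fails on a value that is not a control
try:
    relabel_eager("doxycycline-hyclate")
    eager_error = None
except KeyError as exc:
    eager_error = exc

print(repr(eager_error))
print(relabel_lazy("dmso"), relabel_lazy("doxycycline-hyclate"))
```

The eager version still works for the control values themselves, which is why the problem only shows up once a non-control value reaches the lambda.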

Upvotes: 1
