probat

Reputation: 1532

Python 3+: Is It Possible to Replace "if ... in" and "elif ... in" Using a Switcher?

I know how to use a dictionary as a switcher in Python. I'm not sure how to use one for my specific case. I think I will just need to use if, elif, and else but hopefully I am proved wrong by the community :)

I want to make a find/replace function for certain characters in strings. The string is at least one sentence, but usually more, and consists of many words.

Basically what I am doing is the following:

if '\u2011' in string:  # non-breaking hyphen; string is a sentence with many words
    string = string.replace('\u2011', '-')  # replace with dash

elif '\u2013' in string:  # en dash
    string = string.replace('\u2013', '-')

elif '\u2014' in string:  # em dash
    string = string.replace('\u2014', '-')

elif '\u00A0' in string:  # non-breaking space
    string = string.replace('\u00A0', ' ')

# ... and so forth

The only thing I can think of is splitting the string into separate sub-strings and looping through them, at which point the dictionary switcher would work. But this would obviously add a lot of extra processing time, and the purpose of using a dictionary switcher is to save time.

I could not find anything on this specific topic searching everywhere.

Is there a way to use a switcher in Python using if in and elif in?

Upvotes: 0

Views: 372

Answers (3)

Eli Korvigo

Reputation: 10493

Although Benjamin's answer might be right, it is case-specific, while your question has a rather general-purpose tone to it. There is a universal functional approach (I've added Python 3.5 type annotations to make this code self-explanatory):

from typing import Callable, Iterable, Tuple, TypeVar

A = TypeVar('A')
B = TypeVar('B')
Predicate = Callable[[A], bool]
Action = Callable[[A], B]
Switch = Tuple[Predicate, Action]

def switch(switches: Iterable[Switch], default: B, x: A) -> B:
    return next(
        (act(x) for pred, act in switches if pred(x)), default
    )

switches = [
    (lambda x: '\u2011' in x, lambda x: x.replace('\u2011', '-')),
    (lambda x: '\u2013' in x, lambda x: x.replace('\u2013', '-'))
]
a = "I'm–a–string–with–en–dashes"

switch(switches, a, a) # if no switches are matched, return the input

This is overkill in your case, because your example boils down to a regex operation. Note that while switches can be any iterable, you will want something with a predictable iteration order, i.e. any Sequence type (e.g. list or tuple), because only the first action with a matching predicate is applied.
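To make the ordering caveat concrete, here is a minimal sketch. The helper name first_match and the sample string are made up for illustration; the helper mirrors the switch function above without the type annotations.

```python
def first_match(switches, default, x):
    # Return the result of the first action whose predicate accepts x,
    # or the default if no predicate matches.
    return next((act(x) for pred, act in switches if pred(x)), default)

switches = [
    (lambda s: '\u2011' in s, lambda s: s.replace('\u2011', '-')),  # non-breaking hyphen
    (lambda s: '\u2013' in s, lambda s: s.replace('\u2013', '-')),  # en dash
]

mixed = 'a\u2011b\u2013c'  # hypothetical input containing both characters
result = first_match(switches, mixed, mixed)

# Only the first matching switch runs, so the en dash is left untouched.
print(result == 'a-b\u2013c')
```

So if a string can contain several of the characters at once, a first-match switch will not clean all of them in one call.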

Upvotes: 1

user3483203

Reputation: 51175

Just to show that regex is a valid solution, and some timings:

replacements = {
    '\u2011': '-',
    '\u2013': '-',
    '\u2014': '-',
    '\u00A0': ' ', 
}

import re
s = "1‑‑‑‑2–––––––3————————"

re.sub(
    '|'.join(re.escape(x) for x in replacements),
    lambda x: replacements[x.group()], s
)
# Result:
# '1----2-------3--------'

Timings (str.translate wins and is also cleaner):

s = "1‑‑‑‑2–––––––3————————"
s *= 10000

%timeit re.sub('|'.join(re.escape(x) for x in replacements), lambda x: replacements[x.group()], s)
90.7 ms ± 182 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

trans = str.maketrans(replacements)
%timeit s.translate(trans)
15.8 ms ± 59.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
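For anyone who wants to reproduce the comparison outside IPython, a rough sketch using the standard timeit module might look like the following. The function names with_regex and with_translate and the test string are my own choices, and unlike the %timeit line above the pattern is compiled once up front. Timings depend on the machine, so none are shown.

```python
import re
import timeit

replacements = {
    '\u2011': '-',  # non-breaking hyphen
    '\u2013': '-',  # en dash
    '\u2014': '-',  # em dash
    '\u00A0': ' ',  # non-breaking space
}
s = '1\u2011\u2011 2\u2013\u2013 3\u2014\u2014' * 1000

# Build the alternation pattern and the translation table once.
pattern = re.compile('|'.join(re.escape(k) for k in replacements))
trans = str.maketrans(replacements)

def with_regex(text):
    # Look up each matched character in the replacements dict.
    return pattern.sub(lambda m: replacements[m.group()], text)

def with_translate(text):
    # Single pass over the string using the prebuilt table.
    return text.translate(trans)

# Both approaches should produce the same output.
same = with_regex(s) == with_translate(s)

print('regex    :', timeit.timeit(lambda: with_regex(s), number=10))
print('translate:', timeit.timeit(lambda: with_translate(s), number=10))
```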

Upvotes: 2

Patrick Haugh

Reputation: 60974

Here's the str.translate solution

replacements = {
    '\u2011': '-',  # non breaking hyphen
    '\u2013': '-',  # en dash
    '\u2014': '-',  # em dash
    '\u00A0': ' ',  # nbsp
}

trans = str.maketrans(replacements)
new_string = your_string.translate(trans)

Note that this only works if each thing you want to replace is a single character. {'a': 'bb'} is a valid replacements dictionary, but {'bb': 'a'} is not.
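A quick sketch of that limitation (the variable names here are just for illustration): str.maketrans accepts a one-character key mapped to a longer string, but raises ValueError for a key longer than one character.

```python
# A single character mapped to a longer string is accepted.
table = str.maketrans({'a': 'bb'})
expanded = 'cat'.translate(table)

# A multi-character key is rejected when the table is built.
try:
    str.maketrans({'bb': 'a'})
    multi_char_key_ok = True
except ValueError:
    multi_char_key_ok = False

print(expanded, multi_char_key_ok)
```

For multi-character keys, the re.sub approach from the other answer is the usual fallback.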

Upvotes: 4
