root

Reputation: 705

How to insert a character after every 2 characters in a string

Is there a pythonic way to insert an element into every 2nd element in a string?

I have a string: 'aabbccdd' and I want the end result to be 'aa-bb-cc-dd'.

I am not sure how I would go about doing that.

Upvotes: 58

Views: 79146

Answers (9)

Guy Gangemi

Reputation: 1773

This is possible using only iterators on Python 3.12 or later, where itertools.batched is available.

Pros of this solution:

  • 'length' is specified once and can be changed without needing to update the formula.

  • It uses iterators throughout, which keeps memory use low.

  • Doesn't use indices (I'm calling it, they're unpythonic)

    from itertools import batched
    
    s = 'aabbccdd'
    r = '-'.join(''.join(b) for b in batched(s, 2))
    

batched yields the string's characters two at a time, as tuples. Each tuple is joined with an empty string to form a chunk, and the chunks are then joined with '-' until the string is exhausted.
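
On versions before 3.12, where itertools.batched is missing, roughly the same behaviour can be sketched with itertools.islice. The helper name batched_compat is made up for this example and is not part of the standard library; the walrus operator means it needs Python 3.8 or later.

```python
from itertools import islice

def batched_compat(iterable, n):
    """Yield tuples of up to n items (hypothetical stand-in for itertools.batched)."""
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(iterable)
    # islice pulls at most n items per pass; an empty tuple means the input is exhausted
    while batch := tuple(islice(it, n)):
        yield batch

s = 'aabbccdd'
r = '-'.join(''.join(b) for b in batched_compat(s, 2))
print(r)  # aa-bb-cc-dd
```

The last tuple is simply shorter when the length is not a multiple of n, so a trailing character is kept rather than dropped.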

Upvotes: 0

Noam-N

Reputation: 924

I added doctests and input validation to @SilentGhost's answer:

def insert_between_every_n_characters(original: str, inserted: str, step: int) -> str:
    """
    Insert a string between every N characters.

    >>> insert_between_every_n_characters('aabbccdd', '--', 1)
    'a--a--b--b--c--c--d--d'

    >>> insert_between_every_n_characters('aabbccdd', '-', 2)
    'aa-bb-cc-dd'

    >>> insert_between_every_n_characters('aabbccd', ':', 3)
    'aab:bcc:d'

    >>> insert_between_every_n_characters('aabbccdda', ':', 3)
    'aab:bcc:dda'

    >>> insert_between_every_n_characters('a', '-', 2)
    'a'

    >>> insert_between_every_n_characters('', '-', 2)
    ''
    """
    if step <= 0:
        raise ValueError(f"step must be greater than zero. Got: {step}")
    return inserted.join(original[i : i + step] for i in range(0, len(original), step))

Upvotes: 0

Peter Hansen

Reputation: 22087

I tend to rely on a regular expression for this, as it seems less verbose and is usually faster than all the alternatives. Aside from having to face down the conventional wisdom regarding regular expressions, I'm not sure there's a drawback.

>>> import re
>>> s = 'aabbccdd'
>>> '-'.join(re.findall('..', s))
'aa-bb-cc-dd'

This version is strict about actual pairs, though, so an unpaired trailing character is dropped:

>>> t = s + 'e'
>>> '-'.join(re.findall('..', t)) 
'aa-bb-cc-dd'

... so with a tweak you can be tolerant of odd-length strings:

>>> '-'.join(re.findall('..?', t))
'aa-bb-cc-dd-e'
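
The same pattern extends to groups of any size with a bounded quantifier. The following is only a sketch: join_chunks and its parameters are names invented here, and re.DOTALL is added on the assumption that the input may contain newlines, which a plain '.' would skip.

```python
import re

def join_chunks(src, n=2, sep='-'):
    """Hypothetical helper: split src into runs of up to n characters and join with sep."""
    if n < 1:
        raise ValueError("n must be at least one")
    # '.{1,n}' is greedy, so every chunk is full except possibly the last one.
    # re.DOTALL lets '.' match newlines too.
    pattern = re.compile('.{1,%d}' % n, re.DOTALL)
    return sep.join(pattern.findall(src))

print(join_chunks('aabbccdde'))      # aa-bb-cc-dd-e
print(join_chunks('aabbccdd', n=3))  # aab-bcc-dd
```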

Usually you're doing this more than once, so maybe get a head start by creating a shortcut ahead of time:

PAIRS = re.compile('..').findall

out = '-'.join(PAIRS(src))  # src is the input string; 'in' is a reserved word

Or what I would use in real code:

def rejoined(src, sep='-', _split=re.compile('..').findall):
    return sep.join(_split(src))

>>> rejoined('aabbccdd', sep=':')
'aa:bb:cc:dd'

I use something like this from time to time to create MAC address representations from 6-byte binary input:

>>> addr = b'\xdc\xf7\x09\x11\xa0\x49'
>>> rejoined(addr[::-1].hex(), sep=':')
'49:a0:11:09:f7:dc'
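
For this particular MAC address case, Python 3.8 and later can insert the separator directly, because bytes.hex() accepts an optional separator argument. A minimal sketch using the same addr value:

```python
addr = b'\xdc\xf7\x09\x11\xa0\x49'

# bytes.hex() takes an optional separator from Python 3.8 onward,
# so no regular expression or slicing is needed for byte-wise grouping
mac = addr[::-1].hex(':')
print(mac)  # 49:a0:11:09:f7:dc
```

This only helps when the input is bytes; for an arbitrary string the regex approach still applies.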

Upvotes: 7

Nuno André

Reputation: 5377

As PEP 8 states:

Do not rely on CPython's efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b. This optimization is fragile even in CPython (it only works for some types) and isn't present at all in implementations that don't use refcounting.

A Pythonic way of doing this, which avoids this kind of concatenation and allows you to join iterables other than strings, could be:

':'.join(f'{s[i:i+2]}' for i in range(0, len(s), 2))

A more functional alternative could be:

':'.join(map('{}{}'.format, s[::2], s[1::2]))

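This second approach has a particular feature (or bug) of only joining complete pairs of letters. So:

>>> s = 'abcdefghij'
>>> ':'.join(map('{}{}'.format, s[::2], s[1::2]))
'ab:cd:ef:gh:ij'

and:

>>> s = 'abcdefghi'
>>> ':'.join(map('{}{}'.format, s[::2], s[1::2]))
'ab:cd:ef:gh'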
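
To make the difference between the two approaches concrete, here is a small side-by-side sketch. The function names slice_join and pair_join are invented for this comparison.

```python
def slice_join(s, sep=':'):
    # slicing keeps a trailing single character as its own group
    return sep.join(s[i:i+2] for i in range(0, len(s), 2))

def pair_join(s, sep=':'):
    # map() stops at the shorter of the two slices, so an unpaired
    # trailing character is silently dropped
    return sep.join(map('{}{}'.format, s[::2], s[1::2]))

for s in ('abcdefghij', 'abcdefghi'):
    print(slice_join(s), pair_join(s))
# ab:cd:ef:gh:ij ab:cd:ef:gh:ij
# ab:cd:ef:gh:i ab:cd:ef:gh
```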

Upvotes: -1

Tony Veijalainen

Reputation: 5555

Here is a list-comprehension approach that chooses each value based on the index modulo 2; an odd last character ends up in a group of its own:

for s in ['aabbccdd', 'aabbccdde']:
    # prefix '-' to every character at a non-zero even index
    print(''.join([char if not ind or ind % 2 else '-' + char
                   for ind, char in enumerate(s)]))
""" Output:
aa-bb-cc-dd
aa-bb-cc-dd-e
"""

Upvotes: 1

Dave Kirby

Reputation: 26572

If you want to preserve the last character when the string has an odd length, you can modify KennyTM's answer to use itertools.zip_longest (named izip_longest in Python 2):

>>> s = "aabbccd"
>>> from itertools import zip_longest
>>> '-'.join(a+b for a,b in zip_longest(s[::2], s[1::2], fillvalue=""))
'aa-bb-cc-d'

or

>>> t = iter(s)
>>> '-'.join(a+b for a,b in zip_longest(t, t, fillvalue=""))
'aa-bb-cc-d'
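
The shared-iterator form generalises to groups of any size. This is a sketch for Python 3 (where the function is called zip_longest); group_join is a hypothetical name, not something from the answers above.

```python
from itertools import zip_longest

def group_join(s, n=2, sep='-'):
    """Hypothetical helper: join groups of n characters, keeping a short final group."""
    it = iter(s)
    # [it] * n repeats the same iterator object n times, so each output tuple
    # consumes n consecutive characters; fillvalue='' pads the last tuple
    return sep.join(''.join(group) for group in zip_longest(*[it] * n, fillvalue=''))

print(group_join('aabbccd'))       # aa-bb-cc-d
print(group_join('aabbccd', n=3))  # aab-bcc-d
```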

Upvotes: 5

kennytm

Reputation: 523544

Assuming the string's length is always even:

>>> s = '12345678'
>>> t = iter(s)
>>> '-'.join(a+b for a,b in zip(t, t))
'12-34-56-78'

The iterator t can also be eliminated with:

>>> '-'.join(a+b for a,b in zip(s[::2], s[1::2]))
'12-34-56-78'

The algorithm is to group the string into pairs, then join them with the - character.

The code works as follows. First, the string is split into the characters at even indices and those at odd indices (for this input, the odd digits and the even digits).

>>> s[::2], s[1::2]
('1357', '2468')

Then the zip function is used to combine them into an iterable of tuples.

>>> list( zip(s[::2], s[1::2]) )
[('1', '2'), ('3', '4'), ('5', '6'), ('7', '8')]

But tuples aren't what we want; this should be a list of strings. That is the purpose of the list comprehension:

>>> [a+b for a,b in zip(s[::2], s[1::2])]
['12', '34', '56', '78']

Finally we use str.join() to combine the list.

>>> '-'.join(a+b for a,b in zip(s[::2], s[1::2]))
'12-34-56-78'

The first piece of code uses the same idea but consumes less memory on long strings, because it does not build the two intermediate slices.
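
Why zip(t, t) produces pairs may not be obvious at first. The sketch below steps through it by hand; the variable names first_two, rest, and independent are introduced only for illustration.

```python
s = '12345678'
t = iter(s)

# zip asks its first argument for an item, then its second; both are the same
# iterator object, so each tuple takes two consecutive characters
first_two = (next(t), next(t))
rest = list(zip(t, t))
print(first_two)  # ('1', '2')
print(rest)       # [('3', '4'), ('5', '6'), ('7', '8')]

# two independent iterators would pair every character with itself instead
independent = list(zip(iter('abc'), iter('abc')))
print(independent)  # [('a', 'a'), ('b', 'b'), ('c', 'c')]
```

With an odd-length string, zip stops as soon as the second request comes up empty, so the final character is dropped; that is why the answer assumes an even length.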

Upvotes: 60

chryss

Reputation: 7519

This one-liner does the trick. It will drop the last character if your string has an odd number of characters.

"-".join([''.join(item) for item in zip(mystring1[::2],mystring1[1::2])])

Upvotes: 0

SilentGhost

Reputation: 319871

>>> s = 'aabbccdd'
>>> '-'.join(s[i:i+2] for i in range(0, len(s), 2))
'aa-bb-cc-dd'

Upvotes: 78
