Paul Nathan

Reputation: 40309

'in-place' string modifications in Python

In Python, strings are immutable.

What is the standard idiom to walk through a string character-by-character and modify it?

The only methods I can think of are some genuinely stanky hacks related to joining against a result string.

--

In C:

for(int i = 0; i < strlen(s); i++)
{
   s[i] = F(s[i]);
}

This is super expressive and says exactly what I am doing. That is what I am looking for.

Upvotes: 41

Views: 92249

Answers (16)

In Python, strings are immutable.

In Python, if you get an id(string) and keep a reference to the string, you can depend on that string being immutable. Every implementation detail that uses strings depends on it.

Foremost, collections that hash keys depend on the string being immutable. So, if you modify a string in-place using e.g. the method from this answer, you must be absolutely sure that the string is not used anywhere else. That includes the debugger and clients of the debugger, e.g. an IDE.

So, while it is obviously technically possible to mutate a string, doing so will break things, and the most likely things to subtly break will be the profiler, debugger, and any modules that instrument, introspect or generate code, e.g. attrs.

In other words, it's not merely "a bad idea" to mutate Python strings. It leads to C-style undefined behavior. That's the last thing you want in a Python program!
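To make the point concrete, here is a small sketch (the variable names are made up for illustration) showing that "modifying" a string through the normal language features never touches the original object, which is why a dict keyed on it stays valid:

```python
original = "abc"
alias = original           # a second reference to the same string object

original += "d"            # builds a new string and rebinds the name

print(alias)               # the old object is untouched
print(original is alias)   # the two names now refer to different objects

lookup = {alias: 1}        # hashing relies on the key never changing
print(lookup["abc"])
```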

What is the standard idiom to walk through a string character-by-character and modify it?

There is none. You can create a new string in any way you want. E.g.

def repeated_chars(s: str, n: int) -> str:
    """
    Returns a string with each original character repeated n times, in order.
    >>> repeated_chars('12', 3)
    '111222'
    """
    result = ""
    for ch in s:
        result += ch * n
    return result
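
As an alternative sketch, the same result can be built with str.join and a generator expression, which avoids repeated concatenation. The name repeated_chars_join is only for illustration:

```python
def repeated_chars_join(s: str, n: int) -> str:
    """Return a string with each character of s repeated n times, in order."""
    # Each character is expanded to n copies, then joined into one new string.
    return "".join(ch * n for ch in s)


print(repeated_chars_join("12", 3))  # prints 111222
```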

Upvotes: 0

AbhinayBoda

Reputation: 1

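Here's my Pythonic solution for reversing each word of a string while keeping the whitespace where it is. It builds a new string rather than modifying the original.

Whitespace is accounted for too: every whitespace character is emitted as a single space.

Note: \w only matches letters, digits and the underscore ( '_' ), so any other special characters in input_string are dropped from the output.

Input: "Hello World" => Output: "olleH dlroW"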

import re

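def inplace_reversal(input_string):
    # Each match is either one whitespace character (empty group) or a word.
    list_of_strings = re.findall(r'\s|(\w+)', input_string)
    output_string = ''
    for string in list_of_strings:
        if string == '':
            # Whitespace matched, so the capture group is empty.
            output_string += ' '
        else:
            output_string += string[::-1]
    return output_string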

print(inplace_reversal('__Hello__ __World__         __Hello__       __World__ '))

>>> __olleH__ __dlroW__         __olleH__       __dlroW__ 

Upvotes: -1

Jie.Gao

Reputation: 1

I did that like this:

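import os
import shutil
import tempfile

...

# Text mode ('w') so that str lines can be written in Python 3.
with tempfile.NamedTemporaryFile('w') as tmp:
    with open(input_file, 'r') as f_old:
        for line in f_old:
            tmp.write(line.replace(old_string, new_string))
    tmp.flush()
    os.fsync(tmp.fileno())
    # Copying by name while tmp is still open works on Unix, not on Windows.
    shutil.copy2(tmp.name, input_file)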

Upvotes: 0

Michael Lipp

Reputation: 400

The question first states that strings are immutable and then asks for a way to change them in place. This is kind of contradictory. Anyway, as this question pops up at the top of the list when you search for "python string in-place modification", I'm adding the answer for a real in place change.

Strings seem to be immutable when you look at the methods of the string class. But no language with an interface to C can really provide immutable data types. The only question is whether you have to write C code in order to achieve the desired modification.

Here python ctypes is your friend. As it supports getting pointers and includes C-like memory copy functions, a python string can be modified in place like this:

import ctypes

# Python 2 only: str is a byte string here, so c_char_p points at its buffer.
s = 16 * "."
print s
ctypes.memmove(ctypes.c_char_p(s), "Replacement", 11)
print s

Results in:

................
Replacement.....

(Of course, you can calculate the replacement string at runtime by applying a function F to every character of the original string. Different ways of doing this have been shown in the previous answers.)

Note that I do not in any way encourage doing this. However, I had to write a replacement for a class that was mapped from C++ to python and included a method:

int readData(char* data, int length)

(The caller is supposed to provide memory with length bytes and the method then writes the available data -- up to length -- into that memory, returning the number of bytes written.) While this is a perfectly sensible API in C/C++, it should not have been made available as method of a python class or at least the users of the API should be made aware that they may only pass mutable byte arrays as parameter.

As you might expect, "common usage" of the method is as shown in my example (create a string and pass it together with its length as arguments). As I did not really want to write a C/C++ extension I had to come up with a solution for implementing the behavior in my replacement class using python only.
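
For comparison, here is a rough sketch of what such an API could look like if callers passed a mutable bytearray instead of a string. The class ReplacementReader and the method read_data are hypothetical names invented for this illustration; they are not the actual class I had to replace:

```python
class ReplacementReader:
    """Hypothetical stand-in for the C++-mapped class (illustration only)."""

    def __init__(self, payload: bytes):
        self._pending = payload

    def read_data(self, data: bytearray, length: int) -> int:
        # Write at most `length` bytes into the caller's buffer, in place.
        n = min(length, len(data), len(self._pending))
        data[:n] = self._pending[:n]
        self._pending = self._pending[n:]
        return n  # number of bytes written


buf = bytearray(16)  # mutable memory provided by the caller
reader = ReplacementReader(b"Replacement")
written = reader.read_data(buf, len(buf))
print(written)
print(bytes(buf[:written]))
```

Because the slice assignment replaces n bytes with n bytes, the buffer keeps its length and no immutable object is touched.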

Upvotes: 6

Odomontois

Reputation: 16308

You can use the StringIO class (io.StringIO in Python 3) to get a mutable, file-like interface to a string's contents.
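
A minimal sketch of that idea, assuming Python 3's io.StringIO. Note that it works on a copy of the text held in the buffer; the original string object is not changed:

```python
import io

buf = io.StringIO("abcdefgh")  # buffer initialised with the text
buf.seek(3)                    # move to the character to overwrite
buf.write("r")                 # overwrites the character at position 3
result = buf.getvalue()        # read the whole buffer back as a new str
print(result)
```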

Upvotes: 0

Javier

Reputation: 4623

I'd say the most Pythonic way is to use map():

s = "".join(map(func, s)) # func has been applied to every character in s

This is the equivalent of writing:

s = "".join(func(c) for c in s)

Upvotes: 7

Tim McNamara

Reputation: 18385

string.translate is probably the closest function to what you're after.

Upvotes: 3

bstpierre

Reputation: 31206

Don't use a string, use something mutable like bytearray:

#!/usr/bin/env python3

s = bytearray(b"my dog has fleas")
for n in range(len(s)):
    # Items of a bytearray are ints, so convert back with ord().
    s[n] = ord(chr(s[n]).upper())
print(s.decode())

Results in:

MY DOG HAS FLEAS

Edit:

Since this is a bytearray, you aren't (necessarily) working with characters. You're working with bytes. So this works too:

s = bytearray(b"\x81\x82\x83")
for n in range(len(s)):
    s[n] = s[n] + 1
print(repr(s))

gives:

bytearray(b'\x82\x83\x84')

If you want to modify characters in a Unicode string, you'd maybe want to work with memoryview, though that doesn't support Unicode directly.
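
To illustrate the bytes-versus-characters caveat, here is a small sketch assuming Python 3 and UTF-8 (the sample text is arbitrary). A non-ASCII character takes more than one byte, so indexes into the bytearray no longer line up with characters; a list of characters does not have that problem:

```python
text = "caf\u00e9"              # four characters, the last one is non-ASCII
raw = bytearray(text, "utf-8")  # UTF-8 needs two bytes for that character

print(len(text))                # number of characters
print(len(raw))                 # number of bytes

chars = list(text)              # one list item per character
chars[3] = "e"
fixed = "".join(chars)
print(fixed)
```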

Upvotes: 25

Ned Batchelder

Reputation: 375574

The Python analog of your C:

for(int i = 0; i < strlen(s); i++)
{
   s[i] = F(s[i]);
}

would be:

s = "".join(F(c) for c in s)

which is also very expressive. It says exactly what is happening, but in a functional style rather than a procedural style.
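
As a runnable sketch of that one-liner, with a made-up F that uppercases vowels (any function that takes one character and returns a string would do):

```python
def F(c):
    # Hypothetical per-character transformation, for illustration only.
    return c.upper() if c in "aeiou" else c


s = "stack overflow"
s = "".join(F(c) for c in s)  # builds a new string and rebinds s
print(s)
```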

Upvotes: 19

John La Rooy

Reputation: 304147

Here is an example using translate to switch "-" with "." and uppercase "a"s

>>> trans_table = str.maketrans(".-a", "-.A")
>>> "foo-bar.".translate(trans_table)
'foo.bAr-'

This is much more efficient than converting to a bytearray and back if you only need single-character replacements.

Upvotes: 1

killown

Reputation: 4917

You can use the UserString module (Python 2 only; MutableString was removed in Python 3):

>>> import UserString
>>> s = UserString.MutableString('Python')
>>> print s
Python
>>> s[0] = 'c'
>>> print s
cython

Upvotes: 11

Zimm3r

Reputation: 3425

If I ever need to do something like that, I just convert the string to a mutable list.

For example (though here it would be easier to use sort; see the second example):

>>> s = "abcdfe"
>>> chars = list(s)
>>> chars[4] = "e"
>>> chars[5] = "f"
>>> s = ''.join(chars)
>>> print(s)
abcdef
>>> # second example: sort the characters instead
>>> chars = list("abcdfe")
>>> chars.sort()
>>> ''.join(chars)
'abcdef'

Upvotes: 1

David Z

Reputation: 131570

Assigning a particular character to a particular index in a string is not a particularly common operation, so if you find yourself needing to do it, think about whether there may be a better way to accomplish the task. But if you do need to, probably the most standard way would be to convert the string to a list, make your modifications, and then convert it back to a string.

s = 'abcdefgh'
l = list(s)
l[3] = 'r'
s2 = ''.join(l)

EDIT: As posted in bstpierre's answer, bytearray is probably even better for this task than list, as long as you're not working with Unicode strings.

s = 'abcdefgh'
b = bytearray(s, 'ascii')  # only safe for ASCII text
b[3] = ord('r')            # items of a bytearray are ints
s2 = b.decode('ascii')

Upvotes: 2

Joe Koberg

Reputation: 26699

def modifyIdx(s, idx, newchar):
    return s[:idx] + newchar + s[idx+1:]
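
A quick usage sketch, with the helper repeated so the snippet runs on its own (the sample values are arbitrary). Slicing builds a new string and leaves the original untouched:

```python
def modifyIdx(s, idx, newchar):
    # Everything before idx, the new character, then everything after idx.
    return s[:idx] + newchar + s[idx+1:]


s = "abcdefgh"
s2 = modifyIdx(s, 3, "r")
print(s2)  # prints abcrefgh
print(s)   # prints abcdefgh
```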

Upvotes: 0

Jungle Hunter

Reputation: 7285

>>> mystring = "Th1s 1s my str1ng"
>>> mystring.replace("1", "i")
'This is my string'

If you want to keep the new string, you'll have to reassign it: mystring = mystring.replace("1", "i"). This is because strings in Python are immutable.

Upvotes: 1

jathanism

Reputation: 33716

Strings are iterable and can be walked through like lists. Strings also have a number of basic methods, such as .replace(), that might be what you're looking for. String methods that transform text return a new string, so instead of modifying the string in place you simply rebind the name to the new value.

>>> mystring = 'robot drama'
>>> mystring = mystring.replace('r', 'g')
>>> mystring
'gobot dgama'

Upvotes: 2
