aznxpress4men

Reputation: 13

Python Code to Replace First Letter of String: String Index Error

Currently, I am working on parsing resumes to remove "-" only when it is used at the beginning of each line. I've tried identifying the first character of each string after the text has been split. Below is my code:

for line in text.split('\n'):
    if line[0] == "-":
        line[0] = line.replace('-', ' ')

line is a string. This is my way of thinking, but every time I run this I get the error IndexError: string index out of range. I'm not sure why, because line is a string, so its first element should be recognized. Thank you!

Upvotes: 1

Views: 1811

Answers (2)

Gaurang Shah

Reputation: 12910

This could be due to empty lines: an empty string has no first character, so line[0] raises IndexError. You could just check that the line is non-empty before taking the index.

new_text = []
text = "-testing\nabc\n\n\nxyz"
for line in text.split("\n"):
    if line and line[0] == '-':
        line = line[1:]
    new_text.append(line)

print("\n".join(new_text))
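
As a related sketch (not part of the code above): slicing never raises IndexError, so comparing line[:1] instead of line[0] also copes with empty lines. The function name strip_leading_dash is made up for illustration.

```python
def strip_leading_dash(text):
    """Remove one leading "-" from each line (hypothetical helper name)."""
    cleaned = []
    for line in text.split("\n"):
        # line[:1] is "" for an empty line, so no IndexError is raised
        if line[:1] == "-":
            line = line[1:]
        cleaned.append(line)
    return "\n".join(cleaned)


print(strip_leading_dash("-testing\nabc\n\n\nxyz"))
```

The empty lines pass through unchanged because the slice comparison is simply false for them.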

Upvotes: 0

Jean-François Fabre

Reputation: 140168

You get the IndexError because some lines are empty, and an empty string has no first character.

Beyond that, your replacement is wrong:

  • first, because it tries to assign to the first "character" of the line, but strings are immutable, so you cannot change one in place
  • second, because the replacement value is the whole string with every dash replaced, not just the first character
  • third, because line is rebound at the next iteration, so the change is lost. The original list of lines is lost too, by the way.
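
The first two points, plus the empty-line error, can be reproduced in isolation. This is only an illustrative sketch; the variable names are not from the question.

```python
# Indexing an empty string fails: this is the error from the question.
try:
    ""[0]
except IndexError as exc:
    index_error = str(exc)

# Strings are immutable, so assigning to line[0] fails as well.
line = "-yes--"
try:
    line[0] = " "
except TypeError as exc:
    type_error = str(exc)

# replace() touches every dash in the string, not only the first one.
replaced = line.replace("-", " ")

print(index_error)
print(type_error)
print(repr(replaced))
```

So even after guarding against empty lines, the original assignment would fail with a TypeError on the first line that starts with a dash.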

If you want to remove the first character of a string, there's no need for replace: just slice the string (that way you don't risk removing other similar characters).

A working solution is to test with startswith and build a new list of strings, then join them back together:

text = """hello
-yes--
who are you"""

new_text = []

for line in text.splitlines():
    if line.startswith("-"):
        line = line[1:]
    new_text.append(line)

print("\n".join(new_text))

result:

hello
yes--
who are you

With more experience, you can pack this code into a list comprehension:

new_text = "\n".join([line[1:] if line.startswith("-") else line for line in text.splitlines()])
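
If you can assume Python 3.9 or later (an assumption, the question does not state a version), str.removeprefix does the same check-and-slice in one call. A minimal sketch on the same sample text:

```python
text = """hello
-yes--
who are you"""

# removeprefix returns the string unchanged when it does not start with "-",
# and strips at most one occurrence of the prefix
new_text = "\n".join(line.removeprefix("-") for line in text.splitlines())
print(new_text)
```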

Finally, the regular expression module is also a nice alternative:

import re
print(re.sub(r"^-", "", text, flags=re.MULTILINE))

This removes the dash from every line that starts with one. The MULTILINE flag tells the regex engine to treat ^ as the start of each line, not only the start of the buffer.
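
To see what the flag changes, here is a small comparison on a made-up sample string (the sample text and variable names are only for illustration):

```python
import re

text = "-first\n-second\nthird-"

# Without the flag, ^ matches only at the very start of the string.
without_flag = re.sub(r"^-", "", text)

# With re.MULTILINE, ^ also matches right after each newline.
with_flag = re.sub(r"^-", "", text, flags=re.MULTILINE)

print(repr(without_flag))
print(repr(with_flag))
```

In both cases the trailing dash on the last line is left alone, since it is not at the start of a line.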

Upvotes: 4
