Nebulosar

Reputation: 1855

Replace leading whitespace with another char - Python

I want to replace my leading whitespace with one &nbsp; per whitespace character.

So:

spam --> spam
 eggs --> &nbsp;eggs
  spam eggs --> &nbsp;&nbsp;spam eggs

I've seen a couple of solutions using regex, but all of them are in other languages. I've tried the following in Python, with no luck:

import re

raw_line = '  spam eggs'

line = re.subn(r'\s+', '&nbsp;', raw_line, len(raw_line))
print(line) # outputs ('&nbsp;spam&nbsp;eggs', 2)

line = re.sub(r'\s+', '&nbsp;', raw_line)
print(line) # outputs &nbsp;spam&nbsp;eggs

line = re.sub(r'^\s', '&nbsp;', raw_line)
print(line) # outputs &nbsp; spam eggs

line = re.sub(r'^\s+', '&nbsp;', raw_line)
print(line) # outputs &nbsp;spam eggs

The last attempt seems to be the closest, but still no cigar: it collapses all the leading whitespace into a single &nbsp;.

What is the proper way to replace each leading whitespace character with &nbsp; in Python?

If there is a clean way to do this without regex, I will gladly accept, but I couldn't figure it out by myself.

Upvotes: 4

Views: 961

Answers (4)

Sundeep

Reputation: 23667

With the third-party regex module (as answered in a comment by Wiktor Stribiżew):

>>> import regex
>>> line = 'spam'
>>> regex.sub(r'\G\s', '&nbsp;', line)
'spam'

>>> line = ' eggs'
>>> regex.sub(r'\G\s', '&nbsp;', line)
'&nbsp;eggs'

>>> line = '  spam eggs'
>>> regex.sub(r'\G\s', '&nbsp;', line)
'&nbsp;&nbsp;spam eggs'

From the regex module documentation:

\G

A search anchor has been added. It matches at the position where each search started/continued and can be used for contiguous matches or in negative variable-length lookbehinds to limit how far back the lookbehind goes

Upvotes: 1

Austin

Reputation: 26039

A non-regex solution:

s = '  spam eggs'
s_s = s.lstrip()
print('&nbsp;'*(len(s) - len(s_s)) + s_s)
# &nbsp;&nbsp;spam eggs

Upvotes: 0

zwer

Reputation: 25789

You don't even need an expensive regex here. Just strip the leading whitespace and prepend one &nbsp; for each character that was stripped:

def replace_leading(source, char="&nbsp;"):
    stripped = source.lstrip()
    return char * (len(source) - len(stripped)) + stripped

print(replace_leading("spam"))         # spam
print(replace_leading(" eggs"))        # &nbsp;eggs
print(replace_leading("  spam eggs"))  # &nbsp;&nbsp;spam eggs
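One caveat worth sketching: lstrip() with no argument strips every kind of leading whitespace, so a leading tab or newline would also be turned into an &nbsp;. If only literal spaces should be converted, a possible variant passes " " to lstrip. The name replace_leading_spaces is made up for this illustration and is not part of the answer above.

```python
def replace_leading_spaces(source, char="&nbsp;"):
    # lstrip(" ") removes only literal space characters,
    # so leading tabs and newlines are left untouched.
    stripped = source.lstrip(" ")
    return char * (len(source) - len(stripped)) + stripped

print(replace_leading_spaces("  spam eggs"))    # &nbsp;&nbsp;spam eggs
print(repr(replace_leading_spaces(" \tspam")))  # '&nbsp;\tspam'
```

Whether tabs should count as leading whitespace depends on how the output is rendered, so treat this as one option rather than the right answer.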

Upvotes: 5

tobias_k

Reputation: 82899

You can use re.sub with a callback function and evaluate the length of the match:

>>> import re
>>> raw_line = '  spam eggs'
>>> re.sub(r"^\s+", lambda m: "&nbsp;" * len(m.group()), raw_line)
'&nbsp;&nbsp;spam eggs'
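If the input holds several lines, the same callback idea can be extended with re.MULTILINE so that ^ anchors at the start of every line. The sketch below assumes only literal spaces should be replaced; the pattern uses a plain space instead of \s because \s would also match the newlines between lines. The helper name replace_leading_per_line is hypothetical.

```python
import re

def replace_leading_per_line(text, char="&nbsp;"):
    # With re.MULTILINE, ^ matches at the start of each line,
    # so every line's run of leading spaces is replaced separately.
    return re.sub(r"^ +", lambda m: char * len(m.group()), text, flags=re.MULTILINE)

text = "spam\n eggs\n  spam eggs"
print(replace_leading_per_line(text))
# spam
# &nbsp;eggs
# &nbsp;&nbsp;spam eggs
```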

Upvotes: 1
