python - replace string in bytes (one more 'str' does not support the buffer interface)

Question

I have a 'cell' variable. Please note that it is NOT an htm or html file; it is the content of a cell in an .xlsx file. The text in it contains many links (only 2 are shown here as an example), and all of them should be replaced.

There is also a txt file with the original links and the links that should replace them. After parsing the txt file we have 2 lists:

is_list - list of links which should be deleted
should_be_list - list of links which should replace the deleted ones.

So:

import re

cell = b' About Us
    
 A Caring Home Care Services started in 2007 in Southwestern Louisiana. Our Mission is to provide quality Homemodel. 
 Below is a list of services you will provide as a Franchisee 
  Apartment and Home Cleaning
 Chef Services
 Handyman and Remodeling Services
 In-Home Non-Medical Elderly Care
 Interior Decorator
 Lawn Care Services
 
   
 If you are an Entrepreneur and looking to get in the Home care Industry, then A Caring Home Care today, and we will mail you out our Franchisee Information Booklet. Come join our winning TEAM.
 '

is_list = ['',
        ' ']

should_be_list = ['',
        ' ']

If I try to use replace, I get this error:

for i in range(2):
    cell.replace(is_list[i], should_be_list[i])

print (cell)
"""
Traceback (most recent call last):
  File "I:\15.py", line 11, in <module>
    cell.replace(is_list[i], should_be_list[i])
TypeError: 'str' does not support the buffer interface

"""

If I try to use re.sub, I get this error:

for i in range(2):
    result = re.sub(is_list[i], should_be_list[i], cell)
print (cell)

"""
Traceback (most recent call last):
  File "I:\15.py", line 24, in <module>
    result = re.sub(is_list[i], should_be_list[i], cell)
  File "c:\Python34\lib\re.py", line 179, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "c:\Python34\lib\re.py", line 294, in _compile
    p = sre_compile.compile(pattern, flags)
  File "c:\Python34\lib\sre_compile.py", line 568, in compile
    p = sre_parse.parse(p, flags)
  File "c:\Python34\lib\sre_parse.py", line 760, in parse
    p = _parse_sub(source, pattern, 0)
  File "c:\Python34\lib\sre_parse.py", line 370, in _parse_sub
    itemsappend(_parse(source, state))
  File "c:\Python34\lib\sre_parse.py", line 516, in _parse
    raise error("bad character range")
sre_constants.error: bad character range
"""

Please help. How can I do this replacement?

tdelaney · Accepted Answer

Encode the strings in both lists to bytes and use those. I'm choosing ascii because I don't know enough about how the original text files and embedded URLs are encoded. There are several ways to deal with URL encodings (and the hostname tends to be encoded differently from the path and query), and I think I'll avoid touching that third rail here.

is_list_b = [item.encode('ascii') for item in is_list]
should_be_list_b = [item.encode('ascii') for item in should_be_list]

...
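
To make the idea concrete, here is a minimal sketch of the whole replacement. The cell content and the links are hypothetical stand-ins, since the real ones did not survive in the question, and the helper names replace_links and replace_links_re are mine, not part of any library. Two things are worth noting: bytes objects are immutable, so the result of replace has to be assigned back (the loop in the question discards it), and if you prefer re.sub, the pattern should go through re.escape so that characters such as `[`, `-` and `?` in a link are matched literally. Unescaped metacharacters like these are a likely cause of the "bad character range" error, though I can't confirm that without seeing the actual links.

```python
import re

# Hypothetical stand-ins for the real data, which is not shown in the question.
cell = (b'See <a href="http://old.example.com/a">A</a> and '
        b'<a href="http://old.example.com/b?x=[1-0]">B</a>')
is_list = ['http://old.example.com/a', 'http://old.example.com/b?x=[1-0]']
should_be_list = ['http://new.example.com/a', 'http://new.example.com/b']

# Encode both lists so that every argument to bytes.replace is bytes.
is_list_b = [item.encode('ascii') for item in is_list]
should_be_list_b = [item.encode('ascii') for item in should_be_list]


def replace_links(data, old_items, new_items):
    """Return a copy of data with each old link replaced by its new link."""
    for old, new in zip(old_items, new_items):
        # bytes.replace returns a new object, so rebind the name.
        data = data.replace(old, new)
    return data


def replace_links_re(data, old_items, new_items):
    """Same replacement using re.sub with literal (escaped) patterns."""
    for old, new in zip(old_items, new_items):
        # re.escape makes the link match literally; using a function as the
        # replacement avoids backslash processing in the new link.
        data = re.sub(re.escape(old), lambda m, new=new: new, data)
    return data


result = replace_links(cell, is_list_b, should_be_list_b)
print(result)
```

For plain literal substitutions like this, bytes.replace is the simpler choice; the re.sub variant is only worth it if you later need real patterns.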
