jon-el

Reputation: 39

Python: Removing backslashes inside a string

New to Python here, and trying to get the hang of regular expressions.

I'm trying to remove backslashes from inside a string. It's part of a function that pulls comments from Reddit, cleans them up, and makes them into one long string (or, at least that's my aim). When I run the function, the text comes through with an additional backslash where there was an apostrophe in the original text, e.g. " It\'s been a few years "

I know there are other posts on the topic, and I've tried the resulting recommendations, .replace("\", "") and .replace("\\", ""). No luck. Also no luck with .decode.

I'm clearly missing something. Any ideas?

PS — Unrelated, but is it possible to gang up the .sub clauses in the way you can with the .replace ones, rather than have each one on a new line?

Thanks in advance!

import re

# Assumes `reddit` is an already-authenticated praw.Reddit instance.
list_reddit = []
subreddit = reddit.subreddit('politics')
hot_python = subreddit.hot(limit=1)
for submission in hot_python:
    comments = submission.comments
    for comment in comments:
        reddit_text = comment.body
        nospaces = reddit_text.replace('\n', ' ').replace('&#039', ' ')
        formatone = re.sub(r"http\S+", ' ', nospaces)
        formattwo = re.sub(r"https\S+", ' ', formatone)
        list_reddit.append(formattwo)

# Join once, after all comments have been collected.
onestring = ' '.join(list_reddit)

Upvotes: 3

Views: 4232

Answers (1)

Allan

Reputation: 12456

You should call replace with the backslash escaped, inside single quotes:

string.replace('\\','')
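To illustrate, here is a minimal sketch of that call. The helper name strip_backslashes and the sample strings are made up for the example. It also shows one possible reason the replace can appear to do nothing: when a string contains both quote types, repr() displays the apostrophe as \' even though the string holds no backslash at all. Whether that is what is happening in the question is only an assumption.

```python
def strip_backslashes(text):
    # '\\' is a string literal containing exactly one backslash character.
    # str.replace returns a new string, so the result must be kept.
    return text.replace('\\', '')


# Case 1: the string really contains a backslash.
raw = "It\\'s been a few years"
print(strip_backslashes(raw))   # It's been a few years

# Case 2: no backslash in the data, only in how it is displayed.
s = 'It\'s been "a few" years'
print(repr(s))                  # 'It\'s been "a few" years'
print(s)                        # It's been "a few" years
print('\\' in s)                # False
```

If the second case applies, there is nothing to remove: printing the string itself (rather than a list or other container that holds it) shows the text without the backslash.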

Good luck!
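On the PS in the question: one way to avoid a separate re.sub line per pattern is to join the patterns with | in a single regular expression. This is a rough sketch, assuming the same cleanup steps as in the question; the names CLEANUP and clean_comment are invented here. Note that http\S+ already matches https links, so a second pattern for https is not needed.

```python
import re

# One alternation covers links, the &#039 entity (with or without
# its trailing semicolon), and newlines.
CLEANUP = re.compile(r"http\S+|&#039;?|\n")


def clean_comment(text):
    # Replace every match with a single space, as the question's code does.
    return CLEANUP.sub(' ', text)


sample = "It&#039;s here\nsee https://example.com/x now"
print(clean_comment(sample).split())
```

Compiling the pattern once also avoids rebuilding it for every comment in the loop.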

Upvotes: 1
