Anarach

Reputation: 458

Stripping string in python between quotes

I have a string

text:u'tsod'

The "text:u" is part of the string.

I want to extract only the characters between the single quotes. I know how to slice a string by position, but I do not know how to do it based on a character value, in my case the single quote ('). I simply want to extract "tsod".

Also, how will Python know which ' is the opening quote and which is the closing quote, since both are the same character?

Upvotes: 2

Views: 5543

Answers (3)

Giovanni

Reputation: 99

From https://stackoverflow.com/a/69891301/1531728

Tested with the loop/enumeration approach [Booboo2020] and a regular expression modified from [Avinash2021] and [user17405772021] so that it matches substrings enclosed in single quotes.

The modification swaps the positions of the double quotes and the single quotes in the pattern.

References:

  • [Booboo2020]
  • [Avinash2021]
  • [user17405772021]

My solution is:

import re

my_substrings = []
test_string = "text:u'tsod'"
# Lazily match one or more characters between a pair of single quotes.
for values in re.findall("'(.+?)'", test_string):
    my_substrings.append(values)
print("my_substrings are:", my_substrings)

The same solution, tested more extensively:

import re
my_substrings = []
my_strings = ["SetVariables 'a' 'b' 'c' ", "d2efw   f 'first' +&%#$%'second',vwrfhir, d2e   u'third' dwedew", "'uno'?>P>MNUIHUH~!@#$%^&*()_+=0trewq'due'        'tre'fef    fre f", "       'uno''dos'      'tres'", "'unu''doua''trei'", "      'um'                    'dois'           'tres'                  "]
for current_test_string in my_strings:
    # For each test string, extract the substrings enclosed in single quotes.
    my_substrings = []
    for values in re.findall("'(.+?)'", current_test_string):
        # Append each extracted substring to the list.
        my_substrings.append(values)
    print("my_substrings are:", my_substrings)

Alternative regular expressions are:

  • re.findall('"(.+?)"', current_test_string) [Avinash2021] [user17405772021]
  • re.findall("'(.*?)'", current_test_string) [Shelvington2020]
  • re.findall(r"'(.+?)'", current_test_string) [Lundberg2012] [Avinash2021]
  • re.findall(r"'(.*?)'", current_test_string) [Lundberg2012] [Avinash2021]
  • re.findall(r"'[']", current_test_string) [Muthupandi2019]
  • re.findall(r"'([^']*)'", current_test_string) [Pieters2014]
  • re.findall(r"'(?:(?:(?!(?<!\\)').)*)'", current_test_string) # Returns the matches with their surrounding single quotes, which can be removed by other means. [Booboo2020]
  • re.findall(r"'(.*?)(?<!\\)'", current_test_string) [Hassan2014]
  • re.findall("'[^']*'", current_test_string) # Returns the matches with their surrounding single quotes, which can be removed by other means. [Martelli2013]
  • re.findall("'([^']*)'", current_test_string) [jspcal2014]
  • re.findall("'(.*?)'", current_test_string) [akhilmd2016]
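The patterns in this list are not all interchangeable. As a small sketch of one difference (the sample string below is made up for illustration and is not one of the test strings above), compare how `.+?`, `.*?` and `[^']*` treat an empty pair of quotes:

```python
import re

# Hypothetical sample containing an empty quoted pair followed by a normal one.
sample = "a '' b 'x' c"

# .*? may match zero characters, so the empty pair yields an empty string.
lazy_star = re.findall(r"'(.*?)'", sample)
# [^']* behaves the same way here and can never run across a quote.
negated = re.findall(r"'([^']*)'", sample)
# .+? needs at least one character, so it swallows the second quote of the
# empty pair and keeps going until the next quote it can find.
lazy_plus = re.findall(r"'(.+?)'", sample)

print(lazy_star)  # ['', 'x']
print(negated)    # ['', 'x']
print(lazy_plus)  # ["' b "]
```

If empty quoted strings can occur in the input, the `.*?` or `[^']*` variants are probably the safer choice.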

The current_test_string.split("\"") approach uses the quotation mark (the double quote in this example) as a delimiter to tokenize the string. It works when every quoted substring is properly enclosed in a pair of quotation marks, but note that it also returns the pieces that are not enclosed in quotation marks, so those have to be filtered out of the result.
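A minimal sketch of that filtering step, assuming the quotes in the string are balanced: after splitting, the quoted pieces sit at the odd indices of the result. The helper name `quoted_parts` is made up for this example.

```python
def quoted_parts(text, quote="'"):
    """Return the pieces of text that sit between pairs of the quote character.

    Assumes the quotes are balanced; with a stray quote (for example an
    apostrophe) the odd-index rule no longer picks the right pieces.
    """
    pieces = text.split(quote)
    # Index 0 is the text before the first quote, index 1 is inside the
    # first pair, index 2 is between pairs, and so on.
    return pieces[1::2]


print(quoted_parts("text:u'tsod'"))               # ['tsod']
print(quoted_parts("SetVariables 'a' 'b' 'c' "))  # ['a', 'b', 'c']
print(quoted_parts('say "hi" now', quote='"'))    # ['hi']
```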

Upvotes: -1

akhilmd

Reputation: 182

If you have multiple pairs of quotes, then this solution may help:

import re
strng = "text:u'tsod';text2:u'tsod2';text3:u'tsod3'"
qlist = re.findall(r"'(.*?)'", strng)

Then qlist will be ['tsod', 'tsod2', 'tsod3'].
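The `?` after `.*` is what makes each match stop at the nearest closing quote. A short sketch of the difference, reusing the same string (the variable names `lazy` and `greedy` are only for this illustration):

```python
import re

strng = "text:u'tsod';text2:u'tsod2';text3:u'tsod3'"

# Lazy: each match ends at the first quote that follows the opening quote.
lazy = re.findall(r"'(.*?)'", strng)
# Greedy: the single match runs from the first quote to the last quote.
greedy = re.findall(r"'(.*)'", strng)

print(lazy)    # ['tsod', 'tsod2', 'tsod3']
print(greedy)  # ["tsod';text2:u'tsod2';text3:u'tsod3"]
```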

Upvotes: 3

Moses Koledoye

Reputation: 78554

You can split on the inverted comma "'":

>>> s = "text:u'tsod'".split("'")
>>> s
['text:u', 'tsod', '']
>>> s[1]
'tsod'
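As for which quote is the opening one and which is the closing one: Python has no notion of that, the string is simply scanned from left to right. A rough sketch of doing that by hand with str.find (the helper name `between_first_quotes` is made up for this example):

```python
def between_first_quotes(text, quote="'"):
    """Return the text between the first quote and the next one, or None."""
    start = text.find(quote)           # first quote counts as the opening one
    if start == -1:
        return None                    # no quote at all
    end = text.find(quote, start + 1)  # next quote after it counts as the closing one
    if end == -1:
        return None                    # opening quote without a partner
    return text[start + 1:end]


print(between_first_quotes("text:u'tsod'"))  # tsod
print(between_first_quotes("no quotes"))     # None
```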

Upvotes: 2