mnmmeng

Reputation: 37

Replace single quotes with double quotes but leave ones within double quotes untouched

The ultimate goal, and the origin of the problem, is to have a field that is compatible with json_extract_path_text in Redshift.

This is how it looks right now:

{'error': "Feed load failed: Parameter 'url' must be a string, not object", 'errorCode': 3, 'event_origin': 'app', 'screen_id': '6118964227874465', 'screen_class': 'Promotion'}

To extract the field I need from the string in Redshift, I replaced single quotes with double quotes. This particular record raises an error because the value of error itself contains single quotes. If those get replaced as well, the string becomes invalid JSON.

So what I need is:

{"error": "Feed load failed: Parameter 'url' must be a string, not object", "errorCode": 3, "event_origin": "app", "screen_id": "6118964227874465", "screen_class": "Promotion"}

Upvotes: 2

Views: 6076

Answers (2)

Wolfgang Fahl

Reputation: 15769

I tried a regex approach but found it too complicated and slow, so I wrote a simple "bracket parser" that keeps track of the current quotation mode. It cannot handle multiple levels of nesting; you'd need a stack for that. For my use case, converting str(dict) to proper JSON, it works:

example input: {'cities': [{'name': "Upper Hell's Gate"}, {'name': "N'zeto"}]}

example output: {"cities": [{"name": "Upper Hell's Gate"}, {"name": "N'zeto"}]}

Python unit test

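import json

def testSingleToDoubleQuote(self):
    # Start from valid JSON, then take the Python repr of the parsed dict,
    # which uses single quotes except where a value contains an apostrophe
    jsonStr = '''
    {
        "cities": [
        {
            "name": "Upper Hell's Gate"
        },
        {
            "name": "N'zeto"
        }
        ]
    }
    '''
    listOfDicts = json.loads(jsonStr)
    dictStr = str(listOfDicts)
    if self.debug:
        print(dictStr)
    # Convert the repr back to JSON-style double quotes
    jsonStr2 = JSONAble.singleQuoteToDoubleQuote(dictStr)
    if self.debug:
        print(jsonStr2)
    self.assertEqual('''{"cities": [{"name": "Upper Hell's Gate"}, {"name": "N'zeto"}]}''', jsonStr2)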

singleQuoteToDoubleQuote

def singleQuoteToDoubleQuote(singleQuoted):
    '''
    convert a single quoted string to a double quoted one
    Args:
        singleQuoted(string): a single quoted string e.g. {'cities': [{'name': "Upper Hell's Gate"}]}
    Returns:
        string: the double quoted version of the string e.g. {"cities": [{"name": "Upper Hell's Gate"}]}
    see
       - https://stackoverflow.com/questions/55600788/python-replace-single-quotes-with-double-quotes-but-leave-ones-within-double-q
    '''
    cList = list(singleQuoted)
    inDouble = False
    inSingle = False
    for i, c in enumerate(cList):
        # print("%d:%s %r %r" % (i, c, inSingle, inDouble))
        if c == "'":
            # only convert single quotes that are outside a double-quoted section
            if not inDouble:
                inSingle = not inSingle
                cList[i] = '"'
        elif c == '"':
            # a double quote opens or closes a double-quoted section
            inDouble = not inDouble
    doubleQuoted = "".join(cList)
    return doubleQuoted
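
As a rough check against the record from the question, here is a self-contained sketch of the same quote-tracking idea. The function name single_to_double and the variable names are hypothetical (they are not part of the code above), and json.loads is used only to confirm that the result parses:

```python
import json

def single_to_double(text):
    # Hypothetical compact restatement of the loop above:
    # swap ' for " only while we are not inside a "..." section.
    chars = list(text)
    in_double = False
    for i, c in enumerate(chars):
        if c == "'" and not in_double:
            chars[i] = '"'
        elif c == '"':
            in_double = not in_double
    return "".join(chars)

# The record from the question, as a Python string
record = (
    "{'error': \"Feed load failed: Parameter 'url' must be a string, not object\", "
    "'errorCode': 3, 'event_origin': 'app', 'screen_id': '6118964227874465', "
    "'screen_class': 'Promotion'}"
)

converted = single_to_double(record)
parsed = json.loads(converted)  # raises json.JSONDecodeError if the result is not valid JSON
print(parsed["error"])
```

Note that this only tracks one level of quoting: a value that is single-quoted and contains double quotes, such as 'say "hi"', would still come out as invalid JSON.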

Upvotes: 4

Jan

Reputation: 43169

There are several ways; one is to use the third-party regex module with the pattern

"[^"]*"(*SKIP)(*FAIL)|'

See a demo on regex101.com.


In Python:

import regex as re

rx = re.compile(r'"[^"]*"(*SKIP)(*FAIL)|\'')
new_string = rx.sub('"', old_string)

With the standard-library re module, you'd need to use a replacement function and check whether the group has been matched or not; (*SKIP)(*FAIL) lets you avoid exactly that.
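
For comparison, here is a minimal sketch of that function-based variant using only the standard-library re module. The names PATTERN and replace_outside_double_quotes are hypothetical, and the sample string is a shortened version of the record from the question:

```python
import json
import re

# Either a complete double-quoted run (captured as group 1) or a lone single quote.
PATTERN = re.compile(r'("[^"]*")|\'')

def replace_outside_double_quotes(text):
    """Replace single quotes with double quotes, leaving "..." runs untouched."""
    def _swap(match):
        # Group 1 matched: this is a double-quoted run, so return it unchanged.
        if match.group(1) is not None:
            return match.group(1)
        # Otherwise the match is a bare single quote outside double quotes.
        return '"'
    return PATTERN.sub(_swap, text)

old_string = "{'error': \"Feed load failed: Parameter 'url' must be a string, not object\", 'errorCode': 3}"
new_string = replace_outside_double_quotes(old_string)
parsed = json.loads(new_string)  # sanity check that the result is valid JSON
print(new_string)
```

Like the pattern above, this assumes the double-quoted values contain no escaped double quotes.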

Upvotes: 2
