python backslash replacement failing

Question

I'm reading JSON data from a large data file to convert its contents to CSV format, and I'm getting this error:

Traceback (most recent call last):
  File "python/gamesTXTtoCSV.py", line 99, in <module>
    writer.writerow(foo)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 15: ordinal not in range(128)

After some digging, I found that the string "\u2013" shows up in the JSON data file.

Example (see the value field):

"states":[
      {
         "display":null,
         "name":"choiceText",
         "type":"string",
         "value":"Show me around \u2013 as long as your friends don't chase me away again!"
      },

I've tried adding various string replacements to the script to get rid of the offending string.

Stuff like this (where i['value'] is the offending field):

 i['value'].replace("\u2013", "--")

Or

i['value'].replace("\", "") #this one is the last resort

Or even

i['value'].encode("utf8")

But to no avail - I keep getting the error. Any idea what's going on?
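
One possible explanation, sketched under assumptions rather than confirmed against the full script: json.loads turns the six-character escape \u2013 in the file into a single en dash character, so there is no backslash left to replace, and replace() returns a new string instead of modifying the original, so a call whose result is discarded changes nothing. The sample record below is hypothetical, and the snippet is written to run on Python 3 (the u prefix on the search string is what Python 2 would need as well).

```python
import json

# Hypothetical one-record sample mirroring the "states" entry above.
raw = r'{"name": "choiceText", "value": "Show me around \u2013 as long as"}'
state = json.loads(raw)

# After parsing, the value holds one en dash character, not a backslash sequence.
has_backslash = "\\" in state["value"]
has_en_dash = u"\u2013" in state["value"]

# replace() returns a new string; discarding the result leaves the value as it was.
state["value"].replace(u"\u2013", "--")
unchanged = state["value"]

# Assigning the result back is what actually changes the stored value.
state["value"] = state["value"].replace(u"\u2013", "--")
fixed = state["value"]
```

The same reasoning would apply to the encode("utf8") attempt: its return value also has to be assigned or used for it to have any effect.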

Here's the section of code that writes the csv, in case additional context is needed:

################## filling out the csv ################
openfile= open(inFile)
f = open(outFile, 'wt')
writer = csv.writer(f)
writer.writerow(all_cols)

for row in openfile.readlines():
    line = json.loads(row)
    stateCSVrow= []
    states=line['states']
    contexts=line['context']
    contextCSVrow=[]
    k = 0
    for state in state_names:
        for i in states:
            if i['name']==state:
                i['value'].replace("\u2019", "'") ####THE SECTION GIVING ISSUE
                i['value'].replace("\u2013", "--")
                stateCSVrow.append(i['value'])
        if len(stateCSVrow)==k:
            stateCSVrow.append('NA')
        k +=1
    c = 0
    for context in context_names:
        for i in contexts:
            if i['name']==context:
                contextCSVrow.append(i['value'])
        if len(contextCSVrow)==c:
            contextCSVrow.append('NA')
        c +=1
    first=[]
    first.extend([
        line['key'] ,
        line['timestamp'],
        line['actor']['actorType'],
        line['user']['username'],
        line['version'],
        line['action']['name'],
        line['action']['actionType']
          ])

    foo = first + stateCSVrow + contextCSVrow
    writer.writerow(foo)
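
If the non-ASCII characters should be kept rather than replaced, an alternative is to write the file as UTF-8. This is a minimal sketch assuming Python 3, where open() accepts an encoding argument. The traceback suggests Python 2, where the csv module does not handle unicode objects and each unicode cell would instead need .encode('utf-8') before writer.writerow. The output path and row contents here are hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical row containing the en dash (U+2013) from the error message.
row = ["key1", u"Show me around \u2013 as long as"]

# Hypothetical output path in a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "games.csv")

# An explicit encoding avoids depending on the platform's default codec.
with open(out_path, "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(row)

# Read the row back with the same encoding to confirm the character survived.
with open(out_path, encoding="utf-8", newline="") as f:
    read_back = next(csv.reader(f))
```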
