I am having an issue joining a string that I have already decoded earlier in my code:
import json
import requests
import jsonobject
for i in range(0, 3): #for loop to feed parameter to url params
    if i == 0:
        var = "0"
        var2 = "Home"
    elif i == 1:
        var = "1"
        var2 = "Away"
    elif i == 2:
        var = "2"
        var2 = "Overall"
    url = 'http://www.whoscored.com/StatisticsFeed/1/GetPlayerStatistics'
    params = {
        'category': 'tackles',
        'subcategory': 'success',
        'statsAccumulationType': '0',
        'isCurrent': 'true',
        'playerId': '',
        'teamIds': '',
        'matchId': '',
        'stageId': '9155',
        'tournamentOptions': '2',
        'sortBy': 'Rating',
        'sortAscending': '',
        'age': '',
        'ageComparisonType': '',
        'appearances': '',
        'appearancesComparisonType': '0',
        'field': var2, #from for loop
        'nationality': '',
        'positionOptions': "'FW','AML','AMC','AMR','ML','MC','MR','DMC','DL','DC','DR','GK','Sub'",
        'timeOfTheGameEnd': '5',
        'timeOfTheGameStart': '0',
        'isMinApp': '',
        'page': '1',
        'includeZeroValues': '',
        'numberOfPlayersToPick': '10'
    }
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36',
               'X-Requested-With': 'XMLHttpRequest',
               'Host': 'www.whoscored.com',
               'Referer': 'http://www.whoscored.com/'}
    responser = requests.get(url, params=params, headers=headers)
    responser = responser.json()
    playerTableStats = responser[u'playerTableStats']
    for statDict in playerTableStats:
        mylookup = ("{name},{firstName},{lastName},{positionText},{tournamentId},{tournamentShortName},{regionCode}"
                    "{tournamentRegionId},{seasonId},{seasonName},{teamName},{teamId},{playerId}"
                    "{minsPlayed},{ranking},{rating:.2f},{apps},{weight:.2f},{height:.2f},{playedPositions}"
                    "{isManOfTheMatch},{isOpta},{subOn},".decode('cp1252').format(**statDict)) #generates none match data about players
        print mylookup
        mykey2 = (var2)
        print mykey2
        mykey3 = {}
        #create dynamic variables and join match and none match data together
        mykey3[mykey2] = ("{challengeLost:.2f},{tackleWonTotal:.2f},{tackleTotalAttempted:.2f},".decode('cp1252').format(**statDict))
        print mykey3[mykey2]
        mykey3[mykey2] = mykey3[mykey2],'*,'
        mykey3[mykey2] = str(''.join(mykey3[mykey2][0:2]))
        mykey3[mykey2] = mylookup,mykey3[mykey2]
        mykey3[mykey2] = str(''.join(mykey3[mykey2][0:2]))
        print mykey3[mykey2]
        mykey3[mykey2] = mykey3[mykey2],'*,'
        mykey3[mykey2] = str(''.join(mykey3[mykey2][0:2]))
I get an error that says:
Traceback (most recent call last):
  File "C:\Python27\counter.py", line 72, in <module>
    mykey3[mykey2] = str(''.join(mykey3[mykey2][0:2]))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 6: ordinal not in range(128)
when the name Cesc Fàbregas is encountered in the list of player names being cycled through. I have tried amending the above code to:
mykey3[mykey2] = mykey3[mykey2],'*,'
mykey3[mykey2] = str(''.join(mykey3[mykey2][0:2]).decode('cp1252'))
...or:
mykey3[mykey2] = mykey3[mykey2],'*,'
mykey3[mykey2] = str(''.join(mykey3[mykey2][0:2])).decode('cp1252')
...however, this still generates the same error.
Can anyone see what I am doing wrong?
You are trying to join two values with a comma in a very roundabout way, by creating a tuple and then turning the tuple back into a string. Don't do that; just use string formatting.
You need to use Unicode literals rather than decode your strings:
mykey3[mykey2] = u"{challengeLost:.2f},{tackleWonTotal:.2f},{tackleTotalAttempted:.2f},".format(**statDict)
Note the u prefix on the string. You are not actually using any non-ASCII characters in your string literals, so you don't even need to declare an encoding there.
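As a small illustration of the u prefix, here is a sketch that formats a hand-written stand-in for one statDict entry. The dict below is an assumption (a reduced sample with values borrowed from the output shown in this answer, not the real feed), and the code is written so that it runs on both Python 2 and Python 3:

```python
# Hypothetical, reduced stand-in for one entry of playerTableStats;
# the real feed returns many more keys.
stat_dict = {
    'name': u'Cesc F\xe0bregas',
    'challengeLost': 2.83,
    'tackleWonTotal': 1.17,
    'tackleTotalAttempted': 4.0,
}

# The template is a Unicode literal, so the formatted result is a Unicode
# string as well and the template needs no .decode() call.
template = u"{name},{challengeLost:.2f},{tackleWonTotal:.2f},{tackleTotalAttempted:.2f},"
row = template.format(**stat_dict)
```

The result stays Unicode from start to finish, so the non-ASCII character in the name never has to pass through an implicit ASCII conversion.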
But building tuples and then calling str() on them is what causes your exceptions. Just don't use str() here at all: you convert the joined Unicode strings into a byte string, then join that byte string with a Unicode value (which makes the result Unicode again) and try to convert it to a byte string once more, and that last conversion is what fails:
>>> mylookup = ("{name},{firstName},{lastName},{positionText},{tournamentId},{tournamentShortName},{regionCode}"
... "{tournamentRegionId},{seasonId},{seasonName},{teamName},{teamId},{playerId}"
... "{minsPlayed},{ranking},{rating:.2f},{apps},{weight:.2f},{height:.2f},{playedPositions}"
... "{isManOfTheMatch},{isOpta},{subOn},".decode('cp1252').format(**statDict))
>>> ''.join(mykey3[mykey2][0:2])
u'Cesc F\xe0bregas,Cesc,F\xe0bregas,Midfielder,2,EPL,es252,4311,2014/2015,Chelsea,15,8040532,5,8.09,6,74.00,175.00,-FW-MC-ML-MR-False,True,0,2.83,1.17,4.00,*,*,2.83,1.17,4.00,*,'
>>> str(''.join(mykey3[mykey2][0:2]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 6: ordinal not in range(128)
Note that the join itself worked; it was the str() call that failed, because it converts the Unicode result back to a byte string without an explicit codec, so Python 2 falls back to its default ASCII codec.
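To see the failure in isolation: on Python 2, str() applied to a Unicode string amounts to an encode with the default codec, which is ASCII unless it has been changed. The sketch below spells that encode out explicitly, so it behaves the same on Python 2 and Python 3, and contrasts it with naming a codec yourself. The sample text is an assumption based on the output above:

```python
# Joining Unicode strings is not the problem; this works fine.
joined = u''.join((u'Cesc F\xe0bregas,', u'*,'))

try:
    # Roughly what Python 2's str(joined) does behind the scenes.
    joined.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    error_position = exc.start  # index of the first character ASCII cannot represent

# Naming a codec that can represent the character succeeds.
as_bytes = joined.encode('utf-8')
```

Only encode like this at the point where you actually need bytes (writing to a file or a socket, for example); until then, keep everything as Unicode.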
The following also joins two (Unicode) strings with a comma:
mykey3[mykey2] = u','.join([mykey3[mykey2], u'*,'])
or just append to the existing string:
mykey3[mykey2] += u',*,'
or just use one string formatting operation to put all your data into one string to begin with:
mylookup = (
u"{name},{firstName},{lastName},{positionText},{tournamentId},{tournamentShortName},{regionCode}"
u"{tournamentRegionId},{seasonId},{seasonName},{teamName},{teamId},{playerId}"
u"{minsPlayed},{ranking},{rating:.2f},{apps},{weight:.2f},{height:.2f},{playedPositions}"
u"{isManOfTheMatch},{isOpta},{subOn},"
u"{challengeLost:.2f},{tackleWonTotal:.2f},{tackleTotalAttempted:.2f},"
u"*,*,".format(**statDict)
)
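For comparison, here is a sketch of the three approaches side by side, using a deliberately tiny stand-in for statDict (the dict and the shortened templates are assumptions for illustration only). Note that join() takes a single iterable, so the pieces go in a list. It runs on both Python 2 and Python 3:

```python
# Hypothetical two-key stand-in for statDict.
stat_dict = {'name': u'Cesc F\xe0bregas', 'challengeLost': 2.83}

lookup = u"{name},".format(**stat_dict)
stats = u"{challengeLost:.2f},".format(**stat_dict)

# Option 1: join a list of Unicode strings (an empty separator is used
# here because each piece already ends with a comma).
joined = u''.join([lookup, stats, u'*,'])

# Option 2: append to the existing Unicode string.
appended = lookup
appended += stats
appended += u'*,'

# Option 3: build everything with a single format call.
single = u"{name},{challengeLost:.2f},*,".format(**stat_dict)
```

All three produce the same Unicode string; the single format call is simply the least code to maintain.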