Ian Dragulet

Reputation: 23

encoding utf-8 not working when reading in json file

I am trying to read some names that include special characters from a JSON file. Unfortunately, even when I pass encoding='utf-8' to json.load, the special characters are still not read in.

import json
import pandas as pd

def player_matrix(player_file):
    with open(player_file) as f:
        data = json.load(f, encoding='utf-8')
    all_players = pd.DataFrame(data)
    
    player_dataset = pd.DataFrame(columns=['player_id','name','short name', 'nation', 'team_id' ])
    
    for index,player in all_players.iterrows():
        player_dataset.at[index,'player_id']=player['wyId']
        player_dataset.at[index,'name'] =  str(player['firstName'])+' '+str(player['lastName'])
        player_dataset.at[index,'short name'] =  player['shortName']
        player_dataset.at[index,'nation'] =  player['currentNationalTeamId']
        player_dataset.at[index,'team_id'] =  player['currentTeamId']

    return player_dataset

players_df = player_matrix(playerfile)
players_df

I still get something like this for a row output:

3598    120839  Ali Ma\u00e2loul    A. Ma\u00e2loul null    16041

What can I do to read in these special characters?

Upvotes: 1

Views: 65

Answers (1)

ddejohn

Reputation: 8962

The special characters are being read in. What you're seeing is the escaped (\uXXXX) representation of the string, not the characters as they appear when printed:

>>> print("3598    120839  Ali Ma\u00e2loul    A. Ma\u00e2loul null    16041")
3598    120839  Ali Maâloul    A. Maâloul null    16041
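To check this on your own data, here is a minimal sketch. It assumes Python 3, and the one-record file and the names raw_json, path and short_name are made up for illustration; they are not from your dataset. It also passes the encoding to open() rather than json.load(), since json.load() no longer accepts an encoding argument as of Python 3.9:

```python
import json
import os
import tempfile

# Hypothetical one-record stand-in for the players file. The \u00e2 escape
# is written literally in the JSON text, as many JSON encoders do.
raw_json = '[{"wyId": 120839, "shortName": "A. Ma\\u00e2loul"}]'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "players.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(raw_json)

    # The encoding belongs on open(); json.load() takes no encoding
    # argument in Python 3.9+.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

short_name = data[0]["shortName"]

# The parser has already turned the escape into a real character.
print(short_name)              # A. Maâloul
print(len(short_name))         # 10

# Serialising back to JSON with the default ensure_ascii=True shows the
# escaped form again, even though the string itself holds the character.
print(json.dumps(short_name))  # "A. Ma\u00e2loul"

# If this is True, the loaded string really contains a backslash, which
# would mean the file was escaped twice rather than just displayed escaped.
print("\\" in short_name)      # False
```

If the last check prints True on your real data, the file itself contains doubled escapes, and that is a different problem from the display issue described above.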

Upvotes: 1
