fosho

Reputation: 1676

How to solve/what is a KeyError in Python/Pandas?

I have two text files that I wish to work with using Pandas. The files were created in the exact same way and are very similar, except for some of the content inside. However, my program does not work with one of the text files, but does work with the other. Here is my error:

Traceback (most recent call last):
  File "E:\Holiday Project\Politic\store.py", line 19, in <module>
    tweets['text'] = list(map(lambda tweet: tweet['text'], tweets_data))
  File "E:\Holiday Project\Politic\store.py", line 19, in <lambda>
    tweets['text'] = list(map(lambda tweet: tweet['text'], tweets_data))
KeyError: 'text'

and here is my code:

import json
import pandas as pd
from textblob import TextBlob

tweets_data_path = 'filename.txt'

tweets_data = []
tweets_file = open(tweets_data_path, "r")
for line in tweets_file:
    try:
        tweet = json.loads(line)
        tweets_data.append(tweet)
    except:
        continue

print (len(tweets_data))

tweets = pd.DataFrame()
tweets['text'] = list(map(lambda tweet: tweet['text'], tweets_data))
tweets['lang'] = list(map(lambda tweet: tweet['lang'], tweets_data))
tweets['country'] = list(map(lambda tweet: tweet['place']['country'] if tweet['place'] != None else None, tweets_data))
avg = 0
for text in tweets['text']:
    tweet = TextBlob(text)
    avg = tweet.sentiment.polarity + avg
avg = avg/len(tweets)
print(avg)

Upvotes: 6

Views: 14610

Answers (3)

Dror Hilman

Reputation: 7447

Look at these lines:

tweets = pd.DataFrame()
tweets['text'] = list(map(lambda tweet: tweet['text'], tweets_data))

You are probably trying to extract tweet['text'], which does not exist in some of the dictionaries. If the "text" field exists in only some of the lines you are loading, then you may want to write something like this:

tweets = pd.DataFrame()
tweets['text'] = [tweet.get('text','') for tweet in tweets_data]
tweets['lang'] = [tweet.get('lang','') for tweet in tweets_data]
#and so on...

If "text" is missing from some of the JSON records, you will get an empty string ('') in the DataFrame for those rows instead of a KeyError.
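If you would rather drop the records that have no "text" than fill them with empty strings, a minimal sketch might look like the following. The sample lines are invented for illustration (the "limit" record is only a guess at what a non-tweet line could look like), and the resulting list could then be assigned to tweets['text'] as before.

```python
import json

# Hypothetical sample lines standing in for the contents of the text file:
# one normal tweet, one record without "text", and one line that is not JSON.
sample_lines = [
    '{"text": "hello world", "lang": "en", "place": null}',
    '{"limit": {"track": 5}}',
    'not valid json',
]

tweets_data = []
for line in sample_lines:
    try:
        tweets_data.append(json.loads(line))
    except json.JSONDecodeError:
        continue  # skip lines that are not valid JSON

# Keep only the records that actually carry a "text" key.
with_text = [tweet for tweet in tweets_data if "text" in tweet]
skipped = len(tweets_data) - len(with_text)

texts = [tweet["text"] for tweet in with_text]
print(len(with_text), "kept,", skipped, "skipped")
```

Printing the number of skipped records is a cheap way to see how the two files differ.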

Upvotes: 1

Ramin Taghizada

Reputation: 125

You should add the same kind of condition you used for the last column (country), so that tweets with no "text" are handled instead of raising an error.
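As a rough sketch of that idea, keeping the map/lambda style from the question: the check has to be 'text' in tweet rather than a comparison with None, because the key is missing altogether. The two sample dictionaries are made up for illustration.

```python
# Hypothetical sample records: the second one has no "text" key at all.
tweets_data = [
    {"text": "hi", "lang": "en", "place": {"country": "France"}},
    {"lang": "en", "place": None},
]

# Use None as a placeholder when the key is absent.
texts = list(map(lambda tweet: tweet['text'] if 'text' in tweet else None,
                 tweets_data))

# Same pattern for the nested field; .get() also covers a missing "place" key.
countries = list(map(
    lambda tweet: tweet['place']['country'] if tweet.get('place') is not None else None,
    tweets_data))

print(texts)
print(countries)
```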

Upvotes: 0

Jeff H.

Reputation: 330

tweet['text'] does not seem to exist for at least one of your tweets. A KeyError is raised when you try to access a key that does not exist in a hash map/dictionary. For example:

myDict = {"hello": 1, "there": 2}
print(myDict["hello"])  # this prints 1
print(myDict["friend"])  # this raises KeyError: 'friend' because the key does not exist
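For completeness, here is a small sketch of three common ways to deal with a key that may be missing. The dictionary and variable names are only illustrative.

```python
my_dict = {"hello": 1, "there": 2}

# 1. Catch the exception and fall back to a default.
try:
    value = my_dict["friend"]
except KeyError as err:
    value = None
    missing_key = err.args[0]  # the key that was not found

# 2. Test for membership before indexing.
has_friend = "friend" in my_dict

# 3. Use dict.get(), which returns a default instead of raising.
fallback = my_dict.get("friend", 0)

print(value, missing_key, has_friend, fallback)
```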

Upvotes: 2
