KeyError when Key exists

daneshjai · Jul 18, 2014 · Viewed 38.2k times

I'm using Python and the Twitter API to get tweet objects.

I have a file of tweets (tweetfile, a .txt file on my computer) and I'm trying to loop through the objects to get each tweet's text. I checked the tweet object's keys with tweetObj.keys() and 'text' is there; however, when I try to get the text with tweetObj['text'], I get KeyError: 'text'.

Code:

for line in tweetfile:
    tweetObj = json.loads(line)
    keys =  tweetObj.keys()
    print keys
    tweet = tweetObj['text']
    print tweet

Below is the output:

[u'contributors', u'truncated', u'text', u'in_reply_to_status_id', u'id', u'favorite_count', u'source', u'retweeted', u'coordinates', u'entities', u'in_reply_to_screen_name', u'id_str', u'retweet_count', u'in_reply_to_user_id', u'favorited', u'user', u'geo', u'in_reply_to_user_id_str', u'possibly_sensitive', u'lang', u'created_at', u'filter_level', u'in_reply_to_status_id_str', u'place']
@awe5sauce my dad was like "so u wanna be in a relationship with a 'big dumb idiot'" nd i was like yah shes the bae u feel lmao
[u'delete']
Traceback (most recent call last):
  File "C:\apps\droid\a1\tweets.py", line 34, in <module>
    main()
  File "C:\apps\droid\a1\tweets.py", line 28, in main
    tweet = tweetObj['text']
KeyError: 'text'

I'm not sure how to approach this, since it does print one tweet. Why does this happen when the key exists and returns a value for some lines but not for all of them, and how can I access the value on every line that has that key?

Answer

ssm · Jul 18, 2014

The loop creates one dictionary per line, and your output shows two of them. The first has a 'text' key; the second has only a 'delete' key and no 'text' key, hence the KeyError. (The Twitter streaming API mixes status deletion notices like this in with regular tweets.)
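To make the difference between the two kinds of lines concrete, here is a minimal sketch written for Python 3 (the question's code is Python 2). The two sample lines are made up to mirror the shapes seen in the output above, and `has_text` is a hypothetical helper name, not part of any library.

```python
import json

# Hypothetical sample lines: a regular tweet and a deletion notice.
# The values are invented; only the key layout mirrors the question's output.
sample_lines = [
    '{"id": 1, "text": "hello world", "lang": "en"}',
    '{"delete": {"status": {"id": 1, "user_id": 2}}}',
]

def has_text(line):
    """Return True if the JSON object on this line has a 'text' key."""
    return 'text' in json.loads(line)

# One flag per line: the tweet has 'text', the deletion notice does not.
flags = [has_text(line) for line in sample_lines]
print(flags)
```

Indexing the second object with ['text'] is what raises the KeyError, even though the first object has that key.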

Change it to:

for line in tweetfile:
    tweetObj = json.loads(line)
    keys =  tweetObj.keys()
    print keys
    if 'text' in tweetObj:
        print tweetObj['text']
    else:
        print 'This does not have a text entry'

Just so you know, if you are only interested in the lines containing text, you may want to use

[ json.loads(l)['text'] for l in tweetfile if 'text' in json.loads(l) ]

or

'\n'.join([ json.loads(l)['text'] for l in tweetfile if 'text' in json.loads(l) ])

or, more simply, if a None entry for each line without text is acceptable:

[ json.loads(l).get('text') for l in tweetfile]
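The filtering one-liners above parse every line twice, and the .get() version keeps a None for each line without text. A possible alternative, sketched here for Python 3, parses each line once and keeps only real text values. The name `extract_texts` and the sample data are hypothetical, and skipping blank lines is an assumption about what the file may contain.

```python
import json

def extract_texts(lines):
    """Collect the 'text' value from every line that has one."""
    texts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: the file may contain blank lines
        obj = json.loads(line)  # parse each line exactly once
        text = obj.get('text')  # None when the key is missing
        if text is not None:
            texts.append(text)
    return texts

# Hypothetical sample data standing in for the lines of tweetfile.
sample = [
    '{"id": 1, "text": "first tweet"}',
    '{"delete": {"status": {"id": 1}}}',
    '',
    '{"id": 2, "text": "second tweet"}',
]
texts = extract_texts(sample)
print('\n'.join(texts))
```

With a real file, passing the open file object (for example `extract_texts(tweetfile)`) should work the same way, since iterating a text file yields one line at a time.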