Rachel Park

Reputation: 47

Can't get the data I want in Python

I'm a Python user working in Spyder. I want to convert data (tweets) stored in one Notepad text file and write the converted data to another text file. My code is below. It should turn each record, such as {created at: date, user_name, unicode..}, into simple data: user_name, data

try:
    import json
except ImportError:
    import simplejson as json

tweets_filename = 'C:/Users/siri_0.txt' #unconverted data
tweets_file = open(tweets_filename, "r")

for line in tweets_file:
    try:
        tweet = json.loads(line.strip())
        if 'text' in tweet: 
            print (tweet['id']) 
            print (tweet['created_at']) 
            print (tweet['text']) 
            print (tweet['user']['id']) 
            print (tweet['user']['name']) 
            print (tweet['user']['screen_name']) 
            hashtags = []
            for hashtag in tweet['entities']['hashtags']:
                hashtags.append(hashtag['text'])
            print(hashtags)

            output = "C:/Users/fn_siri.txt"
            #I want to put the converted data here.
            out_file = open(output, 'a')
            out_file.write(line)
            out_file.close()

    except:
        continue

Unfortunately, C:/Users/fn_siri.txt ends up containing only the unconverted data. How can I change the code so that the file contains the converted data?

Upvotes: 1

Views: 74

Answers (2)

Louise Davies

Reputation: 15941

You are writing line, which is your unconverted input, to the output file instead of writing only the data you want.

So, if you want to write the user name, followed by a comma, followed by (for example) the text, replace your out_file.write(line) with:

out_file.write(tweet['user']['name'] + "," + tweet['text'] + "\n")

You need the \n at the end so that each record ends with a newline and the next one starts on its own line.
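
Putting that together, a minimal sketch of the whole conversion might look like the following. The function name convert_tweets and its parameters are hypothetical (they are not in the question), and it assumes the input holds one JSON object per line. It uses csv.writer rather than string concatenation, which is a swapped-in choice so that commas or line breaks inside a tweet do not break the output rows.

```python
import csv
import json


def convert_tweets(in_path, out_path):
    """Read one JSON tweet per line and append 'user name,text' rows.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    written = 0
    with open(in_path, "r", encoding="utf-8") as src, \
         open(out_path, "a", encoding="utf-8", newline="") as dst:
        writer = csv.writer(dst)  # quotes fields that contain commas or newlines
        for line in src:
            try:
                tweet = json.loads(line.strip())
                row = [tweet["user"]["name"], tweet["text"]]
            except (json.JSONDecodeError, KeyError, TypeError):
                # Skip lines that are not valid JSON or lack the expected fields.
                continue
            writer.writerow(row)
            written += 1
    return written
```

With the paths from the question, the call would be convert_tweets('C:/Users/siri_0.txt', 'C:/Users/fn_siri.txt'). Opening both files once with a with statement also avoids reopening the output file for every tweet.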

Upvotes: 1

majin

Reputation: 684

try:
    import json
except ImportError:
    import simplejson as json

tweets_filename = 'C:/Users/siri_0.txt' #unconverted data
output = "C:/Users/fn_siri.txt"  # define the output path before it is used

with open(tweets_filename, "r") as tweets_file:
    for line in tweets_file:
        try:
            tweet = json.loads(line.strip())
            if 'text' in tweet:
                print(tweet['id'])
                print(tweet['created_at'])
                print(tweet['text'])
                print(tweet['user']['id'])
                print(tweet['user']['name'])
                print(tweet['user']['screen_name'])
                hashtags = []
                for hashtag in tweet['entities']['hashtags']:
                    hashtags.append(hashtag['text'])
                # I am assuming the converted data you want to write to out_file is hashtags
                with open(output, 'a') as out_file:
                    print(hashtags, file=out_file)
                # out_file.write(line)  # why are you writing old data here ...
        except Exception:
            # skip lines that fail to parse or lack the expected fields
            continue
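
One caveat about the except-and-continue pattern in this loop: catching everything silently also hides typos and undefined names, so a bug can look like "nothing was written". A minimal sketch, assuming one JSON object per line, that skips only the expected failures; extract_hashtags is a hypothetical name used for illustration.

```python
import json


def extract_hashtags(line):
    """Return the hashtag texts for one JSON line, or None if it should be skipped."""
    try:
        tweet = json.loads(line)
        return [h["text"] for h in tweet["entities"]["hashtags"]]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Expected problems only: invalid JSON or a record without hashtags.
        return None
```

Any other exception, such as a NameError from a misspelled variable, then surfaces with a traceback instead of being skipped.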

Upvotes: 1
