ctagg11

Reputation: 11

Pulling Reddit comments using Python PRAW and creating a dataframe with the results

I'm looking to pull all the comments from a Reddit post and ultimately get the author name, comment, and upvotes into a dataframe. I'm fairly new to programming, so I'm having a tough time.

Right now I'm pulling the stickied post using PRAW and trying to use a for loop to iterate through its comments and create a list of dictionaries with the author and comment. For some reason it's only adding the first author/comment dictionary pairing to the list and repeating it. Here's what I have:

import praw
import pandas as pd
import pprint

reddit = praw.Reddit(xxx)
sub = reddit.subreddit('ethtrader')
hot_python = sub.hot(limit=1)



for submission in hot_python:
    if submission.stickied:
        print('Title: {}, ups: {}, downs: {}'.format(submission.title, submission.ups, submission.downs))
        post = {}
        postlist = []
        submission.comments.replace_more(limit=0)
        for comment in submission.comments:
            post['Author'] = comment.author
            post['Comment'] = comment.body
            postlist.append(post)

Any ideas? Apologies for the ugly code, I'm a novice here. Thanks!

Upvotes: 1

Views: 1452

Answers (1)

matheussilvapb

Reputation: 139

for submission in hot_python:
    if submission.stickied:
        print('Title: {}, ups: {}, downs: {}'.format(submission.title, submission.ups, submission.downs))
        postlist = []
        submission.comments.replace_more(limit=0)
        for comment in submission.comments:
            post = {}  # put this here
            post['Author'] = comment.author
            post['Comment'] = comment.body
            postlist.append(post)

You should create a new post dict inside the for loop. When you append post to the list, you're appending a reference to the dict, not a copy. On the next iteration you overwrite that same dict with the new data, and the change shows up through every reference to it. Your list at the end is just a list of references to the same dict.
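To see the effect without calling the Reddit API, here is a minimal sketch that uses stand-in objects in place of PRAW comments. The sample authors, bodies, and scores are made up for illustration, and the `Upvotes` column assumes the comment's `score` attribute is the number you want.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for PRAW comment objects (illustrative data only).
comments = [
    SimpleNamespace(author="alice", body="first comment", score=3),
    SimpleNamespace(author="bob", body="second comment", score=7),
]

# Buggy pattern: one dict created outside the loop and appended repeatedly.
shared = {}
buggy = []
for comment in comments:
    shared["Author"] = comment.author
    shared["Comment"] = comment.body
    buggy.append(shared)  # appends a reference to the same dict each time

# Both list entries are the same object, so both show the values written last.
print(buggy[0] is buggy[1])
print(buggy)

# Fixed pattern: build a fresh dict for every comment.
# str() is used because a real PRAW author is an object, not a plain string.
rows = [
    {"Author": str(c.author), "Comment": c.body, "Upvotes": c.score}
    for c in comments
]
print(rows)
```

A list of dicts like `rows` can be passed straight to `pd.DataFrame(rows)` to get one row per comment, with the dict keys as column names.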

Upvotes: 1
