Reputation: 11
I'm looking to pull all the comments from a reddit post and ultimately get the author name, comment, and upvotes into a dataframe. I'm fairly new to programming, so I'm having a tough time.
Right now I'm pulling the stickied post using PRAW and trying to use a for loop to iterate through its comments and create a list of dictionaries with the author and comment. For some reason it only adds the first author/comment dictionary pairing to the list and repeats it. Here's what I have:
import praw
import pandas as pd
import pprint

reddit = praw.Reddit(xxx)
sub = reddit.subreddit('ethtrader')
hot_python = sub.hot(limit=1)

for submissions in hot_python:
    if submission.stickied:
        print('Title: {}, ups: {}, downs: {}'.format(submissions.title, submissions.ups, submissions.downs))
        post = {}
        postlist = []
        submission.comments.replace_more(limit=0)
        for comment in submission.comments:
            post['Author'] = comment.author
            post['Comment'] = comment.body
            postlist.append(post)
Any ideas? Apologies for the ugly code; I'm a novice here. Thanks!
Upvotes: 1
Views: 1452
Reputation: 139
# the loop variable is named `submission` so it matches the uses below
for submission in hot_python:
    if submission.stickied:
        print('Title: {}, ups: {}, downs: {}'.format(submission.title, submission.ups, submission.downs))
        postlist = []
        submission.comments.replace_more(limit=0)
        for comment in submission.comments:
            post = {}  # put this here
            post['Author'] = comment.author
            post['Comment'] = comment.body
            postlist.append(post)
You should create a new post dict inside the for loop. When you append it to the list, you are actually appending a reference to the post dict, not a copy. When you then overwrite that same dict with the next comment's data, the change is visible through every reference to it. Your list at the end is just a list of references to the same dict.
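Here is a small sketch of that behaviour that runs without Reddit credentials. The `comments` list is a hypothetical stand-in for `submission.comments`, built from `SimpleNamespace` objects carrying only the fields used here. The `score` field assumes that is where PRAW exposes a comment's upvotes, so check it against your PRAW version.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for PRAW comment objects (illustration only).
comments = [
    SimpleNamespace(author="alice", body="first", score=3),
    SimpleNamespace(author="bob", body="second", score=7),
]

# Buggy pattern: one dict is created outside the loop and appended repeatedly.
shared = {}
aliased = []
for comment in comments:
    shared["Author"] = comment.author
    shared["Comment"] = comment.body
    aliased.append(shared)

# Both list entries are the very same object, so they show identical data.
print(aliased[0] is aliased[1])
print(aliased)

# Fixed pattern: build a fresh dict for every comment.
rows = [
    {"Author": str(c.author), "Comment": c.body, "Upvotes": c.score}
    for c in comments
]
print(rows)
```

With the fixed pattern, `pd.DataFrame(rows)` should give one row per comment with `Author`, `Comment`, and `Upvotes` columns. Wrapping the author in `str()` is a choice made here so the column holds plain names rather than author objects.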
Upvotes: 1