Code Review Asked by User00257 on December 14, 2020
My objective is to find out which other subreddits the users of r/(subreddit) are posting on; you can see my code below. It works pretty well, but I am curious to know if I could improve it in two ways:
First, by restricting my code so that it considers each user only once (i.e. it does not collect the posting history twice for the same user) and, secondly, by requiring a minimum of 5 posts per user before extracting their info (i.e. if the user has written fewer than 5 posts in their Reddit life, my code would not consider them).
import praw
import pandas as data
import datetime as time

reddit = praw.Reddit(client_id = 'XXXX',
                     client_secret = 'XXXX',
                     username = 'XXXX',
                     password = 'XXXX',
                     user_agent = 'XXXX')

collumns = { "User":[], "Subreddit":[], "Title":[], "Description":[], "Timestamp":[]}

for submission in reddit.subreddit("ENTER SUBREDDIT").new(limit=100):
    user = reddit.redditor('{}'.format(submission.author))
    for sub in user.submissions.new(limit=100):
        collumns["User"].append(sub.author)
        collumns["Subreddit"].append(sub.subreddit)
        collumns["Title"].append(sub.title)
        collumns["Description"].append(sub.selftext)
        collumns["Timestamp"].append(sub.created)

collumns_data = data.DataFrame(collumns)

def get_date(Timestamp):
    return time.datetime.fromtimestamp(Timestamp)

_timestamp = collumns_data["Timestamp"].apply(get_date)
collumns_data = collumns_data.assign(Timestamp = _timestamp)
collumns_data.to_csv('DataExport.csv')
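
For the two improvements asked about, one possible approach is to remember which authors have already been handled in a set and to fetch each user's history into a list first, so its length can be checked before any rows are stored. The following is only a minimal sketch under assumptions: `collect_histories`, `fetch_posts`, and `fake_history` are hypothetical names that are not part of the code above, and the Reddit calls are replaced by stand-in data so the logic can be run without credentials.

```python
def collect_histories(authors, fetch_posts, min_posts=5):
    """Return {author_name: posts} for each distinct author with >= min_posts posts.

    authors     -- iterable of author names (may contain repeats or None)
    fetch_posts -- hypothetical callable: author name -> iterable of that user's posts
    """
    seen = set()
    histories = {}
    for author in authors:
        if author is None:
            # PRAW reports the author of a deleted account as None.
            continue
        name = str(author)
        if name in seen:
            # This user's history was already fetched (or rejected) earlier.
            continue
        seen.add(name)
        posts = list(fetch_posts(name))
        if len(posts) >= min_posts:
            histories[name] = posts
    return histories


# Stand-in data (an assumption, not real Reddit output).
fake_history = {
    "alice": ["p1", "p2", "p3", "p4", "p5", "p6"],
    "bob": ["p1", "p2"],
}

result = collect_histories(
    ["alice", "bob", "alice", None],
    lambda name: fake_history.get(name, []),
)
print(sorted(result))  # prints ['alice']
```

With the code above, `authors` could be the `submission.author` values from the outer loop and `fetch_posts` could wrap `reddit.redditor(name).submissions.new(limit=100)`; the rows for the DataFrame would then be built only from the histories that are returned. Note that with `limit=100` the length check only sees up to 100 posts per user, which is enough for a threshold of 5.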