RonRon

Reputation: 11

Unable to parse RSS feed using Python, but RSS feed apps in Chrome can parse the data

I'm working on a basic Python script that parses RSS feed data from the SEC.gov website, but it fails when I run it. Where am I going wrong?

I'm using Python 3.6.5 and have tried both the Atoma and feedparser libraries, but neither pulls any SEC RSS data successfully. It may be that the feed itself is not in a valid format (I checked it at https://validator.w3.org/feed/, which reports the data as invalid). However, the same URL works in a Google Chrome RSS feed extension, so I must be doing something wrong. Does anyone know how to fix the format issue, or am I going about this the wrong way in Python?

import atoma, requests

feed_name = "SEC FEED"
url ='https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0001616707&type=&dateb=&owner=exclude&start=0&count=100&output=atom'
response = requests.get(url)
feed = atoma.parse_rss_bytes(response.content)

for post in feed.items:
  date = post.pub_date.strftime('(%Y/%m/%d)')
  print("post date: " + date)
  print("post title: " + post.title)
  print("post link: " + post.link)

Upvotes: 1

Views: 802

Answers (1)

Rolf Carlson

Reputation: 883

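The URL requests output=atom, so the feed is Atom rather than RSS, which is likely why atoma.parse_rss_bytes fails on it. Here is another way to solve the problem in Python, using feedparser: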

import datetime

import feedparser
import requests

feed_name = "SEC FEED"
url = 'https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0001616707&type=&dateb=&owner=exclude&start=0&count=100&output=atom'

# Download the raw feed, then let feedparser work out the format.
response = requests.get(url)
feed = feedparser.parse(response.content)

for entry in feed['entries']:
    # The feed has no pub_date; each entry carries a filing-date (YYYY-MM-DD).
    dt = datetime.datetime.strptime(entry['filing-date'], '%Y-%m-%d')
    print('Date:', dt.strftime('(%Y/%m/%d)'))
    print('Title:', entry['title'])
    print(entry['link'])
    print('\n')
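
If parsing still fails, one quick check is to look at the XML root element of the downloaded bytes before choosing a parser. The sketch below uses only the standard library; feed_kind is a hypothetical helper name, not part of atoma or feedparser, and it only distinguishes Atom from RSS 2.0-style feeds.

```python
import xml.etree.ElementTree as ET

# Namespace that Atom documents declare on their root <feed> element.
ATOM_NS = '{http://www.w3.org/2005/Atom}'


def feed_kind(content):
    """Classify raw feed bytes as 'atom', 'rss', or 'unknown' from the XML root tag."""
    try:
        root = ET.fromstring(content)
    except ET.ParseError:
        # Not well-formed XML, for example an HTML error page.
        return 'unknown'
    if root.tag == ATOM_NS + 'feed':
        return 'atom'
    if root.tag == 'rss':
        return 'rss'
    return 'unknown'


# Tiny hand-written sample, not real SEC data.
sample = b'<?xml version="1.0"?><feed xmlns="http://www.w3.org/2005/Atom"><title>x</title></feed>'
print(feed_kind(sample))
```

With the script above, this would be called as feed_kind(response.content). If it reports 'atom', atoma's parse_atom_bytes is the matching entry point rather than parse_rss_bytes.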

The feed at that URL has no pub_date field, but you can use filing-date or choose a different date field. The output should look like:

Date: (2021/03/11)
Title: 8-K - Current report
https://www.sec.gov/Archives/edgar/data/1616707/000161670721000075/0001616707-21-000075-index.htm

Date: (2021/02/25)
Title: S-8 - Securities to be offered to employees in employee benefit plans
https://www.sec.gov/Archives/edgar/data/1616707/000161670721000066/0001616707-21-000066-index.htm

Date: (2021/02/25)
Title: 10-K - Annual report [Section 13 and 15(d), not S-K Item 405]
https://www.sec.gov/Archives/edgar/data/1616707/000161670721000064/0001616707-21-000064-index.htm
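
Since the date field can vary between feeds, a small fallback helper may be more robust than indexing filing-date directly. This is only a sketch: entry_date and the key order in DATE_KEYS are assumptions for illustration, and it relies on entries supporting dict-style .get(), as plain dicts and feedparser entries do.

```python
import datetime

# Hypothetical preference order; adjust to the fields your feed actually has.
DATE_KEYS = ('filing-date', 'updated', 'published')


def entry_date(entry, keys=DATE_KEYS):
    """Return the first parseable date found under `keys`, or None."""
    for key in keys:
        value = entry.get(key)
        if not value:
            continue
        try:
            # Keep only the leading YYYY-MM-DD part so full ISO timestamps also parse.
            return datetime.datetime.strptime(value[:10], '%Y-%m-%d').date()
        except ValueError:
            continue
    return None


# Hand-written sample entry, not real feed output.
sample_entry = {'title': '8-K - Current report', 'updated': '2021-03-11T16:31:05-05:00'}
print(entry_date(sample_entry))  # 2021-03-11
```

In the loop above, dt = entry_date(entry) would replace the strptime call, with a check for None before formatting.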

Upvotes: 1
