shankaran

Reputation: 13

How to use Hacker News API in Python?

Hacker News has released an API. How do I use it in Python?

I want to get all the top posts. I tried using urllib2, but I don't think I am doing it right.

Here's my code:

import urllib2
response = urllib2.urlopen('https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty')
html = response.read()
print response.read()

It just prints an empty string:

''

I missed a line and have updated my code.

Upvotes: 0

Views: 3633

Answers (2)

avi

Reputation: 9636

As @jonrsharpe explained, read() is a one-time operation. If you print html instead, you will get the list of all story IDs. To get the stories themselves, you then have to make a separate request for each ID in that list.

First, convert the received data to a Python list, then loop over it:

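import json
import urllib2

base_url = 'https://hacker-news.firebaseio.com/v0/item/{}.json?print=pretty'
# html is the string saved from the first response.read() in the question
top_story_ids = json.loads(html)
for story in top_story_ids:
    # one request per story ID
    response = urllib2.urlopen(base_url.format(story))
    print response.read()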
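
The snippets in this thread are Python 2. As a rough sketch of the same loop for Python 3 (where urllib2.urlopen lives at urllib.request.urlopen), something like the following should work. The names fetch_json and top_stories, and the limit and fetch parameters, are made up for this illustration and are not part of the Hacker News API; the URLs are the ones used above.

```python
import json
from urllib.request import urlopen

TOP_URL = "https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty"
ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{}.json?print=pretty"


def fetch_json(url):
    """Download a URL and decode its JSON body (hypothetical helper)."""
    with urlopen(url) as response:
        # read() is called exactly once and its result is used right away
        return json.loads(response.read().decode("utf-8"))


def top_stories(limit=5, fetch=fetch_json):
    """Return the first `limit` top stories as parsed JSON values.

    `fetch` is injectable so the function can be tried without a network.
    """
    story_ids = fetch(TOP_URL)[:limit]
    # One extra request per story ID, as in the loop above
    return [fetch(ITEM_URL.format(story_id)) for story_id in story_ids]


# Example use (performs real HTTP requests):
#     for story in top_stories(limit=5):
#         print(story)
```

Limiting the number of IDs keeps the example from firing one request for every entry in the list.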

Instead of all this, you could use haxor, a Python wrapper for the Hacker News API. The following code fetches the IDs of all the top stories:

from hackernews import HackerNews
hn = HackerNews()
top_story_ids = hn.top_stories()
# >>> top_story_ids
# [8432709, 8432616, 8433237, ...]

Then you can loop over that list and print each item, for example:

for story in top_story_ids:
    print hn.get_item(story)

Disclaimer: I wrote haxor.

Upvotes: 5

jonrsharpe

Reputation: 122137

You should

print html

instead of

print response.read()

Why? Because the read is a one-time operation; after you've done it, you can't repeat it:

>>> import urllib2
>>> response = urllib2.urlopen('https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty')
>>> response.read()
'[ 8445087, 8444739, 8444603, 8443981, 8444976, 8443902, 8444252, 8444634, 8444931, 8444272, 8444025, 8441939, 8444510, 8444640, 8443830, 8445076, 8443470, 8444785, 8443028, 8444077, 8444832, 8443841, 8443467, 8443309, 8443187, 8443896, 8444971, 8443360, 8444601, 8443287, 8441095, 8441681, 8441055, 8442712, 8444909, 8443621, 8442596, 8443836, 8442266, 8443298, 8445122, 8443096, 8441699, 8442119, 8442965, 8440486, 8442093, 8443393, 8442067, 8444989, 8440985, 8444622, 8438728, 8442555, 8444880, 8442004, 8443185, 8444370, 8436210, 8437671, 8439641, 8443727, 8441702, 8436309, 8441041, 8437367, 8422087, 8441711, 8438063, 8444212, 8439408, 8442049, 8440989, 8439367, 8438515, 8437403, 8435278, 8442486, 8442730, 8428522, 8438904, 8443450, 8432703, 8430412, 8422928, 8443635, 8439267, 8440191, 8439560, 8437230, 8442556, 8439977, 8444140, 8441682, 8443776, 8441209, 8428632, 8441388, 8422599, 8439547 ]\n'
>>> response.read()
''

In your case, though, you've assigned the string from read to the name html, so you can still access it.
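
The same behaviour can be reproduced without a network connection. This is only a minimal sketch, assuming Python 3: io.BytesIO is used here as a stand-in for the response object (the real one comes from urlopen), and the two IDs are taken from the output above.

```python
import io

# Stand-in for the HTTP response: a file-like object whose read()
# consumes the underlying stream.
response = io.BytesIO(b"[ 8445087, 8444739 ]\n")

html = response.read()    # first read returns the whole body
second = response.read()  # the stream is exhausted, so this is empty

print(repr(html))
print(repr(second))
```

Because the first result was bound to html, it stays available even though the stream itself has nothing left to give.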


Once you have the story IDs, you can access each one via '.../v0/item/{item number}.json?print=pretty':

>>> response = urllib2.urlopen('https://hacker-news.firebaseio.com/v0/item/8445087.json?print=pretty')
>>> print response.read()
{
  "by" : "lalmachado",
  "id" : 8445087,
  "kids" : [ 8445205, 8445195, 8445173, 8445103 ],
  "score" : 21,
  "text" : "",
  "time" : 1413116430,
  "title" : "Show HN: Powerful ASCII art editor designed for the Mac",
  "type" : "story",
  "url" : "http://monodraw.helftone.com/"
}

You should read through the API documentation before continuing. It's also worth getting to grips with the json module.
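
As a small illustration of the json module, the sketch below parses a copy of the item response shown above (with the empty "text" and the "time" fields left out for brevity). It assumes Python 3; the variable names body and story are arbitrary.

```python
import json

# Copy of the item body printed above, minus the "text" and "time" fields.
body = """{
  "by" : "lalmachado",
  "id" : 8445087,
  "kids" : [ 8445205, 8445195, 8445173, 8445103 ],
  "score" : 21,
  "title" : "Show HN: Powerful ASCII art editor designed for the Mac",
  "type" : "story",
  "url" : "http://monodraw.helftone.com/"
}"""

story = json.loads(body)   # a JSON object becomes a dict
print(story["title"])
print(story["url"])
print(len(story["kids"]))  # a JSON array becomes a list
```

The same json.loads call turns the top stories response into a plain list of integer IDs.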

Upvotes: 1
