Reputation: 4117
When I put a Twitter feed (https://api.twitter.com/1/statuses/user_timeline.rss?screen_name=chulian1819) into Yahoo Pipes, I get a 400 error, and when I use the YQL console it says "Redirected to a robots.txt restricted URL: https://api.twitter.com/1/statuses/user_timeline.rss?screen_name=chulian1819"
http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22https%3A%2F%2Fapi.twitter.com%2F1%2Fstatuses%2Fuser_timeline.rss%3Fscreen_name%3Dchulian1819%22&diagnostics=true
How can I get a user's Twitter feed into Yahoo Pipes?
Thanks!
PS: My tweets are not protected; I can see the RSS feed in my browser even when I am not logged into Twitter.
Upvotes: 0
Views: 1438
Reputation: 31
Hi there, I was able to make a mixed Twitter feed using Yahoo! Pipes. I tried a lot of other "programs", but Yahoo! Pipes just rules for this one ;)
I used Fetch Feed, Sort and Regex to do my thing.
The following details may be interesting for other people.
The URLs you can fetch from:
http://api.twitter.com/1/statuses/user_timeline.rss?screen_name=REPLACEWITHNAME
http://api.twitter.com/1/statuses/user_timeline.rss?screen_name=REPLACEWITHOTHERNAME ...
Sort by item.pubDate to get a mix of the feeds ordered by date.
I also use Regex to remove URLs from the text: (https?://([-\w.]+)+(:\d+)?(/([\w/_.]*(\?\S+)?)?)?)
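For anyone who wants to try the same sort-and-regex steps outside of Pipes, here is a rough Python sketch. It is not what Pipes does internally; it assumes each feed item is already parsed into a dict with hypothetical "title" and "pubDate" keys, and it assumes newest-first ordering. The regex is the one quoted above, unchanged.

```python
import re
from email.utils import parsedate_to_datetime

# Regex copied verbatim from the post; it matches http/https URLs.
URL_PATTERN = re.compile(r"(https?://([-\w.]+)+(:\d+)?(/([\w/_.]*(\?\S+)?)?)?)")


def strip_urls(text):
    """Remove URLs, then collapse the extra whitespace they leave behind."""
    return " ".join(URL_PATTERN.sub("", text).split())


def mix_feeds(*feeds):
    """Merge several lists of items and sort them by pubDate.

    Assumption: items are dicts with "title" and "pubDate" keys, and
    pubDate is an RFC 822 date string as used in RSS.
    Newest-first order is also an assumption.
    """
    merged = [item for feed in feeds for item in feed]
    merged.sort(key=lambda item: parsedate_to_datetime(item["pubDate"]),
                reverse=True)
    # Return copies with cleaned titles so the input lists stay untouched.
    return [dict(item, title=strip_urls(item["title"])) for item in merged]
```

Call mix_feeds(feed_a, feed_b) with one list of items per Twitter account; the result is a single list ordered by date with URLs stripped from the titles.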
There are probably pre-made public Yahoo Pipes that you can simply clone and adapt, but I haven't looked into that, so maybe someone else can post about it.
Anyway, hope it helps.
Upvotes: 3
Reputation: 646
When Yahoo Pipes retrieves content from an RSS feed or a web page, it identifies itself with a User-Agent string in the request header. This string is fixed by Yahoo and cannot be changed, so if the site being scraped has blocked Yahoo Pipes, you are out of luck.
The only workaround is to switch to cURL, which can mimic a web browser's User-Agent string and does not check robots.txt. However, this means using a PHP-enabled web server or Google App Engine to grab the feed.
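As a rough illustration of the same idea, here is a minimal sketch that uses Python's standard urllib instead of cURL/PHP (a swapped-in tool, not the setup described above). The User-Agent value is just an example browser string, the function names are made up for illustration, and the feed URL is the one from the question, which may not respond.

```python
import urllib.request

# URL taken from the question; whether it still responds is not guaranteed.
FEED_URL = ("https://api.twitter.com/1/statuses/user_timeline.rss"
            "?screen_name=chulian1819")

# Example browser-like User-Agent string (an assumption; any browser string works).
BROWSER_UA = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/120.0 Safari/537.36")


def build_request(url, user_agent=BROWSER_UA):
    """Create a request that identifies itself with a browser User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_feed(url, timeout=10):
    """Download the raw feed bytes; raises urllib.error.URLError on failure."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Calling fetch_feed(FEED_URL) returns the raw RSS bytes, which you can then serve from your own server and point Fetch Feed at instead of the original URL.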
Upvotes: 2