I'm currently using Node to scrape a blog and store selected data in a JSON file. When scraping a blog post that contains an embedded SoundCloud track, I seem to only be able to collect the iframe src and not the actual track link (either the SoundCloud link or the stream link).
When I scrape the iframe src url I seem to only be able to get a link that's in the following format: https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/120261008&color=000000&auto_play=false&show_artwork=false
If I'm not able to scrape the track URL, is there a way to manipulate how the above link is stored in the array? For the link to be usable, I need to store only the value of the url parameter, https%3A//api.soundcloud.com/tracks/120261008 (without the url= prefix).
The problem with this is that the %3A then needs to be replaced with a colon (:).
What's the best way to manipulate the URL to get the desired output, either when it's being stored or when it's being read back?
Upvotes: 1
Views: 1363
I'm not exactly sure what you're planning on doing with the track URL once you have it, but to get the permalink URL for a track/playlist you're going to need a two-step approach. First you're going to need to parse the url parameter in the query string of the iframe src:
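var CLIENT_ID = 'client_id=b45b1aa10f1ac2941910a7f0d10f8e28';
var src = 'https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/120261008&color=000000&auto_play=false&show_artwork=false';
// Capture everything after "url=" up to the next "&"
var match = src.match(/url=([^&]*)/);
// match[0] is the whole "url=..." pair, still encoded, so it can be reused as-is in a query string
var resource = match[0];
// match[1] is the encoded API URL; decoding turns %3A back into ":"
var stream = decodeURIComponent(match[1]) + '/stream/?' + CLIENT_ID;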
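Since the question is about Node, the same extraction can also be sketched with the built-in URL class instead of a regex. This is only a minimal sketch: extractTrackUrl is a hypothetical helper name, not something from SoundCloud, and it assumes the iframe src is a well-formed absolute URL. searchParams.get() returns the parameter already percent-decoded, so the %3A comes back as a colon without a separate decodeURIComponent call.

```javascript
// Hypothetical helper: pull the decoded "url" parameter out of a
// SoundCloud player iframe src. Returns null if the parameter is missing.
function extractTrackUrl(iframeSrc) {
  // URL is a global in modern Node and in browsers
  const parsed = new URL(iframeSrc);
  // searchParams.get() percent-decodes the value for us
  return parsed.searchParams.get('url');
}

const src = 'https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/120261008&color=000000&auto_play=false&show_artwork=false';
console.log(extractTrackUrl(src));
// prints "https://api.soundcloud.com/tracks/120261008"
```

Storing the decoded value means nothing has to be unescaped later when the array is read back.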
Then you're going to need to make an HTTP request to SoundCloud's resolve API to actually convert that resource into the permalink URL:
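// resource is the still-encoded "url=..." pair from the previous step
var url = 'http://api.soundcloud.com/resolve.json?' + resource + '&' + CLIENT_ID;
// XMLHttpRequest is a browser API and is not built into Node; use an HTTP client there
var xhr = new XMLHttpRequest();
xhr.open('GET', url, true);
xhr.onload = function () {
  var data = JSON.parse(xhr.responseText);
  // do something with the data
  console.log(data.permalink_url);
};
xhr.send();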
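For a Node scraper, the resolve step might look roughly like the sketch below. It assumes a Node version with a global fetch (18 or later), reuses the resolve.json endpoint and permalink_url field exactly as they appear in the snippet above, and uses hypothetical helper names (buildResolveUrl, resolvePermalink) plus a placeholder client id. Whether the endpoint still accepts a bare client_id is not something this sketch verifies.

```javascript
// Hypothetical helper: build the resolve request URL from a decoded
// track API URL and a client id. URLSearchParams handles the encoding.
function buildResolveUrl(trackApiUrl, clientId) {
  const resolve = new URL('http://api.soundcloud.com/resolve.json');
  resolve.searchParams.set('url', trackApiUrl);
  resolve.searchParams.set('client_id', clientId);
  return resolve.toString();
}

// Hypothetical helper: fetch the resolve response and return its
// permalink_url field. Assumes global fetch (Node 18+).
async function resolvePermalink(trackApiUrl, clientId) {
  const response = await fetch(buildResolveUrl(trackApiUrl, clientId));
  if (!response.ok) {
    throw new Error('Resolve request failed with status ' + response.status);
  }
  const data = await response.json();
  return data.permalink_url;
}
```

A call would look like resolvePermalink('https://api.soundcloud.com/tracks/120261008', 'YOUR_CLIENT_ID').then(console.log), with YOUR_CLIENT_ID replaced by a real client id.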
Upvotes: 2