Reputation: 59
I already have a ListBox full of URLs like the one below, which I convert to strings:
http://example.com/1392/Music/1392/Shahrivar/21/Avicii%20-%20True/01.%20Avicii%20Ft.%20Aloe%20Blacc%20-%20Wake%20Me%20Up%20(CDQ)%20%5b320%5d.mp3
From this link, for example, I want to extract the name of the song: "Avicii Ft Aloe Blacc -Wake Me Up ". I'm using C#. I have already extracted the links from a web page, and now I only need to extract the names from the links. Thanks in advance for any suggestions or help.
Upvotes: 0
Views: 143
Reputation: 389
First of all, use HttpUtility.UrlDecode. This function decodes the percent-encoded characters in the URL (for example, %20 becomes a space), leaving a plain string to work with. You can then simply split by /.
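As a minimal sketch of the decode-then-split idea: the example below uses WebUtility.UrlDecode from System.Net, which performs similar percent-decoding and avoids a System.Web reference; HttpUtility.UrlDecode could be swapped in. The class and method names (DecodeAndSplit, LastPathPart) are made up for illustration.

```csharp
using System;
using System.Net;

static class DecodeAndSplit
{
    // Hypothetical helper: decodes the URL, splits on '/', returns the last piece.
    public static string LastPathPart(string url)
    {
        string decoded = WebUtility.UrlDecode(url);   // "%20" -> " ", "%5b" -> "["
        string[] parts = decoded.Split('/');
        return parts[parts.Length - 1];               // the file name is the final part
    }

    static void Main()
    {
        var url = "http://example.com/1392/Music/1392/Shahrivar/21/Avicii%20-%20True/01.%20Avicii%20Ft.%20Aloe%20Blacc%20-%20Wake%20Me%20Up%20(CDQ)%20%5b320%5d.mp3";
        Console.WriteLine(LastPathPart(url));
        // prints: 01. Avicii Ft. Aloe Blacc - Wake Me Up (CDQ) [320].mp3
    }
}
```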
Upvotes: 0
Reputation: 5884
Try this:
using System;
using System.Linq;
using System.Net;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var url = "http://example.com/1392/Music/1392/Shahrivar/21/Avicii%20-%20True/01.%20Avicii%20Ft.%20Aloe%20Blacc%20-%20Wake%20Me%20Up%20(CDQ)%20%5b320%5d.mp3";
            var uri = new Uri(url);
            // Decode each path segment and drop its trailing slash; the last element is the file name.
            string[] segments = uri.Segments.Select(x => WebUtility.UrlDecode(x).TrimEnd('/')).ToArray();
        }
    }
}
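The code above stops once the decoded segments are in an array. One possible next step, sketched here under the assumption that the file name is always the last segment, is to take that segment and drop its extension with Path.GetFileNameWithoutExtension. The names SegmentExample and GetFileTitle are hypothetical.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Net;

static class SegmentExample
{
    // Hypothetical helper: returns the decoded last path segment without its extension.
    public static string GetFileTitle(string url)
    {
        var uri = new Uri(url);
        string lastSegment = WebUtility.UrlDecode(uri.Segments.Last()).TrimEnd('/');
        return Path.GetFileNameWithoutExtension(lastSegment);   // strips ".mp3"
    }

    static void Main()
    {
        var url = "http://example.com/1392/Music/1392/Shahrivar/21/Avicii%20-%20True/01.%20Avicii%20Ft.%20Aloe%20Blacc%20-%20Wake%20Me%20Up%20(CDQ)%20%5b320%5d.mp3";
        Console.WriteLine(GetFileTitle(url));
        // prints: 01. Avicii Ft. Aloe Blacc - Wake Me Up (CDQ) [320]
    }
}
```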
Upvotes: 1
Reputation: 16
If you know the structure of the URLs you are scraping, you should be able to strip off the part of the string you don't need.
For example, if you know that the URL follows the form: http://example.com/1392/Music/1392/Shahrivar/21/{Artist}-{Album}/{Track Information}
Roughly, I think the following would allow you to extract the information you want from a single link:
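using System;

class Program
{
    static void Main(string[] args)
    {
        var example = @"http://example.com/1392/Music/1392/Shahrivar/21/Avicii%20-%20True/01.%20Avicii%20Ft.%20Aloe%20Blacc%20-%20Wake%20Me%20Up%20(CDQ)%20%5b320%5d.mp3";
        // Splitting on '/' gives: [0] "http:", [1] "", [2] "example.com", ..., [8] album folder, [9] file name.
        var parts = example.Split('/');
        var album = parts[8];
        var trackInfo = parts[9];
        // Within the file name, artist and title are separated by '-'.
        var trackParts = trackInfo.Split('-');
        var artist = trackParts[0];
        var trackTitle = trackParts[1];
        Console.WriteLine(trackTitle);   // still percent-encoded at this point
    }
}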
Here I am splitting the URL by '/', which is a messy solution, but for a simple case it works. Then I find the index within the tokenized string where the desired information is located. Once I have the track information, I know the convention is to separate the artist from the title with a '-', so I split again and end up with both artist and title.
You can refactor this into a method that takes the URL and returns an object containing the artist and song title. After that, you might want to use string.Replace to turn each '%20' into ' '.
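A rough sketch of that refactoring might look like the following. It swaps string.Replace for Uri.UnescapeDataString so that escapes such as %5b and %5d are decoded as well, and it takes the last part of the split rather than a fixed index. The TrackInfo and TrackParser names, and the two cleanup patterns (a leading "01. " track number, trailing "(CDQ)" or "[320]" tags), are assumptions based only on the single example URL.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical result type holding the two extracted values.
sealed class TrackInfo
{
    public string Artist { get; set; }
    public string Title { get; set; }
}

static class TrackParser
{
    public static TrackInfo Parse(string url)
    {
        string[] parts = url.Split('/');
        // Decode the file name and drop the ".mp3" extension.
        string fileName = Uri.UnescapeDataString(parts[parts.Length - 1]);
        string name = Path.GetFileNameWithoutExtension(fileName);

        // Assumed convention: "<artist> - <title>"; split on the first " - " only.
        string[] pieces = name.Split(new[] { " - " }, 2, StringSplitOptions.None);
        string artist = pieces.Length == 2 ? pieces[0] : "";
        string title = pieces.Length == 2 ? pieces[1] : pieces[0];

        // Assumed cleanup: remove a leading track number and trailing bracketed tags.
        artist = Regex.Replace(artist, @"^\d+\.\s*", "");
        title = Regex.Replace(title, @"(\s*[\(\[][^\)\]]*[\)\]])+$", "");

        return new TrackInfo { Artist = artist.Trim(), Title = title.Trim() };
    }

    static void Main()
    {
        var url = "http://example.com/1392/Music/1392/Shahrivar/21/Avicii%20-%20True/01.%20Avicii%20Ft.%20Aloe%20Blacc%20-%20Wake%20Me%20Up%20(CDQ)%20%5b320%5d.mp3";
        TrackInfo track = Parse(url);
        Console.WriteLine(track.Artist + " - " + track.Title);
        // prints: Avicii Ft. Aloe Blacc - Wake Me Up
    }
}
```

File names that do not follow the assumed "artist - title" convention come back with an empty Artist and the whole name as Title, so callers can detect and handle them separately.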
Upvotes: 0