Elmissouri

Reputation: 59

How to extract specific names from a URL?

I already have a ListBox full of URLs like the one below, which I convert to strings:

http://example.com/1392/Music/1392/Shahrivar/21/Avicii%20-%20True/01.%20Avicii%20Ft.%20Aloe%20Blacc%20-%20Wake%20Me%20Up%20(CDQ)%20%5b320%5d.mp3

From this link I want to extract the name of the song, for example "Avicii Ft Aloe Blacc - Wake Me Up". I'm using C#. I have already extracted the links from a web page, and now I only need to extract the names from the links. Thanks in advance for any suggestions or help.

Upvotes: 0

Views: 143

Answers (3)

Vladimir Chikrizov

Reputation: 389

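First of all, decode the URL with HttpUtility.UrlDecode (in System.Web) or WebUtility.UrlDecode (in System.Net). This converts percent-encoded sequences such as %20 back to plain characters, leaving an ordinary string to work with. You can then simply split on /.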
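As a rough sketch of that idea, the snippet below takes the last path segment of the URL, decodes it, and drops the file extension. The class name UrlFileName and the method FromUrl are made up for illustration, and the sketch uses Uri.UnescapeDataString for the decoding step so that it only needs the System namespace. It isolates the segment before decoding, so an encoded slash inside a name would not be mistaken for a separator.

```csharp
using System;

public static class UrlFileName
{
    // Hypothetical helper: returns the decoded last path segment without its extension.
    public static string FromUrl(string url)
    {
        var uri = new Uri(url);

        // AbsolutePath is still percent-encoded, so isolate the last segment first.
        string path = uri.AbsolutePath;
        string lastSegment = path.Substring(path.LastIndexOf('/') + 1);

        // Turn %20, %5b, %5d and similar escapes back into plain characters.
        string decoded = Uri.UnescapeDataString(lastSegment);

        // Strip the extension (everything from the last '.' onward), if there is one.
        int dot = decoded.LastIndexOf('.');
        return dot > 0 ? decoded.Substring(0, dot) : decoded;
    }
}
```

For the URL in the question this should give "01. Avicii Ft. Aloe Blacc - Wake Me Up (CDQ) [320]", which still contains the track number and the quality tags.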

Upvotes: 0

apocalypse

Reputation: 5884

Try this:

using System;
using System.Linq;
using System.Net;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main (string[] args)
        {
            var url = "http://example.com/1392/Music/1392/Shahrivar/21/Avicii%20-%20True/01.%20Avicii%20Ft.%20Aloe%20Blacc%20-%20Wake%20Me%20Up%20(CDQ)%20%5b320%5d.mp3";

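            var uri = new Uri (url);

            // Decode each path segment and trim its trailing '/'.
            string[] segments = uri.Segments.Select (x => WebUtility.UrlDecode (x).TrimEnd ('/')).ToArray ();

            // The last segment is the file name: "01. Avicii Ft. Aloe Blacc - Wake Me Up (CDQ) [320].mp3"
            Console.WriteLine (segments.Last ());
        }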
    }
}

Upvotes: 1

djscheuf

Reputation: 16

If you know the structure of the URLs you are scraping, you should be able to strip off the parts of the string you don't need.

For example, if you know that the URL follows the form: http://example.com/1392/Music/1392/Shahrivar/21/{Artist}-{Album}/{Track Information}

Roughly, I think the following would allow you to extract the information you want from a single link:

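using System;

class Program
{
    static void Main (string[] args)
    {
        var example = @"http://example.com/1392/Music/1392/Shahrivar/21/Avicii%20-%20True/01.%20Avicii%20Ft.%20Aloe%20Blacc%20-%20Wake%20Me%20Up%20(CDQ)%20%5b320%5d.mp3";

        // Splitting on '/' yields "http:", "", "example.com", "1392", "Music", "1392", "Shahrivar", "21",
        // so the album folder is at index 8 and the file name at index 9.
        var parts = example.Split('/');
        var album = parts[8];
        var trackInfo = parts[9];

        // The file name separates the artist from the title with '-'.
        var trackParts = trackInfo.Split('-');
        var artist = trackParts[0];
        var trackTitle = trackParts[1];

        // The title is still percent-encoded at this point.
        Console.WriteLine(trackTitle);
    }
}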

Here I split the URL on '/', which is a messy solution, but it works for a simple case. I then pick out the indexes in the resulting array where the desired information sits. Once I have the track information, I know the convention is to separate the artist from the title with a '-', so I split again and end up with both artist and title.

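You can refactor this into a method that takes the URL and returns an object containing the artist and song title. After that, you might want to replace '%20' with ' ' using string.Replace, or decode the whole string with Uri.UnescapeDataString, which also handles other escapes such as %5b and %5d.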
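One possible shape for that refactoring is sketched below. The names TrackInfo and TrackUrlParser are invented for this example, and the parsing rules are assumptions based on the single URL in the question: a leading track number such as "01.", an " - " separator between artist and title, and trailing tags in parentheses or square brackets.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical result type holding the two values of interest.
public sealed class TrackInfo
{
    public string Artist { get; }
    public string Title { get; }

    public TrackInfo(string artist, string title)
    {
        Artist = artist;
        Title = title;
    }
}

public static class TrackUrlParser
{
    public static TrackInfo Parse(string url)
    {
        // Take the last path segment and decode its percent-escapes.
        string path = new Uri(url).AbsolutePath;
        string name = Uri.UnescapeDataString(path.Substring(path.LastIndexOf('/') + 1));

        // Drop the file extension, if any.
        int dot = name.LastIndexOf('.');
        if (dot > 0) name = name.Substring(0, dot);

        // Assumed convention: a leading track number such as "01. ".
        name = Regex.Replace(name, @"^\d+\.\s*", "");

        // Assumed convention: trailing tags such as " (CDQ) [320]".
        name = Regex.Replace(name, @"(\s*(\([^)]*\)|\[[^\]]*\]))+$", "");

        // Assumed convention: artist and title are separated by " - ".
        string[] parts = name.Split(new[] { " - " }, 2, StringSplitOptions.None);
        if (parts.Length != 2)
            throw new FormatException("Expected 'Artist - Title' but found: " + name);

        return new TrackInfo(parts[0].Trim(), parts[1].Trim());
    }
}
```

Calling TrackUrlParser.Parse on the URL from the question should give an Artist of "Avicii Ft. Aloe Blacc" and a Title of "Wake Me Up". URLs that follow a different naming pattern would need different rules.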

Upvotes: 0
