user4412878

How can I parse a file name from a string using IndexOf and Substring?

private void ParseFilesNames()
{
    using (WebClient client = new WebClient())
    {
        try
        {
            for (int i = 0; i < 15; i++)
            {
                string urltoparse = "mysite.com/gallery/albums/from_old_gallery/" + i;
                string s = client.DownloadString(urltoparse);
                int index = -1;
                while (true)
                {
                    string firstTag = "HREF=";
                    string secondtag = ">";
                    index = s.IndexOf(firstTag, 0);
                    int endIndex = s.IndexOf(secondtag, index);
                    if (index < 0)
                    {
                        break;
                    }
                    else
                    {
                        string filename = s.Substring(index + firstTag.Length, endIndex - index - firstTag.Length);
                    }
                }
            }
        }
        catch (Exception err)
        {
        }
    }
}

The problem is with the Substring call. The arguments index + firstTag.Length and endIndex - index - firstTag.Length are wrong.

What I need to get is the string between: HREF=" and ">

The whole string looks like: HREF="myimage.jpg"> and I need to get only "myimage.jpg".

Sometimes it can be "myimage465454.jpg", so in every case I need to get only the file name, e.g. only "myimage465454.jpg".

What should I change in the Substring call?

Upvotes: 1

Views: 737

Answers (2)

Alberto Montellano

Reputation: 6286

If you are sure that your string will always have the form <HREF="yourpath">, just apply the following:

// Inner quotes must be escaped, and Replace needs both the old and the new value
string yourInitialString = "<HREF=\"myimage.jpg\">";
string parsedString = yourInitialString.Replace("<HREF=\"", "").Replace("\">", "");
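
If you would rather stay with IndexOf and Substring as in the question, here is a minimal sketch of that approach. It assumes every value is wrapped in double quotes right after HREF= and that the page may contain several links. The HrefParser class and ExtractHrefValues method are names made up for this example. The important points are to include the opening quote in the start tag, to look for the closing quote rather than ">", and to continue each search from the end of the previous match so the loop terminates.

```csharp
using System;
using System.Collections.Generic;

public static class HrefParser
{
    // Hypothetical helper: returns every value found between HREF=" and the next double quote.
    public static List<string> ExtractHrefValues(string html)
    {
        const string startTag = "HREF=\"";   // includes the opening quote
        var results = new List<string>();
        int searchFrom = 0;

        while (true)
        {
            // Search from the end of the previous match, ignoring case (HREF or href).
            int start = html.IndexOf(startTag, searchFrom, StringComparison.OrdinalIgnoreCase);
            if (start < 0)
            {
                break; // no more links
            }

            int valueStart = start + startTag.Length;
            int valueEnd = html.IndexOf('"', valueStart); // closing quote
            if (valueEnd < 0)
            {
                break; // malformed tag with no closing quote
            }

            // Substring takes (startIndex, length), so the length is end minus start.
            results.Add(html.Substring(valueStart, valueEnd - valueStart));
            searchFrom = valueEnd + 1;
        }

        return results;
    }
}
```

Calling HrefParser.ExtractHrefValues(s) on the downloaded page text gives a list of the quoted values without the surrounding quotes.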

If you need to parse href values from HTML links, the best option is the HtmlAgilityPack library.

Solution with Html Agility Pack:

HtmlWeb htmlWeb = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = htmlWeb.Load(Url);

// SelectNodes returns null when no node matches, so guard before iterating
HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        // Get the value of the HREF attribute
        string hrefValue = link.GetAttributeValue("href", string.Empty);
    }
}

To install HtmlAgilityPack, run the following command in the Package Manager Console:

 PM> Install-Package HtmlAgilityPack

Hope it helps.

Upvotes: 3

Titi Kokov

Reputation: 127

Try this:

string filename = input.Split('=')[1].Replace("\"", "").Replace(">", "");

Upvotes: 0
