Reputation: 63
This is my code so far:
foreach (var listBoxItem in listBox_google_urls.Items)
{
    var document = new HtmlWeb().Load(listBoxItem.ToString());
    var files = document.DocumentNode.Descendants("a").Select(a => a.GetAttributeValue("href", ".mp3")).Where(h => h.Contains(".mp3")).ToArray();
    listbox_urls.Items.AddRange(files);
}
And this is where listBox_google_urls.Items comes from:
web_search.Navigate("https://www.google.com/search?q=" + val + "+(mp3|wav|ac3|ogg|flac|wma|m4a) -inurl:(jsp|pl|php|html|aspx|htm|cf|shtml) intitle:index.of -inurl:(listen77|mp3raid|mp3toss|mp3drug|index_of|wallywashis)");
var search_results = this.web_search.Document.Links.Cast<HtmlElement>().Select(a => a.GetAttribute("href")).Where(h => h.Contains("http://")).ToArray();
listBox_google_urls.Items.AddRange(search_results);
listBoxItem.ToString() output example
The problem is that this method works, but it only scrapes the titles of the links. Is there a way I can fix it? Thanks in advance.
Upvotes: 1
Views: 125
Reputation: 507
Your code looks good. I'm just not sure why you are defaulting to ".mp3" and then returning everything that contains ".mp3". You are going to end up with a collection of valid .mp3 URLs plus a whole bunch of bare ".mp3" strings. I just hooked into a random Google search page and looked for all URLs with the word "mail" in the href attribute. Here are the results:
Hope this answers your question. If you can give me some more info, maybe I could help a little more
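To illustrate the point about the ".mp3" default, here is a minimal sketch. It assumes the HtmlAgilityPack package is referenced; the class name HrefDefaultDemo and the method Mp3Hrefs are made up for this example. Passing string.Empty as the default means an anchor without an href never makes it past the Contains filter.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

static class HrefDefaultDemo
{
    // Hypothetical helper: returns the href values that contain ".mp3".
    // Anchors with no href attribute yield the empty default and are dropped
    // by the Where clause instead of showing up as a bare ".mp3" string.
    public static string[] Mp3Hrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.Descendants("a")
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .Where(h => h.Contains(".mp3"))
            .ToArray();
    }
}
```

With the original default of ".mp3", the first anchor in an input such as `<a name='top'>x</a><a href='song.mp3'>s</a>` would contribute an extra ".mp3" entry. With string.Empty, only "song.mp3" comes back.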
Try this:
var document = new HtmlWeb().Load("http://s1.mymrmusic2.com/hmusic/Album/Foreign%20Albums/VA%20-%20Billboard%20Hot%20100%20(02%20April%202016)/VA%20-%20Billboard%20Hot%20100%20(02%20April%202016)%20%5B320%5D/");
// Keep both the link target and its visible text for every anchor that points at an .mp3 file.
var files = document.DocumentNode.Descendants("a")
    .Select(a => new
    {
        Link = a.GetAttributeValue("href", string.Empty),
        // InnerText is used rather than FirstChild.InnerText so an empty anchor does not throw.
        Text = a.InnerText
    })
    .Where(x => x.Link.Contains(".mp3"))
    .ToList();
Or maybe try this option, which prefixes each href with the URL of the page it was found on:
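foreach (var listBoxItem in listBox_google_urls.Items)
{
    var pageUrl = listBoxItem.ToString();
    var document = new HtmlWeb().Load(pageUrl);
    var files = document.DocumentNode.Descendants("a")
        // Default to an empty string so anchors without an href are filtered out below.
        .Select(a => a.GetAttributeValue("href", string.Empty))
        .Where(h => h.Contains(".mp3"))
        // Prefix the page URL, on the assumption that the hrefs are relative to it.
        .Select(h => pageUrl + h).ToArray();
    listbox_urls.Items.AddRange(files);
}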
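The string concatenation in the loop above only works when every href is relative to the page URL and that URL ends with a slash. As a hedged alternative, System.Uri can do the combining. This sketch uses only the standard library; LinkResolver and Resolve are names invented for the example, not part of HtmlAgilityPack.

```csharp
using System;

static class LinkResolver
{
    // Hypothetical helper: resolves an href found on a page against that page's URL.
    // A relative href is combined with the base URL; an absolute href is
    // returned as-is, because Uri ignores the base in that case.
    public static string Resolve(string pageUrl, string href)
    {
        var baseUri = new Uri(pageUrl, UriKind.Absolute);
        return new Uri(baseUri, href).AbsoluteUri;
    }
}
```

In the loop, this would replace the concatenation with something like `.Select(h => LinkResolver.Resolve(pageUrl, h))`. For example, resolving "song.mp3" against "http://example.com/music/" gives "http://example.com/music/song.mp3", while an href that is already a full URL is left unchanged.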
Upvotes: 1