Reputation: 861
How do you search a website's source code with C#? It's hard to explain, so here's the source for doing it in Python:
import urllib2, re
word = "How to ask"
source = urllib2.urlopen("http://stackoverflow.com").read()
if re.search(word,source):
    print "Found it "+word
Upvotes: 4
Views: 15418
Reputation: 15916
If you want to access the raw HTML of a web page, you need to request the page, read the response stream into a string, and then search that string.
So code something like:
// Requires: using System.IO; using System.Net;
string pageContent = null;
HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create("http://example.com/page.html");
// Dispose the response as well as the reader so the connection is released.
using (HttpWebResponse myRes = (HttpWebResponse)myReq.GetResponse())
using (StreamReader sr = new StreamReader(myRes.GetResponseStream()))
{
    pageContent = sr.ReadToEnd();
}
if (pageContent.Contains("YourSearchWord"))
{
    // Found it
}
Upvotes: 8
Reputation: 6450
I guess this is as close as you'll get in C# to your Python code.
using System;
using System.Net;

class Program
{
    static void Main()
    {
        string word = "How to ask";
        string source = (new WebClient()).DownloadString("http://stackoverflow.com/");
        if (source.Contains(word))
            Console.WriteLine("Found it " + word);
    }
}
re.search is case sensitive unless you pass re.IGNORECASE, and so is Contains. If you want a case-insensitive match, you could use
    if (source.IndexOf(word, StringComparison.InvariantCultureIgnoreCase) > -1)
instead.
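One difference worth keeping in mind: the Python snippet passes the word to re.search as a regular expression, while Contains and IndexOf match literal text. The following is a minimal sketch of a regex-based equivalent, assuming the page has already been downloaded into a string. The PageSearch class and its method names are made up for illustration and are not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

static class PageSearch
{
    // Treats `pattern` as a regular expression, as Python's re.search does.
    public static bool MatchesPattern(string source, string pattern, bool ignoreCase = false)
    {
        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
        return Regex.IsMatch(source, pattern, options);
    }

    // Treats `word` as literal text by escaping regex metacharacters first.
    public static bool ContainsLiteral(string source, string word, bool ignoreCase = false)
    {
        return MatchesPattern(source, Regex.Escape(word), ignoreCase);
    }
}

class Program
{
    static void Main()
    {
        // Stand-in for downloaded HTML (hypothetical sample, not real page content).
        string source = "<a href=\"/help\">How to Ask</a>";

        Console.WriteLine(PageSearch.MatchesPattern(source, "How to ask"));        // False
        Console.WriteLine(PageSearch.MatchesPattern(source, "How to ask", true));  // True
    }
}
```

ContainsLiteral is the safer choice when the search text comes from user input, since characters such as "." or "?" would otherwise be interpreted as regex syntax.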
Upvotes: 2
Reputation: 48088
Here is the code for getting the HTML source of a page; you can add your search method later:
// Requires: using System.IO; using System.Net;
string url = "http://someurl.com/default.aspx";
string source;
WebRequest webRequest = WebRequest.Create(url);
using (WebResponse response = webRequest.GetResponse())
using (Stream str = response.GetResponseStream())
using (StreamReader reader = new StreamReader(str))
{
    source = reader.ReadToEnd();
}
Hope this helps.
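For the search step left open above, a rough sketch might look like the following. It stays with the same WebRequest API, and the PageFetcher class with its ReadAll, GetSource and SourceContains helpers is hypothetical, named here only for illustration. It assumes a case-insensitive literal match is what you want.

```csharp
using System;
using System.IO;
using System.Net;

static class PageFetcher
{
    // Reads the whole stream as text; disposing the reader also closes the stream.
    public static string ReadAll(Stream stream)
    {
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    // Downloads the page at `url` and returns its HTML source.
    public static string GetSource(string url)
    {
        WebRequest request = WebRequest.Create(url);
        using (WebResponse response = request.GetResponse())
        {
            return ReadAll(response.GetResponseStream());
        }
    }

    // Literal, case-insensitive search of the downloaded source.
    public static bool SourceContains(string source, string word)
    {
        return source.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}

class Program
{
    static void Main()
    {
        string word = "How to ask";
        string source = PageFetcher.GetSource("http://someurl.com/default.aspx");
        if (PageFetcher.SourceContains(source, word))
            Console.WriteLine("Found it " + word);
    }
}
```

Keeping the download and the search in separate methods means the search can be tried against any string or stream without touching the network.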
Upvotes: 0