Reputation: 1304
I need to create newsletters from a URL. To do that, I use WebClient.DownloadData to get the page source as a byte array.
However, I am having trouble with paths. All the elements' sources are relative (/img/welcome.png), but I need absolute ones, like http://www.example.com/img/welcome.png.
How can I do this?
Upvotes: 7
Views: 4813
Reputation: 1304
One possible way to solve this is to use the HtmlAgilityPack library.
An example that fixes the links:
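// Requires: using System.Net; using System.Text; using HtmlAgilityPack;
WebClient client = new WebClient();
byte[] requestHTML = client.DownloadData(sourceUrl);
string sourceHTML = Encoding.UTF8.GetString(requestHTML);

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(sourceHTML);

// SelectNodes returns null when nothing matches, so guard before iterating.
HtmlNodeCollection links = htmlDoc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        HtmlAttribute att = link.Attributes["href"];
        if (!string.IsNullOrEmpty(att.Value))
        {
            // AbsoluteUrlByRelative is a helper on this class (not shown here).
            att.Value = this.AbsoluteUrlByRelative(att.Value);
        }
    }
}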
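The answer does not show the body of AbsoluteUrlByRelative. As a rough sketch of what such a helper has to do, here is the same idea in JavaScript (the other language used in this thread), relying on the standard URL constructor. The function name mirrors the one in the answer, but the body, the extra pageUrl parameter, and the example URLs are assumptions, not the answerer's code.

```javascript
// Hypothetical helper: resolves an attribute value against the URL of the
// page it was downloaded from. Not the implementation used in the answer.
function absoluteUrlByRelative(value, pageUrl) {
  try {
    // new URL(input, base) resolves a relative input against base;
    // an input that is already absolute ignores the base.
    return new URL(value, pageUrl).href;
  } catch (err) {
    // Values (or bases) that cannot be parsed are left untouched.
    return value;
  }
}

const page = "http://www.example.com/news/index.html"; // assumed page URL
console.log(absoluteUrlByRelative("/img/welcome.png", page));
// http://www.example.com/img/welcome.png
console.log(absoluteUrlByRelative("http://cdn.example.org/logo.png", page));
// http://cdn.example.org/logo.png
```

In the C# code the equivalent call would presumably go through System.Uri, as some of the other answers in this thread do.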
Upvotes: 6
Reputation: 11
Instead of resolving each relative path yourself, you can add a base element whose href attribute is set to the base URI of the original page.
Placed as the first child of the head element, it should make the browser resolve all subsequent relative paths against the original location, not against wherever the document (the newsletter) is opened from.
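A minimal sketch of that idea, working on the downloaded HTML as a string. The function name insertBaseElement, the sample markup, and the example URL are made up for illustration, and matching the head tag with a regular expression is a shortcut; a real HTML parser would be more robust.

```javascript
// Hypothetical helper: inserts <base href="..."> as the first child of <head>.
function insertBaseElement(html, baseUrl) {
  // Escape the characters that would break a double-quoted attribute value.
  const escaped = baseUrl.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  const baseTag = '<base href="' + escaped + '">';

  // Matches <head> or <head ...attributes...>, but not <header>.
  const headOpen = /<head(\s[^>]*)?>/i;
  if (!headOpen.test(html)) {
    return html; // no head element found: leave the document unchanged
  }
  // A replacer function avoids special handling of "$" in the inserted text.
  return html.replace(headOpen, (match) => match + baseTag);
}

const page = '<html><head><title>News</title></head>' +
             '<body><img src="/img/welcome.png"></body></html>';
console.log(insertBaseElement(page, "http://www.example.com/"));
```

Whether a given mail client honors the base element still has to be checked per client, which is the compatibility question this answer raises.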
In Firefox, reading each src/href property and assigning the value straight back (a tautology, in the formal-logic sense) results in complete paths being written into the document itself, so the serialized HTML can then be scripted, saved, and so on:
var d = document;
var n = d.querySelectorAll('[src]'); // do the same for [href] ...
var op = "";
for (var i = 0; i < n.length; i++) {
    // The src property reads back as an absolute URL; assigning it
    // writes that absolute URL into the attribute.
    op += n[i].src + "\n";
    n[i].src = n[i].src;
}
alert(op);
Note that url() references are not handled by any of the solutions above. This applies to url() values in style elements (for background images or content rules, for example) as well as in style attributes on individual nodes.
For that reason, getting the base element approach to a validated, tested state (with a compatibility list) seems the more promising route to me.
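If the url() references do have to be rewritten rather than left to a base element, one possible sketch is a regular-expression pass over the CSS text. The function name absolutizeCssUrls and the example URLs are assumptions, and the pattern only covers simple url(...) forms (no escaped quotes or parentheses inside the value).

```javascript
// Hypothetical helper: rewrites url(...) references in CSS text to absolute URLs.
function absolutizeCssUrls(css, baseUrl) {
  // Group 1: optional quote, group 2: the reference, \1: the same closing quote.
  const urlFunc = /url\(\s*(['"]?)([^'")]+)\1\s*\)/gi;
  return css.replace(urlFunc, (match, quote, ref) => {
    try {
      return "url(" + quote + new URL(ref.trim(), baseUrl).href + quote + ")";
    } catch (err) {
      return match; // leave anything that cannot be parsed as it was
    }
  });
}

const base = "http://www.example.com/news/index.html"; // assumed page URL
console.log(absolutizeCssUrls('body { background: url("/img/bg.png"); }', base));
// body { background: url("http://www.example.com/img/bg.png"); }
```

The same function could be applied to the contents of style elements and to style attribute values once they have been extracted from the document.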
Upvotes: 0
Reputation: 5959
Just use this function:
' Converts a relative URL to an absolute URI
Function RelativeToAbsoluteUrl(ByVal baseURI As Uri, ByVal RelativeUrl As String) As Uri
    ' Parse the URL, which may be relative or absolute
    Dim uriReturn As Uri = New Uri(RelativeUrl, UriKind.RelativeOrAbsolute)
    ' Make it absolute if it's relative
    If Not uriReturn.IsAbsoluteUri Then
        uriReturn = New Uri(baseURI, uriReturn)
    End If
    Return uriReturn
End Function
Upvotes: 0
Reputation: 12221
If the request comes in from your own site (same-domain links), you can use this:
new Uri(Request.Url, "/img/welcome.png").ToString();
If you're in a non-web app, or you want to hardcode the domain name (the base has to be a Uri, not a string):
new Uri(new Uri("http://www.mysite.com"), "/img/welcome.png").ToString();
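One thing to watch with a hardcoded domain as the base: it only gives the right result for root-relative paths such as /img/welcome.png. Document-relative references need the full URL of the page as the base. The sketch below shows the difference using JavaScript's URL constructor; the page URL is invented, and System.Uri is expected to follow the same RFC 3986 resolution rules, though that is worth verifying rather than assuming.

```javascript
// Assumed example URLs, for illustration only.
const pageUrl = "http://www.example.com/news/2013/issue.html";
const siteRoot = "http://www.example.com";

const resolve = (ref, base) => new URL(ref, base).href;

// Root-relative: the site root is enough as a base.
console.log(resolve("/img/welcome.png", siteRoot));
// http://www.example.com/img/welcome.png

// Document-relative: resolved against the directory of the page.
console.log(resolve("img/welcome.png", pageUrl));
// http://www.example.com/news/2013/img/welcome.png
console.log(resolve("../img/welcome.png", pageUrl));
// http://www.example.com/news/img/welcome.png

// The same reference against the site root points somewhere else.
console.log(resolve("img/welcome.png", siteRoot));
// http://www.example.com/img/welcome.png
```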
Upvotes: 2
Reputation: 18780
You have some options:
Console.Write(ControlChars.Cr + "Please enter a Url(for example, http://www.msn.com): ")
Dim remoteUrl As String = Console.ReadLine()
Dim myWebClient As New WebClient()
Console.WriteLine("Downloading " + remoteUrl)
Dim myDataBuffer As Byte() = myWebClient.DownloadData(remoteUrl)
Dim download As String = Encoding.ASCII.GetString(myDataBuffer)
' String.Replace returns a new string, so the result has to be assigned back.
download = download.Replace("src=""/", "src=""" & remoteUrl & "/")
download = download.Replace("href=""/", "href=""" & remoteUrl & "/")
Console.WriteLine(download)
Console.WriteLine("Download successful.")
This is super contrived, and most of it is taken directly from http://msdn.microsoft.com/en-us/library/xz398a3f.aspx, but it illustrates the basic principle behind method 1.
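The plain string replacement above only catches double-quoted, root-relative values. A slightly more general version of the same search-and-replace idea is sketched below in JavaScript. The function name absolutizeAttributes and the sample markup are assumptions, and running a regular expression over HTML remains fragile compared with a real parser.

```javascript
// Hypothetical helper: rewrites quoted src/href attribute values to absolute URLs.
function absolutizeAttributes(html, pageUrl) {
  // Group 1: leading whitespace (so data-src is skipped), group 2: attribute name,
  // group 3: opening quote, group 4: value, \3: the matching closing quote.
  const attr = /(\s)(src|href)\s*=\s*(["'])(.*?)\3/gi;
  return html.replace(attr, (match, space, name, quote, value) => {
    try {
      return space + name + "=" + quote + new URL(value, pageUrl).href + quote;
    } catch (err) {
      return match; // leave values that cannot be parsed untouched
    }
  });
}

const page = "http://www.example.com/letters/index.html"; // assumed page URL
const html = '<a href="/news">News</a><img src=\'img/welcome.png\'>';
console.log(absolutizeAttributes(html, page));
// <a href="http://www.example.com/news">News</a><img src='http://www.example.com/letters/img/welcome.png'>
```

Unquoted attribute values are not handled by this pattern and would pass through unchanged.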
Upvotes: 0