Reputation: 533
I've got the following three lines of code, with html being an HTML page stored as a string.
int startIndex = html.IndexOf("<title>") + 8; // <title> plus a space equals 8 characters
int endIndex = html.IndexOf("</title>") - 18; // -18 is because of the input, there are 18 extra characters after the username.
result = new Tuple<string, bool>(html.Substring(startIndex, endIndex), false);
With the input <title>Username012345678912141618</title> I would expect an output of Username. However, the code can't find the </title>. I'm not sure what's going wrong. Does anyone know what could cause this behaviour?
I've tested it with three different webpages (all from the same site), of which I inspected the content.
Upvotes: 1
Views: 137
Reputation: 4848
I realize the OP was asking about the IndexOf method, but here is a solution that uses a different approach: regular expressions, which are well suited to "surgically" extracting data from strings.
The following pattern is all that is needed to extract the username from the title tag, assuming the username is followed by exactly 18 extra characters as described in the question:
var pattern = @"<title>\s*(.+?).{18}</title>";
This pattern would be used as follows:
var pattern = @"<title>\s*(.+?).{18}</title>";
var ms = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
// Groups[1] holds the username only if the match succeeded.
var userName = ms.Success ? ms.Groups[1].Value : string.Empty;
One advantage of Regex is that the pattern can contain the exact text that surrounds the data you need. There is no need to fumble around with adding or subtracting "places" from an index.
You will need to add:
using System.Text.RegularExpressions;
at the top of the file in which you intend to use Regex.
Upvotes: 0
Reputation: 141565
String.Substring with two parameters has the signature String.Substring(int startIndex, int length), where the second parameter is the number of characters in the substring, not an end index. So you need to do something like this (taking into account your comment):
int startIndex = html.IndexOf("<title>") + 8;
int endIndex = html.IndexOf("</title>");
var result = new Tuple<string, bool>(html.Substring(startIndex, endIndex - startIndex - 18), false);
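For anyone who wants to try this outside the original program, here is a minimal self-contained sketch of the same length calculation. It assumes, as the question's comments state, that the title holds a space, then the username, then 18 trailing characters. The names TitleDemo and ExtractUserName are made up for illustration, and the guards against a missing tag are an addition that the code above does not have.

```csharp
using System;

public static class TitleDemo
{
    // Hypothetical helper: returns the text between "<title> " and the
    // 18 trailing characters that precede "</title>".
    public static string ExtractUserName(string html)
    {
        const string openTag = "<title>";
        const string closeTag = "</title>";
        const int trailingChars = 18; // assumption taken from the question

        int open = html.IndexOf(openTag, StringComparison.OrdinalIgnoreCase);
        if (open < 0) return string.Empty;

        // Skip the opening tag plus the single space after it.
        int startIndex = open + openTag.Length + 1;
        if (startIndex > html.Length) return string.Empty;

        int endIndex = html.IndexOf(closeTag, startIndex, StringComparison.OrdinalIgnoreCase);
        if (endIndex < 0) return string.Empty;

        // Substring takes a length, not an end index.
        int length = endIndex - startIndex - trailingChars;
        return length > 0 ? html.Substring(startIndex, length) : string.Empty;
    }
}
```

Calling TitleDemo.ExtractUserName on a page containing <title> Username012345678912141618</title> should return Username. Passing an end index as the second argument instead asks Substring for that many characters, so the result either runs well past the closing tag or, if startIndex plus the count exceeds the string length, throws an ArgumentOutOfRangeException.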
Upvotes: 3