Reputation: 153
I have the code:
using System;
using System.Collections.Generic;
using System.IO;

class Program {
    static void Main(string[] args) {
        const string f = "../../../input.txt";
        List<string> lines = new List<string>();
        using (StreamReader r = new StreamReader(f)) {
            string line;
            while ((line = r.ReadLine()) != null) {
                if (line.StartsWith(" <job_number") && line.EndsWith(">")) {
                    lines.Add(line);
                }
            }
        }
        foreach (string s in lines) {
            Console.WriteLine(s);
        }
        Console.Read();
    }
}
Inside the while loop, I check for lines that start with one string and end with another. This is what those lines look like:
<job_number "1234" />
<job_number "1829" />
How do I extract the numbers from inside the quotes? At the moment the console prints out the whole line:
<job_number "1234" />
<job_number "1829" />
I want:
1234
1829
I've looked into Regex but it confuses me greatly.
Edit: I need to add that the file I am parsing is a system configuration file which contains a lot of other data. I have managed to create a list called lines that holds exactly the lines I need. I now need to format the entries in this list to get just the values (everything inside the quotes).
Upvotes: 2
Views: 213
Reputation: 12196
Using IndexOf and Substring will get the job done. It is fast, light on memory, and fairly simple.
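if (line.StartsWith(" <job_number") && line.EndsWith(">")) {
    // Position just after the opening quote, and position of the closing quote
    int start = line.IndexOf("\"") + 1;
    int end = line.IndexOf("\"", start);
    if (start > 0 && end > 0)
    {
        string numberAsString = line.Substring(start, end - start);
        int number;
        // Only keep the value if it really is a number
        if (int.TryParse(numberAsString, out number))
        {
            // lines is a List<string>; use a List<int> if you want to store number itself
            lines.Add(numberAsString);
            //Console.WriteLine(number);
        }
    }
}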
Upvotes: 0
Reputation: 11395
In your case, a simple regex match with \d+ will do the job.
//...
while ((line = r.ReadLine()) != null)
{
    var re = Regex.Match(line, @"(\d+)");
    if (re.Success)
    {
        var val = re.Groups[1].Value;
        lines.Add(val);
    }
}
//...
EDIT:
You can of course change the regex for your exact needs, for example:
var re = Regex.Match(line, "job_number\\s\"(\\d+)\"");
might be more appropriate if your file contains other numbers as well.
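To show why the stricter pattern matters, here is a minimal sketch that runs it over a few sample lines. The JobNumberParser class, its Extract method, and the <port "8080" /> line are made up for illustration and are not part of the original question; the pattern uses \s+ instead of \s on the assumption that more than one whitespace character may separate the name from the value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class JobNumberParser
{
    // Matches job_number, whitespace, then a quoted run of digits (captured in group 1).
    private static readonly Regex JobNumberPattern =
        new Regex("job_number\\s+\"(\\d+)\"");

    public static List<string> Extract(IEnumerable<string> lines)
    {
        var result = new List<string>();
        foreach (string line in lines)
        {
            Match m = JobNumberPattern.Match(line);
            if (m.Success)
            {
                result.Add(m.Groups[1].Value);
            }
        }
        return result;
    }
}

public static class Demo
{
    public static void Main()
    {
        // Sample input; the port line is an assumed example of "other numbers" in the file.
        var sample = new[]
        {
            " <job_number \"1234\" />",
            " <port \"8080\" />",
            " <job_number \"1829\" />"
        };

        foreach (string number in JobNumberParser.Extract(sample))
        {
            Console.WriteLine(number);
        }
    }
}
```

With the plain (\d+) pattern the port line would also produce a value, whereas the anchored pattern skips it.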
Upvotes: 3
Reputation: 13676
If the format of your string never varies, you can do it in one line with a simple Split call:
string value = input.Split('"')[1];
For example:
string[] s =
{
    @"<job_number ""1234"" />",
    @"<job_number ""1829"" />"
};
// Splitting on the quote character leaves the value at index 1
for (int i = 0; i < s.Length; i++) Console.Write(s[i].Split('"')[1] + ", ");
Output: 1234, 1829, (note the trailing separator)
Upvotes: 0
Reputation: 55581
The file you are parsing is practically XML. Why not go all the way and standardize the format to be XML compliant?
<Jobs>
<Job Number="1234" />
<Job Number="1235" />
</Jobs>
Then you could simply use LINQ to XML (namespaces System.Linq and System.Xml.Linq) to grab all the Job elements and enumerate their Number attribute.
XDocument doc = XDocument.Load("XMLFile1.xml");
var numbers = from t in doc.Descendants("Job")
              select t.Attribute("Number").Value;
foreach (var number in numbers)
{
    Console.WriteLine(number);
}
Upvotes: 0
Reputation: 11228
If you are keen on LINQ:
var str = @"<job_number ""1234"" />";
var num = new string(str.Where(c => Char.IsDigit(c)).ToArray());
Console.WriteLine(num); // 1234
Upvotes: 3