Reputation: 51
Take IP Addresses only from log file and save to File, Table, or .CSV
I have a log file with entries like so:
2010-09-13 00:00:01 69.143.116.98 - W3SVC2 STREAM 209.22.66.152 80 GET /p7pm/p7popmenu.js - 200 0 7700 379 188 .org Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+WOW64;+GoogleT5;+SLCC1;+.NET+CLR+2.0.50727;+Media+Center+PC+5.0;+.NET+CLR+3.5.30729;+.NET+CLR+3.0.30618;+.NET4.0C) - .org/
Mozilla/5.0+(compatible;+Yahoo!+Slurp/3.0;+.com) - waste.html
2010-09-13 08:52:15 67.195.112.157 - W3SVC2 STREAM 209.22.66.152 80 GET /includes/Center_nav_p4.css - 304 0 164 482 0 HTTP/1.0 LOL.org Mozilla/5.0+(compatible;+Yahoo!+Slurp/3.0;+.com) - waste.html
What I am wondering is the best way to extract the IP address of each log entry and save it as a line or row in a database. I would probably load them into a List first and then write them to a db, CSV, or text file containing just the IP addresses.
Something like this:
"69.143.116.98"
"65.37.53.228"
"169.123.16.100"
"169.123.16.12"
"169.123.16.9"
"169.123.6.89"
It looks like the IP address begins at the 21st character of each line, so I was thinking I could start there, but then I need to figure out how to get the rest of the IP. Maybe something like start at the 21st character and then grab everything until I hit a space?
After I grab them all I will then count and sort them and save them to the final format.
Am I on the right path? Thanks.
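One way to sketch the "grab until I hit a space" idea is to split each line on spaces and take the third field, rather than counting characters. This is only a rough sketch: it assumes every entry follows the layout of the samples above (date, time, client IP, ...), and the names ClientIpExtractor, ExtractClientIp, and ReadClientIps are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

public static class ClientIpExtractor
{
    // Assumes the sample layout: field 0 = date, field 1 = time, field 2 = client IP.
    // Returns null for blank lines, "#" header lines, and lines whose third field is not an IP.
    public static string ExtractClientIp(string line)
    {
        if (string.IsNullOrWhiteSpace(line) || line[0] == '#') return null;

        string[] fields = line.Split(' ');
        if (fields.Length < 3) return null;

        IPAddress parsed;
        return IPAddress.TryParse(fields[2], out parsed) ? fields[2] : null;
    }

    // Reads the file line by line and collects the client IP of every usable entry.
    public static List<string> ReadClientIps(string path)
    {
        var ips = new List<string>();
        foreach (string line in File.ReadLines(path))
        {
            string ip = ExtractClientIp(line);
            if (ip != null) ips.Add(ip);
        }
        return ips;
    }
}
```

Splitting avoids depending on the exact character offset, and the IPAddress.TryParse check skips stray fragments that do not start with a date and time.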
Apparently I didn't get the whole task in here; it appears it is going to be a bit harder. There is a TON of sorting involved; I imagine the first part is grabbing this data and putting it into some sort of table, then doing all this sorting, and finally writing out the count and IP to CSV.
I need to parse this log file and here is what needs to happen; it is crazy:
1.) The code will count the number of requests made by the IP addresses contained in the log file.
2.) The code will only count GET requests made over the standard port used for HTTP and should exclude from the count all requests made from IPs beginning with '207.114'.
3.) The final CSV file should be ordered so that IPs that made the most requests are listed first.
4.) IPs that made the same number of requests should be ordered amongst themselves with the IP octets of greater values listed first.
5.) The first column should contain the number of requests and the second will contain the IP address that made them.
SomeFromLog.csv - Example based on data below:
8, "69.143.116.98"
3, "65.37.53.228"
1, "169.123.16.100"
1, "169.123.16.12"
1, "169.123.16.9"
1, "169.123.6.89"
Upvotes: 1
Views: 1502
Reputation: 3012
You can install Tx.Windows from NuGet: https://www.nuget.org/packages/Tx.Windows
PM> Install-Package Tx.Windows
And then use it like this:
var iisLog = W3CEnumerable.FromFile(pathToLog);
List<string> IpsLog = new List<string>();
foreach (var item in iisLog)
{
    IpsLog.Add(item.c_ip);
}
If the log file is used by another process, you can use W3CEnumerable.FromStream
Upvotes: 2
Reputation: 49
Add the System.Text.RegularExpressions namespace, then use a regular expression:
// Each octet is 0-255; all groups are non-capturing, and \b on both ends avoids partial matches.
string pattern = @"\b(?:(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\.){3}(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\b";
Regex r = new Regex(pattern);
string input = File.ReadAllText(path);
MatchCollection matches = r.Matches(input);
foreach (Match match in matches)
    Console.WriteLine(match.Value);
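This way you will get all the IP addresses in a MatchCollection. Note that this matches every IP address in the file, so for entries like the ones in the question it returns the server IP (209.22.66.152) as well as the client IP. You can use regexr to check regular expressions: http://regexr.com/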
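Since each entry in the question's log contains two addresses (client first, then server), one possible variation is to run the pattern per line and keep only the first match. This is a sketch under that assumption; FirstIpPerLine and FirstIp are illustrative names, and the pattern is an equivalent 0-255 octet pattern with a word boundary on both ends.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstIpPerLine
{
    // Matches a dotted quad whose octets are each in the range 0-255.
    private static readonly Regex Ip = new Regex(
        @"\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b");

    // Returns the first IP found in the line (the client IP in the sample layout), or null.
    public static string FirstIp(string line)
    {
        Match m = Ip.Match(line);
        return m.Success ? m.Value : null;
    }
}
```

Calling FirstIpPerLine.FirstIp on each line from File.ReadLines(path) would then give one address per entry instead of two.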
Upvotes: 0
Reputation: 7973
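// Requires: using System.IO; using System.Text.RegularExpressions;
string pattern = @"\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b";
string line;
using (StreamReader sr = new StreamReader("path/to/file")) {
    while ((line = sr.ReadLine()) != null) {
        // No ^...$ anchors: the IP sits in the middle of each log line.
        MatchCollection matches = Regex.Matches(line, pattern);
        if (matches.Count == 0) continue;
        foreach (Match match in matches) {
            // Do your stuff here, e.g. read match.Value
        }
    }
}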
Using this regex you will match only valid IPs, and if a line contains nothing to match, the if statement skips to the next iteration of the while loop.
Upvotes: 0