MLBdev

Reputation: 51

Take IP Addresses only from log file and save to File, Table, or .CSV

I have a log file with entries like so:

2010-09-13 00:00:01 69.143.116.98 - W3SVC2 STREAM 209.22.66.152 80 GET /p7pm/p7popmenu.js - 200 0 7700 379 188 .org Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.0;+WOW64;+GoogleT5;+SLCC1;+.NET+CLR+2.0.50727;+Media+Center+PC+5.0;+.NET+CLR+3.5.30729;+.NET+CLR+3.0.30618;+.NET4.0C) - .org/
Mozilla/5.0+(compatible;+Yahoo!+Slurp/3.0;+.com) - waste.html
2010-09-13 08:52:15 67.195.112.157 - W3SVC2 STREAM 209.22.66.152 80 GET /includes/Center_nav_p4.css - 304 0 164 482 0 HTTP/1.0 LOL.org Mozilla/5.0+(compatible;+Yahoo!+Slurp/3.0;+.com) - waste.html

What I am trying to work out is the best way to extract the IP address from each log entry and save it as a line or row in a database. I would probably load the addresses into a List first and then write them to a database, CSV, or text file containing just the IP addresses.

Something like this:

"69.143.116.98" 
"65.37.53.228" 
"169.123.16.100" 
"169.123.16.12" 
"169.123.16.9" 
"169.123.6.89" 

It looks like the IP address begins at the 21st character of each line, so I was thinking I could start there, but then I need to figure out how to get the rest of the IP. Maybe start at the 21st character and grab everything until I hit a space?
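The "start at the 21st character and read up to the next space" idea can be sketched roughly as follows. This is only a minimal illustration: the class and method names are hypothetical, and it assumes every entry begins with a 20-character "yyyy-MM-dd HH:mm:ss " prefix, as the complete sample entries do. A line without that prefix would yield a meaningless result.

```csharp
using System;

public static class ClientIpByPosition
{
    // Assumption: each entry starts with "yyyy-MM-dd HH:mm:ss " (20 characters),
    // so the client IP starts at zero-based index 20, i.e. the 21st character.
    private const int IpStart = 20;

    // Returns the text from the 21st character up to the next space,
    // or null when the line is too short to contain that field.
    public static string Extract(string line)
    {
        if (line == null || line.Length <= IpStart) return null;
        int end = line.IndexOf(' ', IpStart);
        return end < 0
            ? line.Substring(IpStart)
            : line.Substring(IpStart, end - IpStart);
    }
}
```

Splitting each line on spaces and taking the third field would be an equivalent approach that does not depend on the exact width of the timestamp.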

After I grab them all I will then count and sort them and save them to the final format.

Am I on the right path? Thanks.

Apparently I didn't describe the whole task here; it is going to be a bit harder. There is a TON of sorting involved. I imagine the first part is grabbing this data and putting it into some sort of table, then doing all the sorting, and finally writing out the count and IP to CSV.

I need to parse this log file, and here is what needs to happen; it is crazy:

1.) The code will count the number of requests made by the IP addresses contained in the log file.

2.) The code will only count GET requests made over the standard port used for HTTP and should exclude from the count all requests made from IPs beginning with '207.114'.

3.) The final CSV file should be ordered so that IPs that made the most requests are listed first.

4.) IPs that made the same number of requests should be ordered amongst themselves with the IP octets of greater values listed first.

5.) The first column should contain the number of requests and the second will contain the IP address that made them.

SomeFromLog.csv - Example based on data below:

8, "69.143.116.98"

3, "65.37.53.228"

1, "169.123.16.100"

1, "169.123.16.12"

1, "169.123.16.9"

1, "169.123.6.89"

Upvotes: 1

Views: 1502

Answers (3)

Yanga

Reputation: 3012

You can install Tx.Windows from NuGet: https://www.nuget.org/packages/Tx.Windows

PM > Install-Package Tx.Windows

Then use it like this:

        var iisLog = W3CEnumerable.FromFile(pathToLog);
        List<string> IpsLog = new List<string>();
        foreach (var item in iisLog)
        {
            IpsLog.Add(item.c_ip);
        }

If the log file is in use by another process, you can use W3CEnumerable.FromStream instead.

Upvotes: 2

Knight Fall

Reputation: 49

Add the namespace System.Text.RegularExpressions, then use a regular expression:

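        // Requires: using System; using System.IO; using System.Text.RegularExpressions;
        // Four dot-separated octets (0-255), bounded by \b so partial numbers are not matched.
        string pattern = @"\b(?:(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\.){3}(?:2(?:[0-4][0-9]|5[0-5])|[0-1]?[0-9]?[0-9])\b";
        Regex r = new Regex(pattern);
        string input = File.ReadAllText(path);
        MatchCollection matches = r.Matches(input);
        foreach (Match match in matches)
            Console.WriteLine(match.Value);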

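This gives you every IP address in the file as a MatchCollection. Note that it also picks up the server IP (209.22.66.152 in the sample) on each line, not just the client IP. You can use regexr to check regular expressions: http://regexr.com/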

Upvotes: 0

Tinwor

Reputation: 7973

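// Requires: using System.IO; using System.Text.RegularExpressions;
string line;
using (StreamReader sr = new StreamReader("path/to/file")) {
    while ((line = sr.ReadLine()) != null) {
        // No ^...$ anchors: the IP sits in the middle of the line, so \b boundaries are used instead.
        // Regex.Match returns the first match, which is the client IP in this log format.
        Match match = Regex.Match(line, @"\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b");
        // Regex.Match never returns null; check Success instead.
        if (!match.Success) continue;
        string ip = match.Value;
        // Do your stuff here
    }
}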

With this regex you will match only valid IPs, and if a line has nothing to match, the while loop moves on to the next line (per the if statement).
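For the full requirements in the question (count GET requests on port 80, skip clients starting with 207.114, sort by count and then by octet value, write count and quoted IP), a LINQ-based sketch might look like the one below. It is an illustration rather than a definitive implementation: the class and method names are hypothetical, and the field positions are assumptions inferred from the sample entries (space-separated, client IP third, port eighth, method ninth). The prefix check uses "207.114." with a trailing dot so that only the first two octets are compared.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IpRequestReport
{
    // Assumed zero-based field positions after splitting a line on spaces,
    // inferred from the sample entries: 2 = client IP, 7 = port, 8 = method.
    private const int ClientIpField = 2, PortField = 7, MethodField = 8;

    // Returns the four octets of a dotted IPv4 string, or null if it is not one.
    private static byte[] ParseOctets(string ip)
    {
        string[] parts = ip.Split('.');
        if (parts.Length != 4) return null;
        var octets = new byte[4];
        for (int i = 0; i < 4; i++)
            if (!byte.TryParse(parts[i], out octets[i])) return null;
        return octets;
    }

    // Builds the CSV rows: request count, then the quoted client IP.
    public static List<string> BuildCsvLines(IEnumerable<string> logLines)
    {
        return logLines
            .Select(line => line.Split(' '))
            // Keep only GET requests on port 80 from clients outside 207.114.*.*
            .Where(f => f.Length > MethodField
                        && f[MethodField] == "GET"
                        && f[PortField] == "80"
                        && !f[ClientIpField].StartsWith("207.114.", StringComparison.Ordinal))
            .Select(f => f[ClientIpField])
            // Drop anything in the client IP position that is not a dotted IPv4 address.
            .Where(ip => ParseOctets(ip) != null)
            .GroupBy(ip => ip)
            .Select(g => new { Ip = g.Key, Count = g.Count(), Octets = ParseOctets(g.Key) })
            // Most requests first; ties broken by numerically larger octets first.
            .OrderByDescending(x => x.Count)
            .ThenByDescending(x => x.Octets[0])
            .ThenByDescending(x => x.Octets[1])
            .ThenByDescending(x => x.Octets[2])
            .ThenByDescending(x => x.Octets[3])
            .Select(x => string.Format("{0}, \"{1}\"", x.Count, x.Ip))
            .ToList();
    }
}
```

One way to use it would be File.WriteAllLines("SomeFromLog.csv", IpRequestReport.BuildCsvLines(File.ReadLines(pathToLog))), where pathToLog is whatever path your log lives at. Comparing octets as numbers rather than strings is what puts 169.123.16.100 ahead of 169.123.16.12 and 169.123.16.9, as in the expected output.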

Upvotes: 0
