Reputation: 9189
I am reading a log file in Java. For each line, I check whether it contains an IP address. If it does, I want to add 1 to the count of how many times that IP address has appeared in the log file. How can I accomplish this in Java?
The code below successfully extracts the IP address from each line that contains one, but the logic for counting occurrences does not work.
void read(String fileName) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(fileName)));
int counter = 0;
ArrayList<IPHolder> ips = new ArrayList<IPHolder>();
try {
String line;
while ((line = br.readLine()) != null) {
if(!getIP(line).equals("0.0.0.0")){
if(ips.size()==0){
IPHolder newIP = new IPHolder();
newIP.setIp(getIP(line));
newIP.setCount(0);
ips.add(newIP);
}
for(int j=0;j<ips.size();j++){
if(ips.get(j).getIp().equals(getIP(line))){
ips.get(j).setCount(ips.get(j).getCount()+1);
}else{
IPHolder newIP = new IPHolder();
newIP.setIp(getIP(line));
newIP.setCount(0);
ips.add(newIP);
}
}
if(counter % 1000 == 0){System.out.println(counter+", "+ips.size());}
counter+=1;
}
}
} finally {br.close();}
for(int k=0;k<ips.size();k++){
System.out.println("ip, count: "+ips.get(k).getIp()+" , "+ips.get(k).getCount());
}
}
public String getIP(String ipString){//extracts an ip from a string if the string contains an ip
String IPADDRESS_PATTERN =
"(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";
Pattern pattern = Pattern.compile(IPADDRESS_PATTERN);
Matcher matcher = pattern.matcher(ipString);
if (matcher.find()) {
return matcher.group();
}
else{
return "0.0.0.0";
}
}
The holder class is:
public class IPHolder {
private String ip;
private int count;
public String getIp(){return ip;}
public void setIp(String i){ip=i;}
public int getCount(){return count;}
public void setCount(int ct){count=ct;}
}
Upvotes: 0
Views: 1510
Reputation: 319
The keyword to search for here is HashMap. A HashMap stores key-value pairs (in this case, each IP address mapped to its count):
"192.168.1.12" - 12
"192.168.1.13" - 17
"192.168.1.14" - 9
and so on. It is much easier to use and access than iterating over your list of holder objects every time to find out whether a holder for that IP already exists.
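As a minimal sketch of the counting idea described above, assuming the addresses have already been extracted into a list: on Java 8 or later, `Map.merge` folds the "is it there already?" check and the update into one call. The class name `IpCountSketch` and the method `countOccurrences` are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IpCountSketch {
    // Hypothetical helper: counts how often each address appears in the input.
    static Map<String, Integer> countOccurrences(List<String> addresses) {
        Map<String, Integer> occurrences = new HashMap<>();
        for (String ip : addresses) {
            // Stores 1 for a new key, otherwise adds 1 to the stored count.
            occurrences.merge(ip, 1, Integer::sum);
        }
        return occurrences;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("192.168.1.12", "192.168.1.13", "192.168.1.12");
        Map<String, Integer> counts = countOccurrences(sample);
        System.out.println(counts.get("192.168.1.12")); // prints 2
        System.out.println(counts.get("192.168.1.13")); // prints 1
    }
}
```

The explicit `containsKey` / `put` version in the code that follows does the same thing and also works on older Java versions.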
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(/*Your file */)));
HashMap<String, Integer> occurrences = new HashMap<String, Integer>();
String line = null;
while( (line = br.readLine()) != null) {
// Iterate over lines and search for ip address patterns
String[] addressesFoundInLine = ...;
for(String ip: addressesFoundInLine ) {
// Did you already have that address in your file earlier? If yes, increase its counter by 1
if(occurrences.containsKey(ip))
occurrences.put(ip, occurrences.get(ip)+1);
// If not, create a new entry for this address
else
occurrences.put(ip, 1);
}
}
// TreeMaps are automatically ordered by key if their keys implement 'Comparable', which is the case for strings and integers
TreeMap<Integer, ArrayList<String>> turnedAround = new TreeMap<Integer, ArrayList<String>>();
Set<Entry<String, Integer>> es = occurrences.entrySet();
// Switch keys and values of HashMap and create a new TreeMap (in case there are two ips with the same count, add them to a list)
for(Entry<String, Integer> en: es) {
if(turnedAround.containsKey(en.getValue()))
turnedAround.get(en.getValue()).add(en.getKey());
else {
ArrayList<String> ips = new ArrayList<String>();
ips.add(en.getKey());
turnedAround.put(en.getValue(), ips);
}
}
// Print the values in ascending order of count (IPs with the same count are printed in no particular order; ordering them would require another sorting step)
for(Entry<Integer, ArrayList<String>> entry: turnedAround.entrySet()) {
for(String s: entry.getValue())
System.out.println(s + " - " + entry.getKey());
}
In my case the output was the following:
192.168.1.19 - 4
192.168.1.18 - 7
192.168.1.27 - 19
192.168.1.13 - 19
192.168.1.12 - 28
I answered this question about half an hour ago and I guess that is exactly what you are searching for, so if you need some example code, take a look at it.
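The step that collects the addresses found in a line is left open in the code above. One possible way to do it, sketched here under assumptions, is to reuse the pattern from the question and call `Matcher.find()` repeatedly. The class `IpExtractSketch` and the method `findAddresses` are hypothetical names, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IpExtractSketch {
    // Pattern taken from the question; compiled once instead of once per line.
    private static final Pattern IP_PATTERN = Pattern.compile(
        "(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)");

    // Hypothetical helper: returns every address found in one log line, in order.
    static List<String> findAddresses(String line) {
        List<String> found = new ArrayList<>();
        Matcher matcher = IP_PATTERN.matcher(line);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample line invented for illustration.
        System.out.println(findAddresses("GET /index.html from 192.168.1.12 via 10.0.0.1"));
    }
}
```

A line without any address yields an empty list, so no placeholder value such as "0.0.0.0" is needed.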
Upvotes: 1
Reputation: 1338
Here is some code that uses a HashMap to store the IPs and a regex to match them in each line. It uses try-with-resources to automatically close the file.
EDIT: I added code to print in descending order like you asked in the other answer.
void read(String fileName) throws IOException {
//Step 1 find and register IPs and store their occurrence counts
HashMap<String, Integer> ipAddressCounts = new HashMap<>();
try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(fileName)))) {
// The dot must be escaped; an unescaped '.' matches any character
Pattern findIPAddrPattern = Pattern.compile("((\\d+\\.){3}\\d+)");
String line;
while ((line = br.readLine()) != null) {
Matcher matcher = findIPAddrPattern.matcher(line);
while (matcher.find()) {
String ipAddr = matcher.group(0);
if ( ipAddressCounts.get(ipAddr) == null ) {
ipAddressCounts.put(ipAddr, 1);
}
else {
ipAddressCounts.put(ipAddr, ipAddressCounts.get(ipAddr) + 1);
}
}
}
}
//Step 2 reverse the map to store IPs by their frequency
HashMap<Integer, HashSet<String>> countToAddrs = new HashMap<>();
for (Map.Entry<String, Integer> entry : ipAddressCounts.entrySet()) {
Integer count = entry.getValue();
if ( countToAddrs.get(count) == null )
countToAddrs.put(count, new HashSet<String>());
countToAddrs.get(count).add(entry.getKey());
}
//Step 3 sort and print the ip addresses, most frequent first
ArrayList<Integer> allCounts = new ArrayList<>(countToAddrs.keySet());
Collections.sort(allCounts, Collections.reverseOrder());
for (Integer count : allCounts) {
for (String ip : countToAddrs.get(count)) {
System.out.println("ip, count: " + ip + " , " + count);
}
}
}
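As an alternative to reversing the map in steps 2 and 3, the entries can be sorted directly. This is only a sketch assuming Java 8 or later; `IpSortSketch` and `sortByCountDescending` are illustrative names, and the tie-break by address text is an added assumption (it compares strings, not numeric octets).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IpSortSketch {
    // Hypothetical helper: entries ordered by count, highest first;
    // equal counts are ordered by the address string.
    static List<Map.Entry<String, Integer>> sortByCountDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(
            Map.Entry.<String, Integer>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return entries;
    }

    public static void main(String[] args) {
        // Sample counts invented for illustration.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("192.168.1.19", 4);
        counts.put("192.168.1.27", 19);
        counts.put("192.168.1.12", 28);
        for (Map.Entry<String, Integer> entry : sortByCountDescending(counts)) {
            System.out.println("ip, count: " + entry.getKey() + " , " + entry.getValue());
        }
    }
}
```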
Upvotes: 0