Reputation: 11619
I am trying to check whether a given host name appears in a list of hosts stored as a comma-separated string like the following:
String list = "aa.com,bb.com,cc.com,dd.net,ee.com,ff.net";
String host1 = "aa.com"; // should be a match
String host2 = "a.com"; // shouldn't be a match
String host3 = "ff.net"; // should be a match
// here is a test for host1
if (list.matches(".*[,^]" + host1 + "[$,].*")) {
    System.out.println(host1 + " matched");
}
else {
    System.out.println(host1 + " not matched");
}
But I got "not matched" for host1 (aa.com). I am not very familiar with regex, so please correct me!
BTW, I don't want a solution that splits the host list into an array and then matches against that. It was too slow because the host list can be quite long. The regex approach may be even worse, but I am trying to make it work first.
Upvotes: 1
Views: 871
Reputation: 61
I also think regexes are too slow if you are looking for an exact match, so I wrote a method that looks for occurrences of the host name in the list and checks, for each occurrence, that it is not part of a longer host name (as "a.com" is part of "aa.com"). If some occurrence is a standalone entry, the result is true: the host is in the list. Here's the code:
boolean containsHost(String list, String host) {
    boolean result = false;
    int i = -1;
    while ((i = list.indexOf(host, i + 1)) >= 0) { // while there is a next match
        if ((i == 0 || list.charAt(i - 1) == ',') // beginning of the list or has a comma right before it
                && (i == (list.length() - host.length()) // end of the list
                || list.charAt(i + host.length()) == ',')) { // or has a comma right after it
            result = true;
            break;
        }
    }
    return result;
}
But then I thought that it would be even faster to check just 3 cases (a match at the beginning, in the middle, or at the end of the list), which can be done with the startsWith, contains and endsWith methods respectively. Here's the second option, which I would prefer in your case:
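boolean containsHostShort(String list, String host) {
    return list.equals(host) // the list consists of this single host
            || list.contains("," + host + ",") // somewhere in the middle
            || list.startsWith(host + ",") // first entry
            || list.endsWith("," + host); // last entry
}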
UPD: ZouZou's comment on your post also looks good. I would recommend comparing the speed on a list similar in size to your real data and choosing the fastest one.
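To make that comparison concrete, here is a rough timing sketch for the two methods above. The class name, the buildList helper, the list size and the round count are all made up for illustration, and the single-host equals check is an extra edge case added here. A naive System.nanoTime loop like this is only indicative; a proper benchmark harness would give more trustworthy numbers.

```java
import java.util.function.BiPredicate;

class HostLookupTiming {
    // indexOf scan that accepts a hit only when it is delimited by commas or the string ends.
    static boolean containsHost(String list, String host) {
        int i = -1;
        while ((i = list.indexOf(host, i + 1)) >= 0) {
            int end = i + host.length();
            boolean startOk = i == 0 || list.charAt(i - 1) == ',';
            boolean endOk = end == list.length() || list.charAt(end) == ',';
            if (startOk && endOk) {
                return true;
            }
        }
        return false;
    }

    // Checks the single-entry, middle, first and last positions separately.
    static boolean containsHostShort(String list, String host) {
        return list.equals(host)
                || list.contains("," + host + ",")
                || list.startsWith(host + ",")
                || list.endsWith("," + host);
    }

    // Hypothetical test data: "host0.com,host1.com,...".
    static String buildList(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("host").append(i).append(".com");
        }
        return sb.toString();
    }

    // Runs the check repeatedly and returns the elapsed time in nanoseconds.
    static long timeNanos(BiPredicate<String, String> check, String list, String host, int rounds) {
        int hits = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            if (check.test(list, host)) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        if (hits != rounds) {
            throw new IllegalStateException("expected a hit on every round");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String list = buildList(100_000);
        String host = "host99999.com"; // last entry, so the whole list is scanned
        int rounds = 200;
        // Warm-up pass so the JIT has a chance to compile both methods first.
        timeNanos(HostLookupTiming::containsHost, list, host, rounds);
        timeNanos(HostLookupTiming::containsHostShort, list, host, rounds);
        long scan = timeNanos(HostLookupTiming::containsHost, list, host, rounds);
        long three = timeNanos(HostLookupTiming::containsHostShort, list, host, rounds);
        System.out.println("indexOf scan:      " + scan / 1_000_000.0 + " ms");
        System.out.println("three-case check:  " + three / 1_000_000.0 + " ms");
    }
}
```

Swap in a list and host that resemble your real data before drawing any conclusion from the numbers.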
Upvotes: 1
Reputation: 233
Try this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String list = "aa.com,bb.com,cc.com,dd.net,ee.com,ff.net";
String host1 = "aa.com"; // should be a match
String host2 = "a.com"; // shouldn't be a match
String host3 = "ff.net"; // should be a match
// for host1
Pattern p1 = Pattern.compile("\\b[A-Za-z]{2}\\.com");
Matcher m1 = p1.matcher(list);
if (m1.find()) {
    System.out.println(host1 + " matched");
} else {
    System.out.println(host1 + " not matched");
}
// for host2
p1 = Pattern.compile("\\b[A-Za-z]{1}\\.com");
m1 = p1.matcher(list);
if (m1.find()) {
    System.out.println(host2 + " matched");
} else {
    System.out.println(host2 + " not matched");
}
// and so on...
The \b means a word boundary (here, the start of a word). [A-Za-z]{n}\.com means n letters in the range A-Z or a-z followed by a literal .com. The dot is escaped because an unescaped . matches any character.
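The patterns above are written by hand for each host length. As a sketch of how the same word-boundary idea could be built from the host string itself, the hypothetical helper below (the class and method names are made up) wraps the host in Pattern.quote so that the dot is treated literally. One caveat: \b also sees a boundary next to "." and "-", so a host can be found inside a longer subdomain entry.

```java
import java.util.regex.Pattern;

class WordBoundaryHostMatch {
    // Searches for the literal host surrounded by word boundaries.
    static boolean findsWithWordBoundary(String list, String host) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(host) + "\\b");
        return p.matcher(list).find();
    }

    public static void main(String[] args) {
        String list = "aa.com,bb.com,cc.com,dd.net,ee.com,ff.net";
        System.out.println(findsWithWordBoundary(list, "aa.com")); // true
        System.out.println(findsWithWordBoundary(list, "a.com"));  // false
        System.out.println(findsWithWordBoundary(list, "ff.net")); // true
        // Caveat: "." counts as a boundary, so this reports a hit
        // even though the only entry is www.aa.com.
        System.out.println(findsWithWordBoundary("www.aa.com,bb.com", "aa.com")); // true
    }
}
```

If subdomains or hyphenated names can appear in the list, delimiting on the commas instead of on \b is the stricter check.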
Upvotes: 0
Reputation: 554
This works perfectly, without regex:
String list = "aa.com,bb.com,cc.com,dd.net,ee.com,ff.net";
String host1 = "aa.com";
String host2 = "a.com";
String host3 = "ff.net";
boolean checkingFlag = false;
String[] arrayList = list.split(",");
System.out.println(arrayList.length);
for (int i = 0; i < arrayList.length; i++)
{
    // here is a test for host1
    if (arrayList[i].equalsIgnoreCase(host1))
        checkingFlag = true;
}
if (checkingFlag)
    System.out.println("Matched");
else
    System.out.println("Not matched");
It takes barely 20-30 milliseconds to execute a loop over 1 million records. I have edited the answer in response to your comment; you can check it with this:
long startingTime = System.currentTimeMillis();
for (int i = 0; i < 1000000; i++)
{
    if (i == 999999)
        checkingFlag = true;
}
long endingTime = System.currentTimeMillis();
System.out.println("total time in millisecond:" + (endingTime - startingTime));
Upvotes: 0
Reputation: 1155
You can split the string, stream the resulting list, and use a lambda in anyMatch to return a boolean for the match.
String list = "aa.com,bb.com,cc.com,dd.net,ee.com,ff.net";
String host1 = "aa.com"; // should be a match
String host2 = "a.com"; // shouldn't be a match
String host3 = "ff.net"; // should be a match
ArrayList<String> alist = new ArrayList<String>();
for (String item : list.split("\\,"))
{
    alist.add(item);
}
boolean contains_host1 = alist.stream().anyMatch(b -> b.equals(host1));
boolean contains_host2 = alist.stream().anyMatch(b -> b.equals(host2));
boolean contains_host3 = alist.stream().anyMatch(b -> b.equals(host3));
System.out.println(contains_host1);
System.out.println(contains_host2);
System.out.println(contains_host3);
Console output:
true
false
true
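As a more compact variant of the same idea, the intermediate ArrayList can probably be skipped by streaming the split array directly. This is only a sketch; the class and method names are made up for illustration.

```java
import java.util.Arrays;

class StreamHostMatch {
    // Splits on commas and reports whether any entry equals the host exactly.
    static boolean containsHost(String list, String host) {
        return Arrays.stream(list.split(",")).anyMatch(host::equals);
    }

    public static void main(String[] args) {
        String list = "aa.com,bb.com,cc.com,dd.net,ee.com,ff.net";
        System.out.println(containsHost(list, "aa.com")); // true
        System.out.println(containsHost(list, "a.com"));  // false
        System.out.println(containsHost(list, "ff.net")); // true
    }
}
```

It still splits the whole string on every call, so it does not address the speed concern from the question by itself.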
Upvotes: 0
Reputation: 2306
As mentioned in the comments, you shouldn't use matches here, because it tries to match the regex pattern against the entire comma-delimited string. That is not what you want: you are trying to detect whether a given substring occurs in the comma-separated source string.
To do that with a regex, you would use the host name in a find-style method (Matcher.find() in Java) rather than matches. However, you can just use a plain substring search (for example indexOf), which avoids the overhead of regex compilation.
Regexes are meant for matching strings whose pattern can vary. Never use a regex when you want exact string matching.
Upvotes: 0