Reputation: 259
I need to parse raw data and accept only strings that consist of letters and at most ONE punctuation character.
Here is what I have done so far:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProcessRawData {
    public static void main(String[] args) {
        String myData = "Australia India# America@!";
        ProcessRawData data = new ProcessRawData();
        data.process(myData);
    }

    public void process(String rawData) {
        String[] splitData = rawData.split(" ");
        for (String s : splitData) {
            System.out.println("My Data Elements: " + s);
            Pattern pattern = Pattern.compile("^[\\p{Alpha}\\p{Punct}]*$");
            Matcher matcher = pattern.matcher(s);
            if (matcher.matches()) {
                System.out.println("Allowed");
            } else {
                System.out.println("Not allowed");
            }
        }
    }
}
It prints the following:
My Data Elements: Australia
Allowed
My Data Elements: India#
Allowed
My Data Elements: America@!
Allowed
I expected America@! to be reported as not allowed, since it contains more than one punctuation character.
I guess I need to use a quantifier, but I am not sure where to place it so that ONLY one punctuation character is allowed.
Can someone help?
Upvotes: 2
Views: 2348
Reputation: 1849
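I hope this helps. The code below rejects any token that contains a digit, then counts the non-letter characters and allows the token only if there is at most one.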
public static void process(String rawData) {
    String[] splitData = rawData.split(" ");
    for (String s : splitData) {
        Pattern pNum = Pattern.compile("[0-9]");
        Matcher match = pNum.matcher(s);
        if (match.find()) {
            System.out.println(s + ": Not Allowed");
            continue;
        }
        Pattern p = Pattern.compile("[^a-z]", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(s);
        int count = 0;
        while (m.find()) {
            count = count + 1;
        }
        if (count > 1) {
            System.out.println(s + ": Not Allowed");
        } else {
            System.out.println(s + ": Allowed");
        }
    }
}
Output
Australia: Allowed
India#: Allowed
America@!: Not Allowed
America1: Not Allowed
Upvotes: 0
Reputation: 626794
You may use
^\\p{Alpha}*(?:\\p{Punct}\\p{Alpha}*)?$
Explanation:
^ - start of string
\\p{Alpha}* - zero or more letters
(?:\\p{Punct}\\p{Alpha}*)? - one or zero (due to the ? quantifier) sequences of:
    \\p{Punct} - a single occurrence of a punctuation symbol
    \\p{Alpha}* - zero or more letters
$ - end of string.
Using it with String#matches will allow dropping the ^ and $ anchors since the pattern will then be anchored by default:
if (input.matches("\\p{Alpha}*(?:\\p{Punct}\\p{Alpha}*)?")) { ... }
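As a minimal sketch of how this pattern behaves on the sample tokens from the question, the class below precompiles it and checks each token. The names SinglePunctCheck and isAllowed are illustrative assumptions, not part of the answer. Note that the empty string also matches, since every part of the pattern is optional.

```java
import java.util.regex.Pattern;

final class SinglePunctCheck {
    // Letters, then optionally one punctuation character followed by more letters.
    private static final Pattern ALLOWED =
            Pattern.compile("\\p{Alpha}*(?:\\p{Punct}\\p{Alpha}*)?");

    // Hypothetical helper: true when the token consists of ASCII letters
    // and at most one ASCII punctuation character.
    static boolean isAllowed(String token) {
        // matches() tests the whole input, so no ^ and $ anchors are needed.
        return ALLOWED.matcher(token).matches();
    }

    public static void main(String[] args) {
        for (String token : "Australia India# Amer$ca America@! America1".split(" ")) {
            System.out.println(token + ": " + (isAllowed(token) ? "Allowed" : "Not allowed"));
        }
    }
}
```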
Upvotes: 1
Reputation: 425003
You can do it with a simple negative look-ahead:
((?!\\p{Punct}{2}).)*
So your code becomes simply:
public void process(String rawData) {
    if (rawData.matches("((?!\\p{Punct}{2}).)*")) {
        System.out.println("Allowed");
    } else {
        System.out.println("Not allowed");
    }
}
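The regex just asserts that no character is a {Punct} immediately followed by another {Punct}. Note that it therefore rejects only adjacent punctuation characters, as in America@!; two punctuation characters separated by letters would still match, and so would digits.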
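To illustrate the difference between "no two punctuation characters in a row" and "at most one punctuation character overall", here is a sketch that compares the look-ahead pattern from this answer with a variant that places a single look-ahead at the start. The variant and the names LookaheadCheck, ADJACENT, and AT_MOST_ONE are assumptions made for illustration, not part of the answer.

```java
import java.util.regex.Pattern;

final class LookaheadCheck {
    // Pattern from the answer: fails only on two punctuation characters in a row.
    static final Pattern ADJACENT = Pattern.compile("((?!\\p{Punct}{2}).)*");

    // Illustrative variant: the look-ahead fails if two punctuation characters
    // occur anywhere; the body restricts the token to letters and punctuation.
    static final Pattern AT_MOST_ONE =
            Pattern.compile("(?!(?:.*\\p{Punct}){2})[\\p{Alpha}\\p{Punct}]*");

    static boolean noAdjacentPunct(String s) {
        return ADJACENT.matcher(s).matches();
    }

    static boolean atMostOnePunct(String s) {
        return AT_MOST_ONE.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"India#", "America@!", "Am@eric!a", "America1"}) {
            System.out.println(s + " adjacent-check=" + noAdjacentPunct(s)
                    + " at-most-one=" + atMostOnePunct(s));
        }
    }
}
```

A token such as Am@eric!a passes the adjacent-only check but is rejected by the variant, which may be closer to what the question asks for.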
Upvotes: 0
Reputation: 577
You can use the following regex:
^[A-Za-z]*[!"\#$%&'()*+,\-.\/:;<=>?@\[\\\]^_`{|}~]?[A-Za-z]*$
This allows at most one punctuation character, at any position in the string.
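In Java, \p{Punct} (without the Unicode character class flag) is documented as covering the same ASCII punctuation characters that the explicit class above spells out, so the shorthand should be interchangeable with it. The sketch below compares the two classes over the ASCII range; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class PunctClassCheck {
    // The explicit character class from the regex above, as a Java string literal.
    static final Pattern EXPLICIT =
            Pattern.compile("[!\"\\#$%&'()*+,\\-.\\/:;<=>?@\\[\\\\\\]^_`{|}~]");

    // The POSIX shorthand for ASCII punctuation.
    static final Pattern SHORTHAND = Pattern.compile("\\p{Punct}");

    // Counts the ASCII characters on which the two classes disagree.
    static int mismatches() {
        int count = 0;
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (EXPLICIT.matcher(s).matches() != SHORTHAND.matcher(s).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Mismatches in ASCII range: " + mismatches());
    }
}
```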
Upvotes: -1
Reputation: 159086
You should compile your Pattern outside the loop.
When using matches(), there's no need for ^ and $, since it'll match against the entire string anyway.
If you need at most one punctuation character, you need to match a single optional punctuation character, preceded and/or followed by optional alphabet characters.
Note that using \\p{Alpha} and \\p{Punct} excludes digits. No digit will be allowed. If you want to consider a digit as a special character, replace \\p{Punct} with \\P{Alpha} (uppercase P means not Alpha).
public static void main(String[] args) {
    process("Australia India# Amer$ca America@! America1");
}

public static void process(String rawData) {
    Pattern pattern = Pattern.compile("\\p{Alpha}*\\p{Punct}?\\p{Alpha}*");
    for (String s : rawData.split(" ")) {
        System.out.println("My Data Elements: " + s);
        if (pattern.matcher(s).matches()) {
            System.out.println("Allowed");
        } else {
            System.out.println("Not allowed");
        }
    }
}
Output
My Data Elements: Australia
Allowed
My Data Elements: India#
Allowed
My Data Elements: Amer$ca
Allowed
My Data Elements: America@!
Not allowed
My Data Elements: America1
Not allowed
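The note about digits can be sketched as follows: swapping \p{Punct} for \P{Alpha} lets any single non-letter, a digit included, fill the optional slot. The class and method names are illustrative assumptions.

```java
import java.util.regex.Pattern;

final class SingleSpecialCheck {
    // At most one non-letter character, surrounded by optional letters.
    private static final Pattern ALLOWED =
            Pattern.compile("\\p{Alpha}*\\P{Alpha}?\\p{Alpha}*");

    // Hypothetical helper: true when the token has at most one non-letter character.
    static boolean isAllowed(String token) {
        return ALLOWED.matcher(token).matches();
    }

    public static void main(String[] args) {
        for (String token : "India# America1 America@! America12".split(" ")) {
            System.out.println(token + ": " + (isAllowed(token) ? "Allowed" : "Not allowed"));
        }
    }
}
```

With this variant America1 is accepted, while America@! and America12 are still rejected because each contains two non-letter characters.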
Upvotes: 2