Reputation: 3
I have text like this in my String s (which I have already read from a .txt file):
trump;Donald Trump;[email protected]
obama;Barack Obama;[email protected]
bush;George Bush;[email protected]
clinton,Bill Clinton;[email protected]
Then I try to strip everything except the e-mail address and print the result to the console:
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
System.out.print(f1[i]);
}
and I have output like this:
[email protected]
[email protected]
[email protected]
[email protected]
How can I avoid this? That is, how can I print the addresses without line breaks?
Upvotes: 0
Views: 75
Reputation: 15
Just remove the '\n' characters that may appear at the start or end of each piece. Write it this way:
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
f1[i] = f1[i].replace("\n", ""); // replace needs both a target and a replacement
System.out.print(f1[i]);
}
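To see where the '\n' actually ends up, here is a minimal sketch, assuming the string still contains its line terminators. In Java's regex, `.` does not match a line terminator by default, so the delimiter `(.*?);` can never swallow the break after an address and the break stays attached to that piece. The class name `SplitPieces`, its helper methods, and the example.com addresses are made up for illustration.

```java
public class SplitPieces {
    // Splits with the regex from the question and returns the raw pieces.
    static String[] pieces(String s) {
        return s.split("(.*?);");
    }

    // Removes every '\n' and '\r' from a single piece.
    static String clean(String piece) {
        return piece.replace("\n", "").replace("\r", "");
    }

    public static void main(String[] args) {
        // Made-up sample data in the same shape as the question's input.
        String s = "a;Ann A;ann@example.com\nb;Bob B;bob@example.com";
        for (String piece : pieces(s)) {
            System.out.print(clean(piece));
        }
        System.out.println();
    }
}
```

Note that the array also contains empty strings between adjacent delimiters; printing them is harmless, but they are there.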
Upvotes: 0
Reputation: 445
package com.test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
String s = "trump;Donald Trump;[email protected] "
+ "obama;Barack Obama;[email protected] "
+ "bush;George Bush;[email protected] "
+ "clinton;Bill Clinton;[email protected]";
String spaceStrings[] = s.split("[\\s,;]+");
String output="";
for(String word:spaceStrings){
if(validate(word)){
output+=word;
}
}
System.out.println(output);
}
public static final Pattern VALID_EMAIL_ADDRESS_REGEX = Pattern.compile(
"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,6}$",
Pattern.CASE_INSENSITIVE);
public static boolean validate(String emailStr) {
Matcher matcher = VALID_EMAIL_ADDRESS_REGEX.matcher(emailStr);
return matcher.find();
}
}
Upvotes: 0
Reputation: 164
You can simply remove all line breaks, as shown in the code below:
String f1[] = null;
f1=s.split("(.*?);");
for (int i=0;i<f1.length;i++) {
System.out.print(f1[i].replaceAll("\r", "").replaceAll("\n", ""));
}
This replaces each of them with an empty string.
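As a possible shortening, assuming Java 8 or later: the regex `\R` matches any line break sequence (`\r\n`, `\n`, `\r` and the Unicode ones), so a single replaceAll call is enough. This is only a sketch; the class name `LineBreakStripper`, the method name, and the example.com addresses are invented for the example.

```java
public class LineBreakStripper {
    // Removes every line break sequence from the text in one pass.
    static String stripLineBreaks(String text) {
        return text.replaceAll("\\R", "");
    }

    public static void main(String[] args) {
        // Made-up sample data with mixed Windows and Unix line endings.
        String s = "a;Ann A;ann@example.com\r\nb;Bob B;bob@example.com\n";
        StringBuilder out = new StringBuilder();
        for (String piece : s.split("(.*?);")) {
            out.append(stripLineBreaks(piece));
        }
        System.out.println(out);
    }
}
```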
Upvotes: 1
Reputation: 163577
Instead of split, you could match an email-like format: one or more characters that are neither a semicolon nor whitespace, using the negated character class [^\\s;]+, followed by an @ and again one or more characters that are neither a semicolon nor whitespace.
// Needs: java.util.regex.Pattern, java.util.regex.Matcher, java.util.List, java.util.ArrayList
final String regex = "[^\\s;]+@[^\\s;]+";
final String string = "trump;Donald Trump;[email protected] \n"
+ " obama;Barack Obama;[email protected] \n"
+ " bush;George Bush;[email protected] \n"
+ " clinton,Bill Clinton;[email protected]";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
final List<String> matches = new ArrayList<String>();
while (matcher.find()) {
matches.add(matcher.group());
}
System.out.println(String.join("", matches));
Upvotes: 0
Reputation: 3600
Try the approach below. I read your file with Scanner as well as with BufferedReader, and in both cases I don't get any line breaks, because nextLine() and readLine() both return the line without its terminator. file.txt is the file that contains the text, and the splitting logic is the same as yours.
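import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;
public class CC {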
public static void main(String[] args) throws IOException {
Scanner scan = new Scanner(new File("file.txt"));
while (scan.hasNextLine()) {
String f1[] = null;
f1 = scan.nextLine().split("(.*?);");
for (int i = 0; i < f1.length; i++) {
System.out.print(f1[i]);
}
}
scan.close();
BufferedReader br = new BufferedReader(new FileReader(new File("file.txt")));
String str = null;
while ((str = br.readLine()) != null) {
String f1[] = null;
f1 = str.split("(.*?);");
for (int i = 0; i < f1.length; i++) {
System.out.print(f1[i]);
}
}
br.close();
}
}
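The BufferedReader variant could also be written with try-with-resources so the reader is closed even if reading fails. This is a rough sketch under assumptions: the class name `LineReaderSketch` and the method `joinEmails` are hypothetical, and the method takes any Reader so it can be tried without a file on disk.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

public class LineReaderSketch {
    // Reads the source line by line and concatenates the non-delimiter pieces.
    // readLine() drops the line terminator, so no break reaches the output.
    static String joinEmails(Reader source) {
        StringBuilder out = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                for (String piece : line.split("(.*?);")) {
                    out.append(piece);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // "file.txt" is the file name used in the answer above.
        System.out.println(joinEmails(new FileReader("file.txt")));
    }
}
```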
Upvotes: 1