nguyen dung

Reputation: 21

Special text processing in Java

I have a document containing text in a raw format, for example:

(11) test(1/2/3) for 11 (15) test(1/2/3) for 15 (21) test(1/2/3) for 21
(22) test(1/2/3) 
for 22
(30) test(1/2/3) for 30 (43) test(1/2/3) for 43
(45) test(1/2/3) 
for 45
(51) test(1/2/3) for 51 (54) test(1/2/3) for 54
(57) test(1/2/3) for 57
(62) test(1/2/3) for 62 (67) test(1/2/3) for 67
(71) test(1/2/3) for 71
(72) test(1/2/3) for 72 (73) test(1/2/3) for 73
(74) test(1/2/3) for 74
(75) test(1/2/3) for 75 (76) test(1/2/3) for 76
(85) test(1/2/3) for 85 (86) test(1/2/3) for 86
(87) test(1/2/3) for 87

I want to extract it into objects like the ones below:

String s11 = test(1/2/3) for 11;
String s15 = test(1/2/3) for 15;
String s21 = test(1/2/3) for 21;
String s22 = test(1/2/3) for 22;
String s30 = test(1/2/3) for 30;
String s43 = test(1/2/3) for 43;
String s45 = test(1/2/3) for 45;
String s51 = test(1/2/3) for 51;
String s54 = test(1/2/3) for 54;
String s57 = test(1/2/3) for 57;
String s62 = test(1/2/3) for 62;
String s67 = test(1/2/3) for 67;
String s71 = test(1/2/3) for 71;
String s72 = test(1/2/3) for 72;
String s73 = test(1/2/3) for 73;
String s74 = test(1/2/3) for 74;
String s75 = test(1/2/3) for 75;
String s76 = test(1/2/3) for 76;
String s85 = test(1/2/3) for 85;
String s86 = test(1/2/3) for 86;
String s87 = test(1/2/3) for 87;

Could anyone give me a hint about how to do this the Java way?

Upvotes: 1

Views: 83

Answers (3)

Ridwan Bhugaloo

Reputation: 241

Suppose your text document is like this:

(11) test(1/2/3) for 11 
(15) test(1/2/3) for 15 
(21) test(1/2/3) for 21
(22) test(1/2/3) for 22
(30) test(1/2/3) for 30 
(43) test(1/2/3) for 43
(45) test(1/2/3) for 45
(51) test(1/2/3) for 51 
(54) test(1/2/3) for 54
(57) test(1/2/3) for 57
(62) test(1/2/3) for 62 
(67) test(1/2/3) for 67
(71) test(1/2/3) for 71
(72) test(1/2/3) for 72 
(73) test(1/2/3) for 73
(74) test(1/2/3) for 74
(75) test(1/2/3) for 75 
(76) test(1/2/3) for 76
(85) test(1/2/3) for 85 
(86) test(1/2/3) for 86
(87) test(1/2/3) for 87

Then you can do this:

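import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String filepath = "file.txt";
// Compile the pattern once rather than on every loop iteration.
Pattern r = Pattern.compile("\\(\\d+\\) test\\(.*?\\) for \\d+");
List<String> lines = new ArrayList<>();
// try-with-resources closes the Scanner even if reading fails.
// Note: new Scanner(File) throws FileNotFoundException, which the caller must handle.
try (Scanner sc = new Scanner(new File(filepath))) {
    // Read line by line. A line without a match is simply skipped, and the
    // last line is processed even when the file has no trailing newline.
    while (sc.hasNextLine()) {
        Matcher m = r.matcher(sc.nextLine());
        while (m.find()) {
            lines.add(m.group(0));
            //System.out.println(m.group(0));
            String[] split = m.group(0).split(" ");
            // Strip the parentheses from "(11)" to get the bare number.
            split[0] = split[0].replaceAll("\\p{P}", "");
            System.out.println("String s" + split[0] + " = " + split[1] + " " + split[2] + " " + split[3]);
        }
    }
}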

The output:

String s11 = test(1/2/3) for 11
String s15 = test(1/2/3) for 15
String s21 = test(1/2/3) for 21
String s22 = test(1/2/3) for 22
String s30 = test(1/2/3) for 30
String s43 = test(1/2/3) for 43
String s45 = test(1/2/3) for 45
String s51 = test(1/2/3) for 51
String s54 = test(1/2/3) for 54
String s57 = test(1/2/3) for 57
String s62 = test(1/2/3) for 62
String s67 = test(1/2/3) for 67
String s71 = test(1/2/3) for 71
String s72 = test(1/2/3) for 72
String s73 = test(1/2/3) for 73
String s74 = test(1/2/3) for 74
String s75 = test(1/2/3) for 75
String s76 = test(1/2/3) for 76
String s85 = test(1/2/3) for 85
String s86 = test(1/2/3) for 86
String s87 = test(1/2/3) for 87

Upvotes: 0

Ridwan Bhugaloo

Reputation: 241

String input = "(11) test(1/2/3) for 11 (15) test(1/2/3) for 15 (21) test(1/2/3) for 21";
String pattern = "\\(\\d+\\) test\\(.*?\\) for \\d+";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(input);
List<String> lines = new ArrayList<>();
while (m.find()) {
    lines.add(m.group(0));
    String[] split = m.group(0).split(" ");
    split[0] = split[0].replaceAll("\\p{P}","");
    System.out.println("String s"+split[0]+" = "+split[1] +" "+split[2]+" "+ split[3] );
}

The output:

String s11 = test(1/2/3) for 11
String s15 = test(1/2/3) for 15
String s21 = test(1/2/3) for 21

Upvotes: 0

Tim Biegeleisen

Reputation: 522762

Assuming you can tolerate reading the entire file into a single Java string, then Java's regex engine has a clean way of handling this:

String input = "(11) test(1/2/3) for 11 (15) test(1/2/3) for 15 (21) test(1/2/3) for 21";
String pattern = "\\(\\d+\\) test\\(.*?\\) for \\d+";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(input);
List<String> lines = new ArrayList<>();
while (m.find()) {
    lines.add(m.group(0));
    System.out.println(m.group(0));
}

This prints:

(11) test(1/2/3) for 11
(15) test(1/2/3) for 15
(21) test(1/2/3) for 21

Note that typically you would not want to create separate string instances for each match. Rather, you would just add all matches to a collection, or process them one-by-one on the fly as you match them.
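
As a rough sketch of the "add all matches to a collection" idea, the matches could be stored in a map keyed by the number in parentheses instead of in separate variables. The class name `EntryExtractor`, the method `extract`, and the modified pattern are illustrative assumptions, not part of the original answer. The pattern assumes every entry starts with a `(number)` marker that is preceded by whitespace (or sits at the start of the text), and it uses `DOTALL` so that an entry broken across two lines, as in the question's raw input, is still captured as one entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntryExtractor {
    // Group 1: the number inside the leading parentheses.
    // Group 2: everything up to the next whitespace-preceded "(number)" marker or the end of input.
    // DOTALL lets '.' match line breaks, so an entry split over two lines is one match.
    private static final Pattern ENTRY =
            Pattern.compile("\\((\\d+)\\)\\s*(.*?)(?=\\s+\\(\\d+\\)|\\s*\\z)", Pattern.DOTALL);

    // Returns the entries in the order they appear, keyed by their number.
    static Map<Integer, String> extract(String text) {
        Map<Integer, String> entries = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(text);
        while (m.find()) {
            int key = Integer.parseInt(m.group(1));
            // Collapse whitespace runs (including line breaks) into single spaces.
            String value = m.group(2).trim().replaceAll("\\s+", " ");
            entries.put(key, value);
        }
        return entries;
    }

    public static void main(String[] args) {
        // Sample taken from the question, including an entry split across lines.
        String raw = "(21) test(1/2/3) for 21\n"
                + "(22) test(1/2/3) \n"
                + "for 22\n"
                + "(30) test(1/2/3) for 30 (43) test(1/2/3) for 43";
        extract(raw).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

To run this against a file, the text could be loaded first with `Files.readString(Path.of("file.txt"))` (Java 11 and later), which matches the assumption above that the whole file fits in one string. A value is then looked up with, for example, `extract(text).get(22)`.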

Upvotes: 3
