Manidip Sengupta

Reputation: 3611

How to use back reference in Java regex

Question for the RE experts: Consider the following Perl script:

my @lines = (
        "Once upon a time in a galaxy far, far away, there lived\n",
        "this _idiot_ trying to _mark up_ a few lines of\n",
        "marked down text using yet another _language_.\n");

foreach (@lines) {
        s|_(.+?)_|<em>$1</em>|g;
        print
}

The output of % perl [aboveScript] is:

Once upon a time in a galaxy far, far away, there lived
this <em>idiot</em> trying to <em>mark up</em> a few lines of
marked down text using yet another <em>language</em>.

I am trying to achieve the same thing in Java. The class I have come up with follows. It works and produces the same output as above, but I am fairly sure this is not the right way to do it. My question: how would you implement the parseLine() method?

import java.util.*;
import java.util.regex.*;

public class Reglob {

        private final static Pattern emPattern = Pattern.compile ("_(.+?)_");

        public void parseLine (String[] lines) {
                for (String line : lines) {
                        List<Integer>   bList = new ArrayList<Integer>(),
                                        eList = new ArrayList<Integer>();
                        Matcher m = emPattern.matcher (line);
                        int n = 0;
                        while (m.find()) {
                                // System.out.println ("Match indices: " + m.start() + ", " + m.end());
                                bList.add (m.start());
                                eList.add (m.end());
                                n++;
                        }
                        if (n == 0) {
                                System.out.println (line);
                        } else {
                                String s = line.substring (0, bList.get(0));
                                for (int i = 0 ; i < n-1 ; i++) {
                                    s += "<em>"
                                        + line.substring(1+bList.get(i),eList.get(i)-1)
                                        + "</em>" + line.substring (eList.get(i), bList.get(i+1));
                                }
                                s += "<em>"
                                        + line.substring(1+bList.get(n-1),eList.get(n-1)-1)
                                        + "</em>" + line.substring (eList.get(n-1), line.length());
                                System.out.println (s);
                        }
                }
        }

        public static void main (String[] args) {
                String[] lines = {
                        "Once upon a time in a galaxy far, far away, there lived",
                        "this _idiot_ trying to _mark up_ a few lines of",
                        "marked down text using yet another _language_."};
                new Reglob().parseLine (lines);
        }
}

Upvotes: 0

Views: 509

Answers (4)

Avinash Raj

Reputation: 174706

You could simply do it like this:

String[] s = {
        "Once upon a time in a galaxy far, far away, there lived",
        "this _idiot_ trying to _mark up_ a few lines of",
        "marked down text using yet another _language_."};
for (String s2 : s) {
    // $1 in the replacement refers to the first capturing group
    System.out.println(s2.replaceAll("_([^_]+)_", "<em>$1</em>"));
}
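
Note that this pattern uses [^_]+ where the question uses .+?. On the sample lines both give the same result, but they are not identical in general: by default . does not match line terminators, while a negated character class does. A minimal sketch of the difference follows; the class and method names are made up for illustration.

```java
class UnderscorePatterns {

    // Negated character class: anything except an underscore, newlines included.
    static String withNegatedClass(String s) {
        return s.replaceAll("_([^_]+)_", "<em>$1</em>");
    }

    // Lazy dot: as few characters as possible, but never a line terminator
    // (unless Pattern.DOTALL is switched on).
    static String withLazyDot(String s) {
        return s.replaceAll("_(.+?)_", "<em>$1</em>");
    }

    public static void main(String[] args) {
        String oneLine = "this _idiot_ trying to _mark up_ a few lines of";
        String twoLines = "an _emphasis that\nwraps_ across lines";

        // Same result on single-line input: prints "true"
        System.out.println(withNegatedClass(oneLine).equals(withLazyDot(oneLine)));

        // The negated class matches across the line break ...
        System.out.println(withNegatedClass(twoLines));
        // ... the lazy dot does not, so the text is left unchanged.
        System.out.println(withLazyDot(twoLines));
    }
}
```

If you process the input one line at a time, as in the question, the two patterns should behave the same for this kind of markup.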

Upvotes: 1

som-snytt

Reputation: 39577

Similarly, in Scala:

scala> val text = """Once upon a time in a galaxy far, far away, there lived
     | this _idiot_ trying to _mark up_ a few lines of
     | marked down text using yet another _language_."""
text: String =
Once upon a time in a galaxy far, far away, there lived
this _idiot_ trying to _mark up_ a few lines of
marked down text using yet another _language_.

scala> val r = "_(.+?)_".r
r: scala.util.matching.Regex = _(.+?)_

scala> r.replaceAllIn(text, """<em>$1</em>""")
res3: String =
Once upon a time in a galaxy far, far away, there lived
this <em>idiot</em> trying to <em>mark up</em> a few lines of
marked down text using yet another <em>language</em>.

I try everything out in the Scala REPL first, then convert to Java.

But since you have plenty of Java answers, I'll return to Hulu.

Upvotes: 0

Arek Woźniak

Reputation: 695

Try something like this:

public static final String str = "Once upon a time in a galaxy far, far away, there lived\n" +
        "this _idiot_ trying to _mark up_ a few lines of\n" +
        "marked down text using yet another _language_.\n";

public static void main(String[] args) {
    System.out.println(str.replaceAll("_(.+?)_", "<em>$1</em>"));
}

The output for this:

Once upon a time in a galaxy far, far away, there lived
this <em>idiot</em> trying to <em>mark up</em> a few lines of
marked down text using yet another <em>language</em>.
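
For comparison with the hand-rolled index bookkeeping in the question, here is a rough sketch of the same substitution written as an explicit Matcher loop, reading the captured group through m.group(1). The class and method names are hypothetical; appendReplacement, appendTail and quoteReplacement are standard java.util.regex.Matcher methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ManualEmphasis {

    private static final Pattern EM_PATTERN = Pattern.compile("_(.+?)_");

    static String emphasize(String line) {
        Matcher m = EM_PATTERN.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // group(1) is the text between the underscores
            String replacement = "<em>" + m.group(1) + "</em>";
            // quoteReplacement keeps any '$' or '\' in the captured text literal
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        // copy whatever follows the last match
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(emphasize("this _idiot_ trying to _mark up_ a few lines of"));
    }
}
```

This is only worth the extra lines if you need to transform the captured text before inserting it; for a plain substitution, replaceAll with $1 does the same job.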

Upvotes: 0

Robby Cornelissen

Reputation: 97150

This is the Java equivalent of your Perl script:

public class Main {
    public static void main(String[] args) {
        String[] lines = {
                "Once upon a time in a galaxy far, far away, there lived\n",
                "this _idiot_ trying to _mark up_ a few lines of\n",
                "marked down text using yet another _language_.\n" };

        for(String line : lines) {
            String output = line.replaceAll("_(.+?)_", "<em>$1</em>");

            System.out.print(output);
        }
    }
}

It outputs:

Once upon a time in a galaxy far, far away, there lived
this <em>idiot</em> trying to <em>mark up</em> a few lines of
marked down text using yet another <em>language</em>.
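
If you would rather keep the precompiled Pattern from the question, so the regex is not recompiled on every call, a possible shape for parseLine() is sketched below. Treat it as an illustration under assumptions: the class name and constant name are invented, and the method is changed to take a single line and return the result instead of printing it.

```java
import java.util.regex.Pattern;

class RegexEmphasis {

    // Compiled once and reused for every line
    private static final Pattern EM_PATTERN = Pattern.compile("_(.+?)_");

    // $1 in the replacement string refers to the first capturing group
    static String parseLine(String line) {
        return EM_PATTERN.matcher(line).replaceAll("<em>$1</em>");
    }

    public static void main(String[] args) {
        String[] lines = {
                "Once upon a time in a galaxy far, far away, there lived",
                "this _idiot_ trying to _mark up_ a few lines of",
                "marked down text using yet another _language_."};
        for (String line : lines) {
            System.out.println(parseLine(line));
        }
    }
}
```

String.replaceAll compiles the pattern on each call, so reusing a static Pattern mainly pays off when many lines are processed.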

Upvotes: 2
