Reputation: 6874
I have a series of text reports with fields like "Contractile Front velocity" on them. Some of them have "Contractile Front velocitycms" instead. There are other terms like this where characters such as cms have been appended.
Each term has a numerical result associated with it, and I am trying to put the term and the result into a database. The database field will be (for this example) "Contractile Front velocitycms".
So I would like to convert any report (plain text) field that lacks the cms suffix into Contractile Front velocitycms.
Because I have a lot of find-and-replace problems to solve, I created a method that uses StringUtils.replaceEach so that I can use a simple colon-separated text file as a lookup dictionary for the find and replace:
// Assumes Apache Commons Lang 3 for StringUtils; adjust the import to match your version.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import org.apache.commons.lang3.StringUtils;

public static String FindNReplace(String n) throws IOException {
    ArrayList<String> orig = new ArrayList<String>();   // text to search for
    ArrayList<String> newDoc = new ArrayList<String>(); // text to substitute
    String dictionary = "/Users/sebastianzeki/Documents/workspace/PhysiologyUpperGITotalExtractorv2/src/Overview/FindNReplaceDictionary.txt";
    BufferedReader br = new BufferedReader(new FileReader(dictionary));
    try {
        String line = br.readLine();
        while (line != null) {
            // Each dictionary line has the form "replacement:original"
            String[] split = line.split(":");
            // Print the parsed entry (printing the array directly only shows its reference)
            System.out.println(Arrays.toString(split));
            orig.add(split[1]);
            newDoc.add(split[0]);
            line = br.readLine();
        }
    } finally {
        br.close();
    }
    String[] orig_arr = orig.toArray(new String[orig.size()]);
    String[] newDoc_arr = newDoc.toArray(new String[newDoc.size()]);
    // Literal (non-regex) replacement of every orig entry with its newDoc counterpart
    return StringUtils.replaceEach(n, orig_arr, newDoc_arr);
}
The dictionary looks like this (replacement first, then the text to find):
PostPr :Post-Prandial
PostPr :Post-prandial
Nausea :nausea
The problem is that if I just use my dictionary to replace Contractile Front velocity with Contractile Front velocitycms, then occasionally, where Contractile Front velocitycms already exists, I will get Contractile Front velocitycmscms, and replaceEach does not use regex. Can anyone think of a solution that avoids the duplicates mentioned?
Upvotes: 2
Views: 81
Reputation: 14810
What you want is a negative lookahead to exclude the trailing part.
Negative lookahead is written as (?!pattern), so in your case you want Contractile Front velocity(?!cms) as your pattern to match.
You can try this on RegexPlanet ...
I used:
Regular expression: Contractile Front velocity(?!cms)
Input 1: This Contractile Front velocitycms already has it.
Input 2: But this Contractile Front velocity does not.
You'll see when you hit the Test button that Input 2 gets the "cms" added to it but Input 1 does not get it doubled.
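For reference, the same test can be reproduced in plain Java with java.util.regex. This is only a minimal sketch: the class name LookaheadDemo and the method addCms are made up for illustration, and it handles the single term from the question rather than a whole dictionary.

```java
import java.util.regex.Pattern;

class LookaheadDemo {
    // Matches the bare term only when it is NOT immediately followed by "cms".
    // (Hypothetical constant name; the pattern itself is the one given above.)
    private static final Pattern BARE_TERM =
            Pattern.compile("Contractile Front velocity(?!cms)");

    // Hypothetical helper: appends "cms" to every bare occurrence of the term.
    public static String addCms(String text) {
        return BARE_TERM.matcher(text).replaceAll("Contractile Front velocitycms");
    }

    public static void main(String[] args) {
        // Input 1 already has the suffix, so it is left unchanged.
        System.out.println(addCms("This Contractile Front velocitycms already has it."));
        // Input 2 gets the suffix added.
        System.out.println(addCms("But this Contractile Front velocity does not."));
    }
}
```

Because String.replaceAll and Matcher.replaceAll are regex-based, this would take the place of StringUtils.replaceEach for the terms that need the lookahead. Any other dictionary entries passed through a regex this way would need their special characters escaped, for example with Pattern.quote.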
Upvotes: 1