Reputation: 30646
I have this string:
1001,"Fitzsimmons, Des Marteau, Beale and Nunn",109,"George","COD","Standard",,109,8/14/1998 8:50:02
What regular expression would I use to replace the commas in "Fitzsimmons, Des Marteau, Beale and Nunn" with a pipe (|), so that it becomes:
"Fitzsimmons| Des Marteau| Beale and Nunn"
I should have clarified: I am splitting this string on commas, so I want "Fitzsimmons, Des Marteau, Beale and Nunn" to stay a single field. I plan to replace each | with a comma again after the split.
Upvotes: 1
Views: 13600
Reputation: 134
Hey Brandon, you can do this with a regular expression by using lookbehind and lookahead. See the code below:
String cvsString = "1001,\"Fitzsimmons, Des Marteau, Beale and Nunn\",109,\"George\",\"COD\",\"Standard\",,109,8/14/1998 8:50:02";
String rePattern = "(?<=\")([^\"]+?),([^\"]+?)(?=\")";
// first replace
String oldString = cvsString;
String resultString = cvsString.replaceAll(rePattern, "$1|$2");
// each pass replaces one comma per quoted field, so repeat until nothing changes
while (!resultString.equals(oldString)) {
    oldString = resultString;
    resultString = resultString.replaceAll(rePattern, "$1|$2");
}
The result string will be:
1001,"Fitzsimmons| Des Marteau| Beale and Nunn",109,"George","COD","Standard",,109,8/14/1998 8:50:02
NingZhang.info
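For comparison, here is a minimal single-pass sketch of the same idea. Instead of looping until the string stops changing, it matches each quoted field once and rewrites the commas inside it. The class and method names (QuotedCommaPiper, pipeQuotedCommas) are made up for illustration, and it assumes a quoted field never contains an escaped quote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedCommaPiper {
    // Matches one double-quoted field. Assumption: no escaped quotes inside a field.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    // Returns the line with every comma inside a quoted field replaced by '|'.
    static String pipeQuotedCommas(String line) {
        Matcher m = QUOTED.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String field = m.group().replace(',', '|');
            // quoteReplacement keeps any '$' or '\' in the data literal
            m.appendReplacement(out, Matcher.quoteReplacement(field));
        }
        m.appendTail(out); // copy whatever follows the last quoted field
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "1001,\"Fitzsimmons, Des Marteau, Beale and Nunn\",109,"
                + "\"George\",\"COD\",\"Standard\",,109,8/14/1998 8:50:02";
        System.out.println(pipeQuotedCommas(line));
    }
}
```

Text outside the quotes is copied through untouched, so the empty field between "Standard" and 109 survives.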
Upvotes: 3
Reputation: 885
Here's a bit of Python that seems to do the trick:
>>> import re
>>> p = re.compile('["][^"]*["]|[^,]*')
>>> x = """1001,"Fitzsimmons, Des Marteau, Beale and Nunn",109,"George","COD","Standard",,109,8/14/1998 8:50:02"""
>>> y = p.findall(x)
>>> ','.join(z.replace(',','|') for z in y if z)
'1001,"Fitzsimmons| Des Marteau| Beale and Nunn",109,"George","COD","Standard",109,8/14/1998 8:50:02'
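Note that the "if z" filter also drops the empty field between "Standard" and 109, so the output has one column fewer than the input.
Seems like this could turn into a code golf question :-)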
Oops...missed the Java tag.
Upvotes: 2
Reputation: 58770
Well, this is a CSV file, so I'd use Ruby's built-in CSV library. Then you don't have to figure out how to deal with escaped quotation marks, for example.
require 'csv'
string =<<CSV
1001,"Fitzsimmons, Des Marteau, Beale and Nunn",109,"George","COD","Standard",,109,8/14/1998 8:50:02
CSV
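csv = CSV.parse(string)
# empty cells parse as nil, so only touch actual strings
csv.each { |row| row.each { |cell| cell.gsub!(",", "|") if cell.is_a?(String) } }
# CSV::Writer was removed in Ruby 1.9; CSV.generate is the current equivalent
outstring = CSV.generate { |out| csv.each { |row| out << row } }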
Upvotes: 1
Reputation: 11220
I tried to use StringTokenizer, but it didn't work well, so here is some code that seems to do what you want:
public class JTest
{
    public static void main(String[] args)
    {
        String str = "1001,\"Fitzsimmons, Des Marteau, Beale and Nunn\",109,\"George\",\"COD\",\"Standard\",,109,8/14/1998 8:50:02";
        StringBuilder copy = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < str.length(); ++i)
        {
            char c = str.charAt(i);
            // every quote toggles whether we are inside a quoted field
            if (c == '"')
                inQuotes = !inQuotes;
            // only commas inside quotes become pipes
            if (c == ',' && inQuotes)
                copy.append('|');
            else
                copy.append(c);
        }
        System.out.println(str);
        System.out.println(copy);
    }
}
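Once the commas inside quotes are pipes, the rest of the plan from the question (split, then put the commas back) could look roughly like the sketch below. The names PipeRoundTrip and splitAndRestore are hypothetical, and it assumes the original data contains no real | characters, since those would come back as commas too.

```java
import java.util.ArrayList;
import java.util.List;

class PipeRoundTrip {
    // Splits an already-piped line on commas and restores the commas in each field.
    // Assumption: '|' never occurs in the original data.
    static List<String> splitAndRestore(String piped) {
        List<String> fields = new ArrayList<>();
        // limit -1 keeps empty fields, including trailing ones
        for (String field : piped.split(",", -1)) {
            fields.add(field.replace('|', ','));
        }
        return fields;
    }

    public static void main(String[] args) {
        String piped = "1001,\"Fitzsimmons| Des Marteau| Beale and Nunn\",109,"
                + "\"George\",\"COD\",\"Standard\",,109,8/14/1998 8:50:02";
        for (String field : splitAndRestore(piped)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

The quotes stay on the quoted fields; strip them afterwards if you only want the text.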
Upvotes: 4
Reputation: 22220
I believe this is going to be very difficult to do with a regular expression. The trouble is that the regular expression would have to count quotes to determine if it's inside two quotes or not.
Actually, the .NET regex engine could do it with its balanced matching feature. But I don't think Java has that feature and I can't think of a reliable way to do it without it.
You may have to write some procedural code to accomplish this.
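As a rough sketch of what that procedural code might look like: since the end goal is a split, the loop can cut the line into fields directly and skip the pipe placeholder altogether. The names QuoteAwareSplitter and splitCsvLine are invented for this example, and it assumes every quote in the line is a field delimiter (or a doubled quote inside a quoted field).

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareSplitter {
    // Splits a line on commas that are outside double quotes.
    // Quotes are kept as part of the field they surround.
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;      // each quote toggles the state
                current.append(c);
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // comma outside quotes ends the field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());    // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String line = "1001,\"Fitzsimmons, Des Marteau, Beale and Nunn\",109,"
                + "\"George\",\"COD\",\"Standard\",,109,8/14/1998 8:50:02";
        for (String field : splitCsvLine(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```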
Upvotes: 1