Reputation: 151
I have a text file with the columns reference, name, address, amount, dateTo, dateFrom and mandatory, in the following format:
"120030125 J Blog 23, SOME HOUSE, 259.44 21-OCT-2013 17-NOV-2013"
" SQUARE, STREET, LEICESTER,"
LE1 2BB
"120030318 R Mxx 37, WOOD CLOSE, BIRMINGHAM, 121.96 16-OCT-2013 17-NOV-2013 Y"
" STREET, NN18 8DF"
"120012174 JE xx 25, SOME HOUSE, QUEENS 259.44 21-OCT-2013 17-NOV-2013"
" SQUARE, STREET, LEICESTER,"
LE1 2BB
"100154992 DL x 23, SOME HOUSE, QUEENS 270.44 21-OCT-2013 17-NOV-2013 Y"
" SQUARE, STREET, LEICESTER,"
LE1 2BC
I am only interested in the first line of each record. From it I want to extract the reference, name, amount, dateTo and dateFrom columns and write them to a CSV file. So far my code only picks out the first line of each record and strips the leading and trailing double quotes. The input file contains runs of whitespace, and so does the output file.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReadTxt {
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader("C:/Users/me/Desktop/input.txt"));
        String pattern = "\"\\d\\d\\d\\d";
        // Create a Pattern object
        Pattern r = Pattern.compile(pattern);
        int i;
        ArrayList<String> list = new ArrayList<String>();
        boolean a = true;
        PrintWriter out = new PrintWriter(new PrintWriter("C:/Users/me/Desktop/Output.txt"), a);
        try {
            String line = br.readLine();
            while (line != null) {
                Matcher m = r.matcher(line);
                if (m.find()) {
                    // Drop the leading and trailing double quote
                    String temp;
                    temp = line.substring(1, line.length() - 1);
                    list.add(temp);
                }
                else {
                    // do nothing
                }
                line = br.readLine();
            }
        }
        finally {
            br.close();
        }
        for (i = 0; i < list.size(); i++) {
            out.println(list.get(i));
        }
        out.flush();
        out.close();
    }
}
The above code will create a text file with the following output:
120030125 J Blog 23, SOME HOUSE, QUEENS 259.44 21-OCT-2013 17-NOV-2013
120030318 R Mxx 37, WOOD CLOSE, BIRMINGHAM, 121.96 16-OCT-2013 17-NOV-2013 Y
120012174 JE xx 25, SOME HOUSE, QUEENS 259.44 21-OCT-2013 17-NOV-2013
100154992 DL x 23, SOME HOUSE, QUEENS 259.44 21-OCT-2013 17-NOV-2013 Y
My expected output is as follows, but written to a CSV file:
120030125 J Blog 259.44 21-OCT-2013 17-NOV-2013
120030318 R Mxx 121.96 16-OCT-2013 17-NOV-2013
120012174 JE xx 259.44 21-OCT-2013 17-NOV-2013
100154992 DL x 259.44 21-OCT-2013 17-NOV-2013
Any suggestions, links to tutorials or other help would be greatly appreciated, as I am not an expert in Java. I did look for tutorials on the internet, but could not find any that were useful in my case.
Upvotes: 1
Views: 6374
Reputation: 3641
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReadTxt {
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader("D:/input.txt"));
        // Matches the opening quote and leading digits of a record's first line
        Pattern r = Pattern.compile("\"\\d\\d\\d\\d");
        ArrayList<String> list = new ArrayList<String>();
        PrintWriter out = new PrintWriter(new PrintWriter("D:/Output.csv"), true);
        try {
            String line = br.readLine();
            while (line != null) {
                // Trim inside the loop: trimming before the null check fails on an empty file
                line = line.trim();
                Matcher m = r.matcher(line);
                if (m.find()) {
                    // Offsets 19 and 51 rely on the fixed column widths of the input file
                    String temp = line.substring(0, 19) + " "
                            + line.substring(51, line.length() - 1);
                    // Collapse runs of spaces and drop any remaining quotes
                    temp = temp.replaceAll("[ ]+", " ").replace("\"", "");
                    String[] array = temp.split("[ ]");
                    // reference, "first last" name, amount, and the two dates
                    temp = array[0] + "," + array[1] + " " + array[2] + ","
                            + array[3] + "," + array[4] + "," + array[5];
                    list.add(temp);
                }
                line = br.readLine();
            }
        } finally {
            br.close();
        }
        for (int i = 0; i < list.size(); i++) {
            out.println(list.get(i));
        }
        out.flush();
        out.close();
    }
}
OUTPUT
120030125,J Blog,259.44,21-OCT-2013,17-NOV-2013
120030318,R Mxx,121.96,16-OCT-2013,17-NOV-2013
120012174,JE xx,259.44,21-OCT-2013,17-NOV-2013
100154992,DL x,270.44,21-OCT-2013,17-NOV-2013
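One thing to watch with the code above: the offsets 19 and 51 only work if every record line uses exactly the same column widths. If that cannot be guaranteed, a regular expression is one possible alternative. The following is only a sketch under assumptions taken from the sample data: the reference is all digits, the name is exactly two tokens, the amount has two decimal places, and the dates look like `21-OCT-2013`. The names `RecordLineParser` and `toCsv` are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordLineParser {
    // Assumed layout: optional quote, digits-only reference, two-token name,
    // an address that is skipped lazily, then amount and two dd-MMM-yyyy dates.
    private static final Pattern RECORD = Pattern.compile(
            "^\"?(\\d+)\\s+(\\S+\\s+\\S+)\\s+.*?\\s(\\d+\\.\\d{2})"
            + "\\s+(\\d{2}-[A-Z]{3}-\\d{4})\\s+(\\d{2}-[A-Z]{3}-\\d{4})");

    /** Returns a CSV row for a record's first line, or null for any other line. */
    static String toCsv(String line) {
        Matcher m = RECORD.matcher(line);
        if (!m.find()) {
            return null; // address continuation lines end up here
        }
        // Normalise whitespace inside the name to a single space
        String name = m.group(2).replaceAll("\\s+", " ");
        return String.join(",", m.group(1), name, m.group(3), m.group(4), m.group(5));
    }
}
```

In the read loop, `toCsv(line)` could replace the `substring` and `split` steps; lines for which it returns `null` would simply be skipped.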
Upvotes: 1
Reputation: 208964
Here, test this out. I just used an array, but you can adapt the relevant parts into your own code. I changed some addresses (look at the 2nd and 3rd entries in the array) so that spaces appear, or are missing, in different places, to test that the split still works.
public class SplitData {
    public static void main(String[] args) {
        String[] array = {"120030125 J Blog 23, SOME HOUSE, QUEENS 259.44 21-OCT-2013 17-NOV-2013",
            "120030318 R Mxx 37,WOODCLOSE,BIRMINGHAM, 121.96 16-OCT-2013 17-NOV-2013 Y 0",
            "120012174 JE xx 25, SOME HOUSE,QUEENS 259.44 21-OCT-2013 17-NOV-2013",
            "100154992 DL x 23, SOME HOUSE, QUEENS 259.44 21-OCT-2013 17-NOV-2013 Y"
        };
        String s1 = null;
        String s2 = null;
        String s3 = null;
        String s4 = null;
        String s5 = null;
        for (String s : array) {
            String[] split = s.split("\\s+");
            s1 = split[0];
            s2 = split[1] + " " + split[2];
            for (String string : split) {
                if (string.matches("\\d+\\.\\d{2}")) {
                    s3 = string;
                    break;
                }
            }
            String[] newArray = s.substring(s.indexOf(s3)).split("\\s+");
            s4 = newArray[1];
            s5 = newArray[2];
            System.out.printf("%s\t%s\t%s\t%s\t%s\n", s1, s2, s3, s4, s5);
        }
    }
}
Output
120030125 J Blog 259.44 21-OCT-2013 17-NOV-2013
120030318 R Mxx 121.96 16-OCT-2013 17-NOV-2013
120012174 JE xx 259.44 21-OCT-2013 17-NOV-2013
100154992 DL x 259.44 21-OCT-2013 17-NOV-2013
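The code above prints tab-separated values to the console, whereas the question asks for a CSV file. A rough sketch of how the same token-based idea might be combined with file reading and writing is shown below. `TokenCsvWriter`, `toCsvRows` and `convertFile` are hypothetical names, and it assumes, as above, that the name is always two tokens and that the amount is the first token after the name with exactly two decimal places.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class TokenCsvWriter {
    /** Turns raw input lines into CSV rows, skipping address continuation lines. */
    static List<String> toCsvRows(List<String> lines) {
        List<String> rows = new ArrayList<>();
        for (String raw : lines) {
            String[] t = raw.replace("\"", "").trim().split("\\s+");
            // A record line needs at least 6 tokens and starts with a digits-only reference
            if (t.length < 6 || !t[0].matches("\\d+")) {
                continue;
            }
            // Start at index 3 (after reference and two name tokens) and look for the amount
            for (int i = 3; i + 2 < t.length; i++) {
                if (t[i].matches("\\d+\\.\\d{2}")) {
                    rows.add(String.join(",", t[0], t[1] + " " + t[2], t[i], t[i + 1], t[i + 2]));
                    break;
                }
            }
        }
        return rows;
    }

    /** Reads the input file, converts it and writes one CSV row per record. */
    static void convertFile(Path input, Path output) throws IOException {
        Files.write(output, toCsvRows(Files.readAllLines(input)));
    }
}
```

It could then be called as `TokenCsvWriter.convertFile(Paths.get("input.txt"), Paths.get("Output.csv"))`, with the paths adjusted to your own files. Note that a name containing a comma would need quoting to remain valid CSV, which this sketch does not handle.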
Upvotes: 1