Luiz

Reputation: 151

How to extract data from a text file and write into CSV file in Java

I have a text file containing reference, name, address, amount, dateTo, dateFrom and mandatory columns, in the following format:

"120030125 J Blog  23, SOME HOUSE,                 259.44  21-OCT-2013  17-NOV-2013"
"                  SQUARE, STREET, LEICESTER,"
                   LE1 2BB

"120030318 R Mxx   37, WOOD CLOSE, BIRMINGHAM,     121.96  16-OCT-2013  17-NOV-2013  Y"                      
"                  STREET, NN18 8DF"

"120012174 JE xx   25, SOME HOUSE, QUEENS          259.44  21-OCT-2013  17-NOV-2013"
"                  SQUARE, STREET, LEICESTER,"
                   LE1 2BB

"100154992 DL x    23, SOME HOUSE, QUEENS          270.44  21-OCT-2013  17-NOV-2013  Y"             
"                  SQUARE, STREET, LEICESTER,"
                   LE1 2BC

I am only interested in the first line of each record. I want to extract the data in the reference, name, amount, dateTo and dateFrom columns and write it to a CSV file. So far, my code only extracts the first line of each record and strips the leading and trailing double quotes. The input file contains runs of whitespace, and so does the output file.

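import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReadTxt {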
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader("C:/Users/me/Desktop/input.txt"));
        String pattern = "\"\\d\\d\\d\\d";

        // Create a Pattern object
        Pattern r = Pattern.compile(pattern);
        int i;
        ArrayList<String> list = new ArrayList<String>();

        boolean a = true;
        PrintWriter out = new PrintWriter(new PrintWriter("C:/Users/me/Desktop/Output.txt"), a);

        try {
            String line = br.readLine();

            while (line != null) {
                Matcher m = r.matcher(line);

                if (m.find()) {
                    String temp;
                    temp = line.substring(1, line.length() - 1);
                    list.add(temp);
                }
                else {
                // do nothing
                }

                line = br.readLine();
            }
        }
        finally {
            br.close();
        }

        for (i = 0; i < list.size(); i++) {
            out.println(list.get(i));
        }

        out.flush();
        out.close();
    }
}

The above code will create a text file with the following output:

120030125  J Blog   23, SOME HOUSE, QUEENS       259.44  21-OCT-2013  17-NOV-2013
120030318  R Mxx    37, WOOD CLOSE, BIRMINGHAM,  121.96  16-OCT-2013  17-NOV-2013  Y                      
120012174  JE xx    25, SOME HOUSE, QUEENS       259.44  21-OCT-2013  17-NOV-2013
100154992  DL x     23, SOME HOUSE, QUEENS       259.44  21-OCT-2013  17-NOV-2013  Y

My expected output is as follows, but in a CSV file:

120030125  J Blog  259.44  21-OCT-2013  17-NOV-2013
120030318  R Mxx   121.96  16-OCT-2013  17-NOV-2013                        
120012174  JE xx   259.44  21-OCT-2013  17-NOV-2013
100154992  DL x    259.44  21-OCT-2013  17-NOV-2013  

Any suggestions, links to tutorials or other help would be greatly appreciated, as I am not an expert in Java. I looked for tutorials online but could not find any that applied to my case.

Upvotes: 1

Views: 6374

Answers (2)

Adarsh

Reputation: 3641

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class ReadTxt {
  public static void main (String[] args) throws IOException {
    // A record's first line starts with a double quote followed by digits
    Pattern r = Pattern.compile ("\"\\d\\d\\d\\d");
    List<String> list = new ArrayList<String> ();

    try (BufferedReader br = new BufferedReader (new FileReader ("D:/input.txt"))) {
      String line;
      while ((line = br.readLine ()) != null) {
        line = line.trim (); // trim every line, not only the first one
        if (r.matcher (line).find ()) {
          // Fixed offsets taken from the sample layout: reference and name
          // end at column 19 and the amount starts at column 51. The address
          // in between and the closing quote are dropped.
          String temp = line.substring (0, 19) + " "
                + line.substring (51, line.length () - 1);
          temp = temp.replaceAll ("[ ]+", " ").replace ("\"", "");
          String[] array = temp.split ("[ ]");
          // array[1] and array[2] together form the name
          temp = array[0] + "," + array[1] + " " + array[2] + ","
                + array[3] + "," + array[4] + "," + array[5];
          list.add (temp);
        }
      }
    }

    // try-with-resources flushes and closes the writer
    try (PrintWriter out = new PrintWriter ("D:/Output.csv")) {
      for (String row : list) {
        out.println (row);
      }
    }
  }
}

OUTPUT

120030125,J Blog,259.44,21-OCT-2013,17-NOV-2013
120030318,R Mxx,121.96,16-OCT-2013,17-NOV-2013
120012174,JE xx,259.44,21-OCT-2013,17-NOV-2013
100154992,DL x,270.44,21-OCT-2013,17-NOV-2013
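
The fixed offsets 19 and 51 above only work while every row has exactly the column widths of the sample. As an alternative, here is a minimal sketch that pulls the fields out with a regular expression instead. The class name `RecordLineParser`, the method `toCsv` and the pattern itself are hypothetical and inferred only from the sample rows in the question; in particular, the sketch assumes the name is separated from the address by at least two spaces and that dates always look like `21-OCT-2013`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecordLineParser {
    // Assumed layout (inferred from the sample rows only): optional quote,
    // numeric reference, name ending at a run of 2+ spaces, address (skipped),
    // amount with two decimals, then two DD-MMM-YYYY dates.
    private static final Pattern RECORD = Pattern.compile(
            "^\"?(\\d+)\\s+(.+?)\\s{2,}.*?(\\d+\\.\\d{2})\\s+"
            + "(\\d{2}-[A-Z]{3}-\\d{4})\\s+(\\d{2}-[A-Z]{3}-\\d{4})");

    /** Returns a CSV row for a record's first line, or empty for any other line. */
    static Optional<String> toCsv(String line) {
        Matcher m = RECORD.matcher(line);
        if (!m.find()) {
            return Optional.empty(); // continuation line, postcode line or blank
        }
        return Optional.of(String.join(",",
                m.group(1), m.group(2), m.group(3), m.group(4), m.group(5)));
    }

    public static void main(String[] args) {
        String[] sample = {
            "\"120030318 R Mxx   37, WOOD CLOSE, BIRMINGHAM,     121.96  16-OCT-2013  17-NOV-2013  Y\"",
            "\"                  STREET, NN18 8DF\""
        };
        for (String line : sample) {
            // Only the first sample line matches, so only one row is printed
            toCsv(line).ifPresent(System.out::println);
        }
    }
}
```

Because the address is skipped by the pattern rather than by position, rows with longer or shorter addresses should still parse. The sketch does not quote fields, so a name that itself contains a comma would need CSV quoting before being written.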

Upvotes: 1

Paul Samsotha

Reputation: 208964

Here, test this out. I just used an array, but you can adapt the relevant code to yours. I changed some addresses (see the 2nd and 3rd entries in the array) so that they have spaces, or no spaces, in different places to test.

public class SplitData {

    public static void main(String[] args) {
        String[] array = {"120030125  J Blog   23, SOME HOUSE, QUEENS       259.44  21-OCT-2013  17-NOV-2013",
            "120030318  R Mxx    37,WOODCLOSE,BIRMINGHAM,  121.96  16-OCT-2013  17-NOV-2013  Y 0",
            "120012174  JE xx    25, SOME HOUSE,QUEENS       259.44  21-OCT-2013  17-NOV-2013",
            "100154992  DL x     23, SOME HOUSE, QUEENS       259.44  21-OCT-2013  17-NOV-2013  Y"  
        };

        String s1 = null;
        String s2 = null;
        String s3 = null;
        String s4 = null;
        String s5 = null;
        for (String s : array) {
            String[] split = s.split("\\s+");
            s1 = split[0];
            s2 = split[1] + " " + split[2];
            for (String string: split) {
                if (string.matches("\\d+\\.\\d{2}")) {
                    s3 = string;
                    break;
                }
            }
            String[] newArray = s.substring(s.indexOf(s3)).split("\\s+");
            s4 = newArray[1];
            s5 = newArray[2];

            System.out.printf("%s\t%s\t%s\t%s\t%s\n", s1, s2, s3, s4, s5);
        }
    }  
}

Output

120030125   J Blog  259.44  21-OCT-2013 17-NOV-2013
120030318   R Mxx   121.96  16-OCT-2013 17-NOV-2013
120012174   JE xx   259.44  21-OCT-2013 17-NOV-2013
100154992   DL x    259.44  21-OCT-2013 17-NOV-2013
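
To go from the array demo to the file-to-CSV task in the question, the same whitespace-splitting idea can be wrapped in file I/O. The following is a rough sketch, not a definitive implementation: the class `TxtToCsv` and the methods `toCsvRow` and `convert` are hypothetical names, the input and output paths are taken from the command line, and the name is assumed to be exactly two tokens, as in the code above. It takes the dates by token index rather than via `indexOf`, so a line without an amount is skipped instead of reusing the previous row's value.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TxtToCsv {
    /** Returns "reference,name,amount,date,date", or null for non-record lines. */
    static String toCsvRow(String line) {
        // Drop the surrounding quotes, then split on runs of whitespace
        String[] t = line.replace("\"", "").trim().split("\\s+");
        if (t.length < 6 || !t[0].matches("\\d+")) {
            return null; // continuation line, postcode line or blank line
        }
        // Tokens 1 and 2 are assumed to be the name; the amount is the first
        // later token shaped like "digits.dd", followed by the two dates.
        for (int i = 3; i + 2 < t.length; i++) {
            if (t[i].matches("\\d+\\.\\d{2}")) {
                return String.join(",",
                        t[0], t[1] + " " + t[2], t[i], t[i + 1], t[i + 2]);
            }
        }
        return null;
    }

    /** Reads the text file line by line and writes one CSV row per record. */
    static void convert(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             PrintWriter writer = new PrintWriter(
                     Files.newBufferedWriter(out, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String row = toCsvRow(line);
                if (row != null) {
                    writer.println(row);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java TxtToCsv <input.txt> <output.csv>");
            return;
        }
        convert(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Both streams are opened in a try-with-resources statement, so they are flushed and closed even if reading fails partway through. The UTF-8 charset is an assumption; pass a different one if the input file uses another encoding.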

Upvotes: 1
