Defcon

Reputation: 827

Combined Xml String Split Java

I am trying to split a combined text file that contains multiple XML documents. I want to split on <?xml version='1.0'?>, which marks the start of every XML document inside the combined file. I am not sure what the best way to do this is. What I originally had did not split correctly.

Updated, working code (fixed the quotes-within-quotes problem and added Pattern.quote):

import java.io.File;
import java.io.FileWriter;
import java.util.Scanner;
import java.util.regex.Pattern;

String combinedText;
// try-with-resources closes the scanner even if reading fails
try (Scanner scanner = new Scanner(new File("src/main/resources/Flume_Sample"), "UTF-8")) {
    combinedText = scanner.useDelimiter("\\A").next(); // "\\A" reads the whole file as one token
}
String delimiter = "<?xml version=\"1.0\"?>";
// The lookahead splits before each declaration, so every piece keeps its own header
String[] xmlFiles = combinedText.split("(?=" + Pattern.quote(delimiter) + ")");

for (int i = 0; i < xmlFiles.length; i++) {
    File file = new File("src/main/resources/output_" + i);
    try (FileWriter writer = new FileWriter(file)) {
        writer.write(xmlFiles[i]);
    }
    System.out.println(xmlFiles[i]);
}

Upvotes: 1

Views: 568

Answers (3)

adriman2

Reputation: 21

I would use something like this if you want to parse the data manually:

    public static void parseFile(File file) {
        if (file == null) {
            return;
        }
        int counter = 0; // intended to number the output files

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String s;
            while ((s = br.readLine()) != null) {
                if (s.contains("<?xml version='1.0'?>")) {
                    // Start writing to a new file here, e.g. with a StringBuffer and a FileWriter.
                }
            }
        } catch (IOException e) {
            System.out.println(e);
        }
    }

Upvotes: 0

GdR

Reputation: 313

Also be aware that this approach loads the entire input file into memory. A streamed approach would perform better if the input file is large.
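
As a rough illustration of the streamed approach, here is a minimal sketch that copies the input line by line and opens a new output file whenever a line starts with the XML declaration, so only one line is held in memory at a time. It assumes every declaration sits at the start of its own line, which the question does not confirm. The class name StreamingXmlSplitter, the split method, and the output_N file naming are hypothetical choices for this example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class StreamingXmlSplitter {
    // Assumption: the declaration uses double quotes, as in the question's updated code.
    static final String DECLARATION = "<?xml version=\"1.0\"?>";

    /**
     * Copies the input line by line, starting a new file output_0, output_1, ...
     * in outputDir at every line that begins with the declaration.
     * Returns the number of files written.
     */
    static int split(Path input, Path outputDir) throws IOException {
        int count = 0;
        BufferedWriter out = null;
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith(DECLARATION)) {
                    if (out != null) {
                        out.close(); // finish the previous document
                    }
                    out = Files.newBufferedWriter(
                            outputDir.resolve("output_" + count++), StandardCharsets.UTF_8);
                }
                if (out != null) { // lines before the first declaration are skipped
                    out.write(line);
                    out.newLine();
                }
            }
        } finally {
            if (out != null) {
                out.close();
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: StreamingXmlSplitter <combined-file> <output-dir>");
            return;
        }
        int written = split(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(written + " files written");
    }
}
```

If a declaration can appear in the middle of a line, this sketch would miss it, and a character-level scan would be needed instead.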

Upvotes: 0

Arnaud

Reputation: 17534

The split method takes a regular expression string, so you need to escape your delimiter String so that it is matched literally:

String[] xmlFiles = combinedText.split(Pattern.quote(delimiter));

See the Pattern.quote method.
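
To make the difference concrete, here is a small sketch comparing three ways of splitting on the declaration: unescaped, quoted, and quoted inside a lookahead (the form used in the question's updated code). The class name SplitDemo, its method names, and the sample text are made up for this illustration, and the lookahead behaviour described assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    static final String DELIMITER = "<?xml version=\"1.0\"?>";

    // Unescaped: '?' and '.' are regex metacharacters, so the pattern
    // does not match the literal declaration.
    static String[] splitRaw(String text) {
        return text.split(DELIMITER);
    }

    // Quoted: the delimiter is matched literally and removed from the pieces.
    static String[] splitQuoted(String text) {
        return text.split(Pattern.quote(DELIMITER));
    }

    // Quoted inside a lookahead: the split is zero-width, so each piece
    // keeps its own declaration.
    static String[] splitKeeping(String text) {
        return text.split("(?=" + Pattern.quote(DELIMITER) + ")");
    }

    public static void main(String[] args) {
        String text = DELIMITER + "<a/>" + DELIMITER + "<b/>";
        System.out.println(Arrays.toString(splitRaw(text)));
        System.out.println(Arrays.toString(splitQuoted(text)));
        System.out.println(Arrays.toString(splitKeeping(text)));
    }
}
```

With this sample text, the unescaped version returns the input as a single piece, the quoted version returns an empty leading element followed by the two bodies without their declarations, and the lookahead version returns two pieces that each start with the declaration. The lookahead form is usually what you want when each piece is written out as a standalone XML file.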

Upvotes: 3
