Reputation: 827
I am trying to split a combined text file that contains multiple XML documents. I want to split on <?xml version='1.0'?>,
which marks the start of each XML document inside the combined file. I am not sure what the best way to do this is. This is what I currently have, which does not split correctly.
Updated, working code (fixed the quotes-within-quotes problem and added Pattern.quote):
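// Requires java.io.File, java.io.FileWriter, java.util.Scanner, java.util.regex.Pattern
String combinedText;
// try-with-resources closes the scanner even if reading fails
try (Scanner scanner = new Scanner(new File("src/main/resources/Flume_Sample"), "UTF-8")) {
    // "\\A" matches only the start of input, so next() returns the whole file
    combinedText = scanner.useDelimiter("\\A").next();
}
String delimiter = "<?xml version=\"1.0\"?>";
// The lookahead (?=...) splits before each delimiter without consuming it
String[] xmlFiles = combinedText.split("(?=" + Pattern.quote(delimiter) + ")");
for (int i = 0; i < xmlFiles.length; i++) {
    File file = new File("src/main/resources/output_" + i);
    try (FileWriter writer = new FileWriter(file)) {
        writer.write(xmlFiles[i]);
    }
    System.out.println(xmlFiles[i]);
}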
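To check the lookahead split in isolation, here is a minimal sketch that runs it on an in-memory string instead of a file. The class and method names (LookaheadSplitDemo, splitKeepingDelimiter) are made up for illustration, and it assumes Java 8 or later, where a zero-width match at the start of the input does not produce an empty leading element.

```java
import java.util.regex.Pattern;

public class LookaheadSplitDemo {
    // Splits text before every occurrence of delimiter, so each piece
    // still starts with the delimiter. Hypothetical helper for illustration.
    static String[] splitKeepingDelimiter(String text, String delimiter) {
        return text.split("(?=" + Pattern.quote(delimiter) + ")");
    }

    public static void main(String[] args) {
        String delimiter = "<?xml version=\"1.0\"?>";
        String combined = delimiter + "<a/>" + delimiter + "<b/>";
        // Each printed piece begins with the XML declaration
        for (String part : splitKeepingDelimiter(combined, delimiter)) {
            System.out.println(part);
        }
    }
}
```

Because the declaration stays at the start of each piece, every output file should still be a complete XML document.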
Upvotes: 1
Views: 568
Reputation: 21
I would use something like this if you want to parse the data manually.
// Requires java.io.BufferedReader, java.io.File, java.io.FileReader, java.io.IOException
public static void parseFile(File file) {
    if (file == null) {
        return;
    }
    int counter = 0; // intended as the index of the current output file
    // try-with-resources closes the reader even if an IOException occurs
    try (BufferedReader br = new BufferedReader(new FileReader(file))) {
        String s;
        while ((s = br.readLine()) != null) {
            if (s.contains("<?xml version='1.0'?>")) {
                // A new document starts here: write it to a new file with a StringBuffer and a FileWriter.
            }
        }
    } catch (IOException e) {
        System.out.println(e);
    }
}
Upvotes: 0
Reputation: 313
Also be aware that you will load the entire input file into memory if you proceed this way. A streaming approach would perform better if the input file is large.
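As a rough sketch of what a streaming version could look like: read the input line by line and switch to a new output file whenever a line starts with the XML declaration. This assumes every declaration begins on its own line, which may not hold for your data. The class name StreamingXmlSplitter and the split method are hypothetical, and the output_N naming is borrowed from the question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StreamingXmlSplitter {
    // Copies input line by line, opening a new output file each time a line
    // starts with the declaration. Returns the number of files written.
    // Assumption: each declaration begins at the start of a line.
    static int split(Path input, Path outputDir, String declaration) throws IOException {
        int count = 0;
        BufferedWriter writer = null;
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Also open a file for any text that precedes the first declaration
                if (writer == null || line.startsWith(declaration)) {
                    if (writer != null) {
                        writer.close();
                    }
                    writer = Files.newBufferedWriter(
                            outputDir.resolve("output_" + count), StandardCharsets.UTF_8);
                    count++;
                }
                writer.write(line);
                writer.newLine(); // line endings become the platform default
            }
        } finally {
            if (writer != null) {
                writer.close();
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: StreamingXmlSplitter <input file> <output dir>");
            return;
        }
        int n = split(Paths.get(args[0]), Paths.get(args[1]), "<?xml version=\"1.0\"?>");
        System.out.println("Wrote " + n + " files");
    }
}
```

Only one line is held in memory at a time, so the size of the input file no longer matters much. If several documents can share a line, you would need to scan for the declaration within each line instead.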
Upvotes: 0
Reputation: 17534
The split method takes a regular expression, so you may want to escape your delimiter string
so that it is matched literally:
String[] xmlFiles = combinedText.split(Pattern.quote(delimiter));
See the Pattern.quote method.
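To see why the quoting matters, here is a small sketch comparing the two calls on a sample string. The class and helper names are made up for illustration. Unquoted, the "?" and "." characters in the delimiter are treated as regex operators, so the pattern no longer matches the literal declaration.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Treats the delimiter as a regular expression (what plain split does)
    static String[] splitAsRegex(String text, String delimiter) {
        return text.split(delimiter);
    }

    // Treats the delimiter as literal text
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String delimiter = "<?xml version=\"1.0\"?>";
        String text = "first" + delimiter + "second";

        // "<?" means "optional <" and "\"?" means "optional quote", so the
        // regex never matches the literal "?" characters: nothing is split
        System.out.println(splitAsRegex(text, delimiter).length); // prints 1

        // Quoted, the delimiter matches itself exactly
        System.out.println(splitLiteral(text, delimiter).length); // prints 2
    }
}
```

Note that this form consumes the delimiter, so the declaration is dropped from each piece. Wrap the quoted delimiter in a lookahead, as in the question's updated code, if you want to keep it.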
Upvotes: 3