Reputation: 365
I am currently using BULK INSERT in SQL to import a pipe (|) delimited file, which is pretty straightforward. The problem I am running into is that some records contain a pipe (|) within a field, which causes BULK INSERT to break that field into two. Below is an example:
12343|First Name|Last Name| Address field|Location
63494|Second First Name|Second Last Name| Address Field with | in it |location
In my example above, BULK INSERT will split the address field of the second record because it contains a |. Are there any suggestions out there for avoiding problems like this?
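For illustration, a quick field-count check (a minimal Java sketch using the two sample lines above; the class name is made up) shows exactly how the stray pipe shifts the columns:

```java
public class PipeSplitDemo {
    public static void main(String[] args) {
        String good = "12343|First Name|Last Name| Address field|Location";
        String bad  = "63494|Second First Name|Second Last Name| Address Field with | in it |location";

        // The -1 limit keeps trailing empty fields instead of discarding them
        System.out.println(good.split("\\|", -1).length); // 5 fields, as intended
        System.out.println(bad.split("\\|", -1).length);  // 7 fields: the address spilled into two extra columns
    }
}
```

Any delimiter-based importer sees the same thing: the second record simply has more columns than the layout expects.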
Thanks
Upvotes: 0
Views: 1613
Reputation: 664
I've had the same problem before, and from my experience there's not much you can do during the import. Obviously, if you have any control over the export process at the source, you can handle the cleansing of the data at that point, but most likely that is not your case. One thing you can do, at least to prevent failures during the import, is to validate your input file before the bulk insert, as I did with simple code like this:
import java.io.BufferedReader;
import java.io.FileReader;

public class ValidateMigrationFile {
    // A valid line has exactly 50 pipes, i.e. 51 fields; adjust to your column count
    private static final String REGEX = "^([^|]*\\|){50}[^|]*$";

    public static void testFile(String fileName) {
        int lineCounter = 1;
        int totalErrors = 0;
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            String line;
            while ((line = br.readLine()) != null) {
                // Validate the line is formatted correctly based on the regular expression
                if (!line.matches(REGEX)) {
                    System.out.println("Invalid format on line " + lineCounter + " (" + line + ")");
                    totalErrors++;
                }
                lineCounter++;
            }
            System.out.println("Total rows processed: " + --lineCounter);
            System.out.println("Total errors found: " + totalErrors);
        } catch (Exception ex) {
            System.out.println("Exception occurred: " + ex.toString());
        }
    }
}
This way you can detect in advance whether your file is well formed, and pinpoint exactly which lines have the problem.
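Applied to the question's five-field layout, the same regex idea needs `{4}` repetitions (four pipes) instead of `{50}` — a sketch with that assumed column count flags the bad record:

```java
public class RegexCheckDemo {
    public static void main(String[] args) {
        // 5 fields => exactly 4 pipes per line; change {4} to match your layout
        String regex = "^([^|]*\\|){4}[^|]*$";

        String good = "12343|First Name|Last Name| Address field|Location";
        String bad  = "63494|Second First Name|Second Last Name| Address Field with | in it |location";

        System.out.println(good.matches(regex)); // true  - correct field count
        System.out.println(bad.matches(regex));  // false - extra pipe in the address field
    }
}
```

Note this only detects the bad rows; deciding what to do with them (reject, fix by hand, or re-export with a safer delimiter or quoting) is still up to you.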
Upvotes: 2