Reputation: 41
I have a text file that I need to read row by row. The file contains about 1330 rows. I need to read each row (a String) and split it into substrings, which will then be inserted into a database.
Each row is approximately 2750 characters long. One option is to split it with the 'substring(start, end)' method. However, because each line has 2750 characters, the number of resulting substrings would be large, around 200 - 225 (I have a mapping that specifies which character range holds which value in the XML).
Can someone suggest another technique for splitting these strings?
Upvotes: 0
Views: 442
Reputation: 14125
You can use the split() method of the String class, but for that the string must contain a delimiter such as a comma or a dash. split() then breaks the string at each occurrence of that delimiter.
String str = "one-two-three";
/* delimiter (split() interprets its argument as a regular expression) */
String delimiter = "-";
/* the string is split at each occurrence of the delimiter */
String[] temp = str.split(delimiter);
/* temp is now {"one", "two", "three"} */
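Because split() treats its argument as a regular expression, a delimiter such as '|' or '.' has to be quoted first. The following is a minimal sketch of that; the class name DelimiterSplitDemo and the helper splitLiteral are made up for illustration and are not part of the answer above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class DelimiterSplitDemo {
    // Splits on a literal delimiter. Pattern.quote stops characters such as
    // '|' or '.' from being interpreted as regular-expression syntax.
    static String[] splitLiteral(String line, String delimiter) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        return line.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("one-two-three", "-")));
        System.out.println(Arrays.toString(splitLiteral("one|two||", "|")));
    }
}
```

Without the -1 limit, split() silently drops trailing empty strings, which matters if empty columns at the end of a row are meaningful.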
Upvotes: 0
Reputation: 4323
Since you already have the start/end positions defined and don't seem to need to parse the string at all, the substring call is probably the fastest way. Locating each field is just an index into the underlying character array, so the lookup is probably O(1). Java may then copy out the characters of each substring, but that has to happen in some form anyway, and it is only O(n) in total across all substrings as long as they don't overlap.
substring doesn't change the underlying string; it just copies out the portion you ask for on each call (if it even does that: it would be theoretically possible for it to return a kind of String that wraps the original string). Unless you have identified an actual performance problem, the simplest solution is the best one.
If you had to split on, for example, commas, I'd use a CSVReader library.
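As a rough sketch of the substring approach for fixed-width rows: keep the 200-odd start/end offsets in a table and loop over it, rather than writing 200 separate substring calls. The class FixedWidthSplitter, the Field type, and the sample layout below are all hypothetical; the real offsets would come from the asker's mapping.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FixedWidthSplitter {
    /** Hypothetical field definition: name, start offset (inclusive), end offset (exclusive). */
    static final class Field {
        final String name;
        final int start;
        final int end;

        Field(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    /** Cuts one line into named fields using substring(start, end). */
    static Map<String, String> split(String line, List<Field> layout) {
        Map<String, String> values = new LinkedHashMap<>();
        for (Field f : layout) {
            // Clamp to the line length so a short line yields empty fields instead of throwing.
            int start = Math.min(f.start, line.length());
            int end = Math.min(f.end, line.length());
            values.put(f.name, line.substring(start, end).trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // Made-up three-field layout; a real one would have 200+ entries.
        List<Field> layout = Arrays.asList(
                new Field("id", 0, 4),
                new Field("name", 4, 14),
                new Field("state", 14, 16));
        String line = String.format("%-4s%-10s%-2s", "0001", "John", "NY");
        System.out.println(split(line, layout));
    }
}
```

The layout list could equally be loaded from a configuration file, so that changing a column width does not require touching the code.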
Upvotes: 0
Reputation: 272217
I suspect that given your numbers, your initial approach would be well within any standard JVM memory constraints.
As ever, premature optimisation is the root of all evil. I would try a simple split, and look to refine it if you have issues. I suspect at 200 strings over a line of 2700 chars that you won't have problems.
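Note that in older JVMs (before Java 7 update 6) the String object implements a kind of flyweight pattern. That is, substring() doesn't replicate the characters but merely returns a window onto the original String's data (char array), so an implementation using substring() uses very little extra memory. Newer JVMs copy the characters for each substring instead, but for lines of this size the extra memory is still negligible (for what it's worth).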
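To put numbers on this, here is a back-of-the-envelope estimate using the figures from the question (1330 rows of about 2750 characters). It assumes 2 bytes per char and ignores per-object overhead, so treat it as an order-of-magnitude sketch only; MemoryEstimate and approxBytes are illustrative names.

```java
class MemoryEstimate {
    // Rough bound: bytes needed to hold every row's characters at once,
    // assuming 2 bytes per char and ignoring object headers.
    static long approxBytes(int rows, int charsPerRow) {
        return (long) rows * charsPerRow * 2L;
    }

    public static void main(String[] args) {
        long bytes = approxBytes(1330, 2750);
        System.out.println(bytes + " bytes"); // 7315000 bytes, roughly 7 MB
    }
}
```

Even if every row were kept in memory together with a full copy of its substrings, the total would be on the order of 15 MB. If the file is processed one row at a time, only a single row's worth is live at any moment.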
Upvotes: 3