Reputation: 135
I have a small problem: I'm looking for a better way to split strings. For example, I receive a string that looks like this:
0000JHASDF+4429901234ALEXANDER
I know the pattern the string is built with, and I have an array of field lengths like this:
4,6,4,7,9
0000 - JHASDF - +442 - 9901234 - ALEXANDER
It is easy to split the whole thing up with the Mid string function, but it seems to be slow when I receive a file containing 8,000 to 10,000 records. Any suggestions on how I can make this faster and get the data into a List or an array of strings? For example, does anyone know how to do this with Regex?
Upvotes: 9
Views: 30438
Reputation: 269558
var lengths = new[] { 4, 6, 4, 7, 9 };
var parts = new string[lengths.Length];

// if you're not using .NET4 or above then use ReadAllLines rather than ReadLines
foreach (string line in File.ReadLines("YourFile.txt"))
{
    int startPos = 0;
    for (int i = 0; i < lengths.Length; i++)
    {
        parts[i] = line.Substring(startPos, lengths[i]);
        startPos += lengths[i];
    }

    // do something with "parts" before moving on to the next line
}
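If the goal is to keep every row in a List, as the question asks, note that the loop above reuses a single parts array, so each row needs its own array before it is stored. A minimal sketch, assuming a hypothetical helper class FixedWidthLines that is not part of the original answer:

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; the names are assumptions.
static class FixedWidthLines
{
    public static List<string[]> SplitAll(IEnumerable<string> lines, int[] lengths)
    {
        var rows = new List<string[]>();
        foreach (string line in lines)
        {
            // Allocate a fresh array per line; a shared array would make
            // every list entry refer to the same (last) set of fields.
            var parts = new string[lengths.Length];
            int startPos = 0;
            for (int i = 0; i < lengths.Length; i++)
            {
                parts[i] = line.Substring(startPos, lengths[i]);
                startPos += lengths[i];
            }
            rows.Add(parts);
        }
        return rows;
    }
}
```

It could be called as FixedWidthLines.SplitAll(File.ReadLines("YourFile.txt"), new[] { 4, 6, 4, 7, 9 }). Substring throws an ArgumentOutOfRangeException for lines shorter than the summed lengths, so malformed lines would need their own handling.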
Upvotes: 12
Reputation: 4509
I know this is late, but in the Microsoft.VisualBasic.FileIO namespace you can find the TextFieldParser class, which would do a better job of handling your issue. Here is a link to MSDN, https://msdn.microsoft.com/en-us/library/zezabash.aspx, with an explanation. The code is in VB, but you can easily convert it to C#. You will also need to add a reference to the Microsoft.VisualBasic assembly, which contains the Microsoft.VisualBasic.FileIO namespace. Hope this helps anyone stumbling on this question in the future.
Here is what it would look like in VB for the questioner's issue:
Using Reader As New Microsoft.VisualBasic.FileIO.
        TextFieldParser("C:\TestFolder\test.log")

    Reader.TextFieldType =
        Microsoft.VisualBasic.FileIO.FieldType.FixedWidth
    Reader.SetFieldWidths(4, 6, 4, 7, 9)
    Dim currentRow As String()
    While Not Reader.EndOfData
        Try
            currentRow = Reader.ReadFields()
            Dim currentField As String
            For Each currentField In currentRow
                MsgBox(currentField)
            Next
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MsgBox("Line " & ex.Message &
                " is not valid and will be skipped.")
        End Try
    End While
End Using
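For readers working in C#, the same approach might look roughly like the sketch below. The FixedWidthFile class and ReadRows method are hypothetical names chosen for illustration; TextFieldParser, FieldType.FixedWidth, SetFieldWidths, ReadFields and MalformedLineException are the same members the VB sample uses.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.VisualBasic.FileIO;

// Hypothetical wrapper for illustration; requires a reference to Microsoft.VisualBasic.
static class FixedWidthFile
{
    public static List<string[]> ReadRows(string path, params int[] widths)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(path))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(widths);
            while (!parser.EndOfData)
            {
                try
                {
                    // ReadFields returns one string per configured width.
                    rows.Add(parser.ReadFields());
                }
                catch (MalformedLineException ex)
                {
                    // The line did not fit the widths; report it and continue.
                    Console.Error.WriteLine(ex.Message + " The line will be skipped.");
                }
            }
        }
        return rows;
    }
}
```

A call such as FixedWidthFile.ReadRows(@"C:\TestFolder\test.log", 4, 6, 4, 7, 9) would return one string array per line. By default the parser trims whitespace from each field (its TrimWhiteSpace property), which may or may not be wanted for padded fixed-width data.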
Upvotes: 1
Reputation: 1481
Regex.Split would be a possibility, but since the string has no specific delimiter, I doubt it would be of any use, and it is unlikely to be any faster.
String.Substring is also a possibility. You use it like: var myFirstString = fullString.Substring(0, 4)
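For completeness, a regex can handle fixed widths through capture groups instead of Regex.Split. The following is only a sketch under that assumption (FixedWidthRegex is a hypothetical helper name), and it makes no claim to be faster than Substring; measuring on real data would be needed to tell.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are assumptions.
static class FixedWidthRegex
{
    // Builds a pattern such as ^(.{4})(.{6})(.{4})(.{7})(.{9})$ from the field widths.
    public static Regex Build(int[] lengths)
    {
        string pattern = "^" + string.Concat(lengths.Select(n => "(.{" + n + "})")) + "$";
        return new Regex(pattern, RegexOptions.Compiled);
    }

    // Returns the captured fields, or null when the line does not match the expected length.
    public static string[] Split(Regex regex, string line)
    {
        Match m = regex.Match(line);
        if (!m.Success) return null;
        // Group 0 is the whole match, so the fields start at group 1.
        return m.Groups.Cast<Group>().Skip(1).Select(g => g.Value).ToArray();
    }
}
```

The Regex would be built once with FixedWidthRegex.Build(new[] { 4, 6, 4, 7, 9 }) and then reused for every line, since constructing it per line would add avoidable overhead.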
Upvotes: 1
Reputation: 4291
As the Mid() function is VB, you could simply try
string.Substring(0, 4);
and so on.
Upvotes: 1
Reputation: 17964
Isn't Mid a VB method?
// "input" stands for the received string
string firstPart = input.Substring(0, 4);
string secondPart = input.Substring(4, 6);
string thirdPart = input.Substring(10, 4);
//...
Upvotes: 6
Reputation: 108830
Perhaps something like this:
string[] SplitString(string s, int[] parts)
{
    string[] result = new string[parts.Length];
    int start = 0;
    for (int i = 0; i < parts.Length; i++)
    {
        int len = parts[i];
        result[i] = s.Substring(start, len);
        start += len;
    }
    if (start != s.Length)
        throw new ArgumentException("String length doesn't match sum of part lengths");
    return result;
}
(I didn't compile it, so it probably contains some minor errors)
Upvotes: 3