Reputation: 1769
I have a file that I read line by line, and I need to extract particular fields from each line.
For example, a line can be in one of two formats:
VA001748714600006640126132202STRONG 4P 4X44G000099
VA 00174 871460000664 012 6132202 STRONG 4P 4X44G 000099
I need to extract the fields and store them in my table. Both lines above should produce the following fields (desired result):
Code Location SerialNo Quantity ItemNo Description Price
VA 00174 871460000664 12 6132202 STRONG 4P 4X44G0 000099
What I have tried: I created a method that extracts the fields from the string and returns an object[]:
public static object[] ProcessLine(string line)
{
    var obj = new object[7];
    var str = line.Replace("\0", "").Replace(" ", "");
    string code = str.Substring(0, 2)?.Trim();
    string location = str.Substring(2, 5)?.Trim();
    string serialNo = str.Substring(7, 12)?.Trim();
    string quantity = str.Substring(19, 3)?.Trim();
    int qty = 0;
    if (!string.IsNullOrEmpty(quantity))
    {
        qty = Convert.ToInt32(quantity);
    }
    string itemNo = str.Substring(22, 7)?.Trim();
    Regex MyRegex = new Regex("[^a-z ]", RegexOptions.IgnoreCase);
    string description = MyRegex.Replace(line.Substring(2), @"")?.Trim();
    string price = str.Substring(str.Length - 6)?.Trim();
    obj.SetValue(code, 0);
    obj.SetValue(location, 1);
    obj.SetValue(serialNo, 2);
    obj.SetValue(qty, 3);
    obj.SetValue(itemNo, 4);
    obj.SetValue(description, 5);
    obj.SetValue(price, 6);
    return obj;
}
I extract the substrings and store them in the object array, but I can't extract Description because its length is not fixed.
Code, Location, SerialNo, Quantity, ItemNo and Price have a fixed number of characters, while Description can contain any characters and varies in length.
How can I extract these field values, including the description, using a regex?
My attempt to extract the description works, but it drops the digits.
Upvotes: 1
Views: 222
Reputation: 626920
You may declare a regex like
private static readonly Regex rx = new Regex(@"^(\w{2})\s*(\w{5})\s*(\w{12})\s*(\d{3})\s*(\d{7})\s*(.*?)\s*(\d{6})$", RegexOptions.Compiled);
See the regex demo.
The point is to use a regex that matches the whole string (^ matches the start of the string and $ matches the end of the string), use \w (any letter/digit/_ char) or \d (any digit char) with the {m} quantifier to match an exact number of those chars, match the Description field with .*?, a lazy dot pattern that matches any 0+ chars other than newline, as few as possible, and allow any 0+ whitespace chars between fields with \s*.
Then, you may use it like this:
public static object[] ProcessLine(string line)
{
    object[] obj = null;
    var m = rx.Match(line);
    if (m.Success)
    {
        obj = new object[] {
            m.Groups[1].Value,
            m.Groups[2].Value,
            m.Groups[3].Value,
            int.Parse(m.Groups[4].Value).ToString(), // remove leading zeros
            m.Groups[5].Value,
            m.Groups[6].Value,
            m.Groups[7].Value
        };
    }
    return obj;
}
See the C# demo. Demo output for both of the OP's strings:
VA, 00174, 871460000664, 12, 6132202, KING PEPERM E STRONG 4P 4X44G, 000099
VA, 00174, 871460000664, 12, 6132202, KING PEPERM E STRONG 4P 4X44G, 000099
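As a variation on the method above, the same pattern can be written with named groups so that each field is looked up by name rather than by position. This is only a sketch: the class name NamedGroupParser and the group names are made up for illustration, and Quantity is returned as an int here instead of a string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupParser
{
    // Same pattern as in the answer, but each capture group has a name
    // (the names are illustrative assumptions).
    private static readonly Regex Rx = new Regex(
        @"^(?<code>\w{2})\s*(?<location>\w{5})\s*(?<serial>\w{12})\s*" +
        @"(?<qty>\d{3})\s*(?<item>\d{7})\s*(?<description>.*?)\s*(?<price>\d{6})$",
        RegexOptions.Compiled);

    // Returns the seven fields, or null when the line does not match the layout.
    public static object[] ProcessLine(string line)
    {
        var m = Rx.Match(line);
        if (!m.Success)
        {
            return null;
        }
        return new object[]
        {
            m.Groups["code"].Value,
            m.Groups["location"].Value,
            m.Groups["serial"].Value,
            int.Parse(m.Groups["qty"].Value), // "012" becomes 12
            m.Groups["item"].Value,
            m.Groups["description"].Value,
            m.Groups["price"].Value
        };
    }
}
```

Calling NamedGroupParser.ProcessLine on either sample line from the question should give the same Description, since \s* absorbs the optional spaces between fields.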
Upvotes: 1
Reputation: 26782
If you really want to use a regex, see Wiktor's answer.
However, you don't need a regex for this problem.
Since all fields except the description have known lengths, you can calculate the length of the description field. From your specs, the description starts at zero-based index 29 (2 + 5 + 12 + 3 + 7) and is followed by 6 characters for the price field. Therefore, this should give you the description:
string description = str.Substring(29, str.Length-29-6);
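One caveat: in the question's code, str has every space removed, so the Substring call above would also drop the spaces inside the description. A minimal sketch that avoids this reads the fixed-width fields from the original line and skips any whitespace between them. The class name FixedWidthParser and its members are hypothetical, all fields are returned as strings, and a line that is too short will make Substring throw ArgumentOutOfRangeException.

```csharp
using System;

public static class FixedWidthParser
{
    // Widths of the leading fields, in order:
    // Code, Location, SerialNo, Quantity, ItemNo (taken from the question).
    private static readonly int[] LeadingWidths = { 2, 5, 12, 3, 7 };
    private const int PriceWidth = 6;

    // Returns Code, Location, SerialNo, Quantity, ItemNo, Description, Price.
    public static string[] ProcessLine(string line)
    {
        line = line.Replace("\0", "").Trim();
        var fields = new string[7];
        int pos = 0;

        for (int i = 0; i < LeadingWidths.Length; i++)
        {
            // Skip optional spaces between fields (second sample format).
            while (pos < line.Length && char.IsWhiteSpace(line[pos]))
            {
                pos++;
            }
            fields[i] = line.Substring(pos, LeadingWidths[i]);
            pos += LeadingWidths[i];
        }

        // Price is the last 6 characters; Description is whatever lies between.
        int priceStart = line.Length - PriceWidth;
        fields[5] = line.Substring(pos, priceStart - pos).Trim();
        fields[6] = line.Substring(priceStart);
        return fields;
    }
}
```

Quantity stays as the raw text ("012") in this sketch; converting it with int.Parse, as in the regex answer, would remove the leading zero.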
Upvotes: 2