Reputation: 151
I received a badly formatted data file, so I need code that reads fields from a non-delimited text file, each at a specific starting position and with a specific length, to build a workable dataset. The text file is not delimited in any way, but I do have the starting and ending position of each string that I need to read. I've come up with the code below, but it throws an error and I can't figure out why; if I replace the 395 with a 0, it works.
e.g. Invoice number starting position = 395, ending position = 414, length = 20
using (StreamReader sr = new StreamReader(@"\\t.txt"))
{
    char[] c = null;
    while (sr.Peek() >= 0)
    {
        c = new char[20]; // Invoice number string
        sr.Read(c, 395, c.Length); // THIS IS GIVING ME AN ERROR
        Debug.WriteLine(new string(c)); // print all 20 characters
    }
}
Here is the error that I get:
System.ArgumentException: Offset and length were out of bounds for the array
or count is greater than the number of elements from
index to the end of the source collection. at
System.IO.StreamReader.Read(Char[] b
Upvotes: 2
Views: 19621
Reputation: 3408
The second argument of Read is the index in the c array at which writing starts, not a position in the file. The array has only 20 elements, so its highest index is 19 and 395 is out of bounds. I would suggest something like this:
StreamReader r;
...
string allFile = r.ReadToEnd();
int offset = 395;
int length = 20;
And then use
allFile.Substring(offset, length)
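If loading the whole file with ReadToEnd() is undesirable, the same idea can be sketched by discarding the characters before the field and then reading into the buffer at index 0. This is a minimal sketch, not a definitive implementation: the class and method names (FixedPositionReader, ReadAt) are hypothetical, and it assumes the starting position is a 0-based character count from the beginning of the input.

```csharp
using System;
using System.IO;

static class FixedPositionReader
{
    // Hypothetical helper: returns up to `length` characters starting at
    // 0-based character position `start`. Returns fewer characters (possibly
    // an empty string) if the input ends early.
    public static string ReadAt(TextReader reader, int start, int length)
    {
        // Skip everything before the field by reading and discarding it.
        char[] skip = new char[Math.Min(Math.Max(start, 1), 4096)];
        int remaining = start;
        while (remaining > 0)
        {
            int n = reader.Read(skip, 0, Math.Min(skip.Length, remaining));
            if (n == 0) return string.Empty; // input ended before `start`
            remaining -= n;
        }

        // The second argument of Read is an index into `buffer`, not into the file.
        char[] buffer = new char[length];
        int read = 0;
        while (read < length)
        {
            int n = reader.Read(buffer, read, length - read);
            if (n == 0) break; // input ended inside the field
            read += n;
        }
        return new string(buffer, 0, read);
    }
}
```

With the values from the question, a call would look like FixedPositionReader.ReadAt(sr, 395, 20) on a freshly opened reader.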
Upvotes: -2
Reputation: 151
Solved this ages ago; I just wanted to post the solution that was suggested:
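// Add the columns once, before the loop; adding them again for every line
// throws a DuplicateNameException on the second line.
dsnonhb.Tables[0].Columns.Add("InvoiceNum");
dsnonhb.Tables[0].Columns.Add("Odo");
dsnonhb.Tables[0].Columns.Add("PumpVal");
dsnonhb.Tables[0].Columns.Add("Quantity");
using (StreamReader sr = new StreamReader(path2))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        DataRow myrow = dsnonhb.Tables[0].NewRow();
        myrow["No"] = rowcounter.ToString(); // "No" column and rowcounter are defined elsewhere
        myrow["InvoiceNum"] = line.Substring(741, 6);
        myrow["Odo"] = line.Substring(499, 6);
        myrow["PumpVal"] = line.Substring(609, 7);
        myrow["Quantity"] = line.Substring(660, 6);
        // The posted snippet is cut off here; adding the row and closing
        // the blocks is an assumption about what followed.
        dsnonhb.Tables[0].Rows.Add(myrow);
    }
}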
Upvotes: 0
Reputation: 3255
Seek() is too low level for what the OP wants. See this answer instead for line-by-line parsing.
Also, as Jordan mentioned, Seek() works on byte offsets, so it has the issue of character encodings and varying character sizes (e.g. for non-ASCII and non-ANSI files, such as UTF-8 or UTF-16, which is probably not applicable to this question). Thanks for pointing that out.
Seek() is only available on a stream, so try using sr.BaseStream.Seek(..), or use a different stream like this:
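using (Stream s = new FileStream(path, FileMode.Open))
{
    s.Seek(offset, SeekOrigin.Begin); // offset is in bytes, not characters
    s.Read(buffer, 0, length);        // buffer is a byte[]; Read may return fewer than length bytes
}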
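Because Seek() and Stream.Read() work on bytes, the bytes still have to be decoded into text afterwards. The following is a rough sketch of that step, assuming the file uses a single-byte encoding (ASCII here) so that byte offsets and character offsets coincide; the names ByteOffsetReader and ReadField are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class ByteOffsetReader
{
    // Hypothetical helper: seeks to `byteOffset` and decodes up to `byteCount`
    // bytes as ASCII. Assumes a seekable stream and a single-byte encoding.
    public static string ReadField(Stream stream, long byteOffset, int byteCount)
    {
        stream.Seek(byteOffset, SeekOrigin.Begin);

        byte[] buffer = new byte[byteCount];
        int total = 0;
        // Stream.Read may return fewer bytes than requested, so loop until
        // the buffer is full or the stream ends.
        while (total < byteCount)
        {
            int n = stream.Read(buffer, total, byteCount - total);
            if (n == 0) break; // end of stream
            total += n;
        }
        return Encoding.ASCII.GetString(buffer, 0, total);
    }
}
```

For a multi-byte encoding such as UTF-8 this approach breaks down, since character 395 is not necessarily at byte 395.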
Upvotes: 4
Reputation: 9901
I've created a class called AdvancedStreamReader in my Helpers project on GitHub here:
https://github.com/jsmunroe/Helpers/blob/master/Helpers/IO/AdvancedStreamReader.cs
It is fairly robust. It is a subclass of StreamReader and keeps all of that functionality intact. There are a few caveats: a) it resets the position of the stream when it is constructed; b) you should not seek the BaseStream while you are using the reader; c) you need to specify the newline character type if it differs from the environment, and the file can only use one type. Here are some unit tests to demonstrate how it is used.
[TestMethod]
public void ReadLineWithNewLineOnly()
{
    // Setup
    var text = $"ƒun ‼Æ¢ with åò☺ encoding!\nƒun ‼Æ¢ with åò☺ encoding!\nƒun ‼Æ¢ with åò☺ encoding!\nHa!";
    var bytes = Encoding.UTF8.GetBytes(text);
    var stream = new MemoryStream(bytes);
    var reader = new AdvancedStreamReader(stream, NewLineType.Nl);
    reader.ReadLine();

    // Execute
    var result = reader.ReadLine();

    // Assert
    Assert.AreEqual("ƒun ‼Æ¢ with åò☺ encoding!", result);
    Assert.AreEqual(54, reader.CharacterPosition);
}

[TestMethod]
public void SeekCharacterWithUtf8()
{
    // Setup
    var text = $"ƒun ‼Æ¢ with åò☺ encoding!{NL}ƒun ‼Æ¢ with åò☺ encoding!{NL}ƒun ‼Æ¢ with åò☺ encoding!{NL}Ha!";
    var bytes = Encoding.UTF8.GetBytes(text);
    var stream = new MemoryStream(bytes);
    var reader = new AdvancedStreamReader(stream);

    // Pre-condition assert
    Assert.IsTrue(bytes.Length > text.Length); // More bytes than characters in sample text.

    // Execute
    reader.SeekCharacter(84);

    // Assert
    Assert.AreEqual(84, reader.CharacterPosition);
    Assert.AreEqual($"Ha!", reader.ReadToEnd());
}
I wrote this for my own use, but I hope it will help other people.
Upvotes: -1
Reputation: 3255
(new answer based on comments)
You are parsing invoice data, with each entry on a new line, and the required data is at a fixed offset on every line. Stream.Seek() is too low level for what you want to do, because you would need several seeks, one for every line. Instead, use the following:
int offset = 395;
int length = 20;
using (StreamReader sr = new StreamReader(@"\\t.txt"))
{
    while (!sr.EndOfStream)
    {
        string line = sr.ReadLine();
        string myData = line.Substring(offset, length);
    }
}
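One thing to watch with this approach is that Substring throws an ArgumentOutOfRangeException when a line is shorter than offset + length, which can easily happen with a bad data file (blank or truncated lines). A possible guard is sketched below; the names FixedWidthLines, SliceField and ReadColumn are hypothetical, and both skipping short lines and trimming padding are assumptions about what the data needs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class FixedWidthLines
{
    // Returns the trimmed field, or null when the line is too short to contain it.
    public static string SliceField(string line, int offset, int length)
    {
        if (line == null || offset < 0 || length < 0 || line.Length < offset + length)
            return null;
        return line.Substring(offset, length).Trim();
    }

    // Reads the same fixed-position field from every line, skipping lines
    // that are too short (assumed behavior; logging them may be preferable).
    public static List<string> ReadColumn(TextReader reader, int offset, int length)
    {
        var values = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string field = SliceField(line, offset, length);
            if (field != null) values.Add(field);
        }
        return values;
    }
}
```

With the values above, the call would be FixedWidthLines.ReadColumn(sr, 395, 20).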
Upvotes: 0
Reputation: 12458
Here is my suggestion for you:
using (StreamReader sr = new StreamReader(@"\\t.txt"))
{
    char[] c = new char[20]; // Invoice number string
    sr.BaseStream.Position = 395; // byte offset; set it before the first read
    sr.Read(c, 0, c.Length); // 0 is the index in c, not a position in the file
}
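Setting BaseStream.Position like this is only reliable before the reader has read anything, because StreamReader keeps its own internal buffer. If the reader has already been used, DiscardBufferedData() should be called after moving the underlying stream. A small sketch of that, with the hypothetical names ReaderReposition and ReadCharsAtByte, and assuming a single-byte encoding so that byte and character positions match:

```csharp
using System;
using System.IO;
using System.Text;

static class ReaderReposition
{
    // Hypothetical helper: moves the underlying stream to `byteOffset`, drops
    // the reader's stale buffer, then reads up to `charCount` characters.
    public static string ReadCharsAtByte(StreamReader reader, long byteOffset, int charCount)
    {
        reader.BaseStream.Seek(byteOffset, SeekOrigin.Begin);
        reader.DiscardBufferedData(); // otherwise previously buffered data is returned

        char[] buffer = new char[charCount];
        // ReadBlock keeps reading until charCount characters are read or the input ends.
        int read = reader.ReadBlock(buffer, 0, charCount);
        return new string(buffer, 0, read);
    }
}
```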
Upvotes: 0