Reputation: 29
I am using the code below to break a stream into smaller chunks. However, my chunk size is constant and I want it to be variable: the program should read until it hits the symbol '$' and use the position of that '$' as the chunk size.
For example, say the text file contains 01234583145329$34212349$2134567009$. The first chunk size should be 14, the second 8, and the third 10. I did some research and found that this can be achieved with the IndexOf method, but I am not able to implement it with the code below. Please advise.
If there is a more efficient way than IndexOf, please let me know.
public static IEnumerable<IEnumerable<byte>> ReadByChunk(int chunkSize)
{
    IEnumerable<byte> result;
    int startingByte = 0;
    do
    {
        result = ReadBytes(startingByte, chunkSize);
        startingByte += chunkSize;
        yield return result;
    }
    while (result.Any());
}
public static IEnumerable<byte> ReadBytes(int startingByte, int byteToRead)
{
    byte[] result;
    using (FileStream stream = File.Open(@"C:\Users\file.txt", FileMode.Open, FileAccess.Read, FileShare.Read))
    using (BinaryReader reader = new BinaryReader(stream))
    {
        int bytesToRead = Math.Max(Math.Min(byteToRead, (int)reader.BaseStream.Length - startingByte), 0);
        reader.BaseStream.Seek(startingByte, SeekOrigin.Begin);
        result = reader.ReadBytes(bytesToRead);
        // int chunkSize = IndexOf ... (this is where I am stuck: how do I find the position of '$'?)
    }
    return result;
}
static void Main()
{
    int chunkSize = 8;
    foreach (IEnumerable<byte> bytes in ReadByChunk(chunkSize))
    {
        //more code
    }
}
Upvotes: 0
Views: 808
Reputation: 272760
You seem to care about characters, not bytes, since you are trying to find '$' characters. Reading bytes only works in the specific case where each character is encoded as exactly one byte. Therefore, you should use ReadChar and return IEnumerable<IEnumerable<char>> instead.
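To illustrate why bytes and characters are not interchangeable, here is a small sketch. It assumes the file is UTF-8 encoded, which the question does not state, and `EncodingDemo` and `Utf8ByteCount` are hypothetical names used only for this example.

```csharp
using System.Text;

public static class EncodingDemo
{
    // Returns the number of bytes the text occupies when encoded as UTF-8 (assumed encoding).
    public static int Utf8ByteCount(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }
}
```

For the ASCII text "0123$" this returns 5, the same as the character count. For "\u00E9$" (an accented e followed by a dollar sign) it returns 3 even though the string has only 2 characters, so a byte offset would no longer match the character position of '$'.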
You seem to be creating a new reader and stream for each chunk, which is unnecessary. You could create one stream and one reader in ReadByChunk and pass the reader to the ReadBytes method.
The IndexOf you found is probably the one for strings. I assume you want to read lazily from a stream, so reading everything into a string first and then using IndexOf seems to go against your intention.
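For comparison, this is roughly what the eager IndexOf approach looks like when the whole text is already in memory. It is only a sketch for small files, not the lazy approach recommended here, and `EagerChunker` and `SplitOnDollar` are hypothetical names.

```csharp
using System.Collections.Generic;

public static class EagerChunker
{
    // Splits text that is already fully in memory into '$'-terminated chunks.
    // The '$' delimiter itself is not included in the returned chunks.
    public static List<string> SplitOnDollar(string text)
    {
        var chunks = new List<string>();
        int start = 0;
        while (start < text.Length)
        {
            int end = text.IndexOf('$', start);
            if (end == -1)
            {
                // No terminating '$': treat the remainder as the last chunk.
                chunks.Add(text.Substring(start));
                break;
            }
            chunks.Add(text.Substring(start, end - start));
            start = end + 1; // skip past the '$'
        }
        return chunks;
    }
}
```

With the sample from the question, `SplitOnDollar("01234583145329$34212349$2134567009$")` gives three chunks of length 14, 8 and 10. The cost is that the entire file has to be loaded first, for example with `File.ReadAllText`.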
For a text file, I would also recommend using StreamReader. BinaryReader is for reading binary files.
Here's my attempt:
public static IEnumerable<IEnumerable<char>> ReadByChunk()
{
    using (StreamReader reader = new StreamReader(File.Open(...)))
    {
        // Note: each chunk is lazy and shares this reader, so enumerate a chunk
        // fully before moving on to the next one.
        while (reader.Peek() != -1) // while not at the end of the stream...
        {
            yield return ReadUntilNextDollarSign(reader);
        }
    }
}

public static IEnumerable<char> ReadUntilNextDollarSign(StreamReader reader)
{
    char c;
    // while not at the end of the stream, and the next char is not a dollar sign...
    while (reader.Peek() != -1 && (c = (char)reader.Read()) != '$')
    {
        yield return c;
    }
}
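One caveat with the code above: the inner sequences are lazy and share the reader, so the outer loop only advances when each chunk is actually enumerated. If that is a concern, a variant that materializes each chunk before yielding it may be easier to use. The following is a sketch of that variant, not the code above; `ChunkReader` and `ReadChunks` are hypothetical names, and it accepts any TextReader so it can be tried with a StringReader as well as a StreamReader.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class ChunkReader
{
    // Lazily yields one string per '$'-terminated chunk read from the reader.
    // Each chunk is read completely before it is yielded, so callers may skip
    // or re-read a chunk without disturbing the reader position.
    public static IEnumerable<string> ReadChunks(TextReader reader)
    {
        var chunk = new StringBuilder();
        int next;
        while ((next = reader.Read()) != -1)
        {
            if ((char)next == '$')
            {
                yield return chunk.ToString();
                chunk.Clear();
            }
            else
            {
                chunk.Append((char)next);
            }
        }
        // Trailing characters with no closing '$' form a final chunk.
        if (chunk.Length > 0)
        {
            yield return chunk.ToString();
        }
    }
}
```

Calling `ChunkReader.ReadChunks(new StringReader("01234583145329$34212349$2134567009$"))` yields three strings of length 14, 8 and 10, and only one chunk is held in memory at a time.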
Upvotes: 3