Reputation: 139
I have the following code fragment that reads a binary file and validates it:
FileStream f = File.OpenRead("File.bin");
MemoryStream memStream = new MemoryStream();
memStream.SetLength(f.Length);
f.Read(memStream.GetBuffer(), 0, (int)f.Length);
f.Seek(0, SeekOrigin.Begin);
var r = new BinaryReader(f);
Single prevVal = 0;
do
{
    r.ReadUInt32();
    var val = r.ReadSingle();
    if (prevVal != 0) {
        var diff = Math.Abs(val - prevVal) / prevVal;
        if (diff > 0.25)
            Console.WriteLine("Bad!");
    }
    prevVal = val;
}
while (f.Position < f.Length);
It unfortunately works very slowly, and I am looking to improve this. In C++, I would simply read the file into a byte array and then recast that array as an array of structures:
struct S {
    int a;
    float b;
};
How would I do this in C#?
Upvotes: 5
Views: 1756
Reputation: 139
Thank you everyone for very helpful comments and answers. Given this input, this is my preferred solution:
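using System;
using System.IO;
using System.Runtime.InteropServices;

class Program
{
    // Matches the on-disk record: a 4-byte unsigned integer followed by a 4-byte float, no padding.
    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    struct Data
    {
        public UInt32 dummy;
        public Single val;
    }

    static void Main(string[] args)
    {
        byte[] byteArray = File.ReadAllBytes("File.bin");
        // Reinterpret the bytes as records without copying.
        // Trailing bytes that do not fill a whole record are ignored by the cast.
        ReadOnlySpan<Data> dataArray = MemoryMarshal.Cast<byte, Data>(new ReadOnlySpan<byte>(byteArray));
        Single prevVal = 0;
        foreach (var v in dataArray) {
            if (prevVal != 0) {
                var diff = Math.Abs(v.val - prevVal) / prevVal;
                if (diff > 0.25)
                    Console.WriteLine("Bad!");
            }
            prevVal = v.val;
        }
    }
}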
It indeed works much faster than the original implementation.
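One thing to watch with this approach: MemoryMarshal.Cast rounds the record count down, so a truncated file passes silently. Below is a minimal sketch that adds a length check and counts the bad jumps instead of printing them. The helper name CountBadJumps and the in-memory test data are my own for illustration, and the demo assumes a little-endian machine (BinaryWriter writes little-endian, the cast uses native byte order).

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Data
{
    public uint dummy;
    public float val;
}

static class Validator
{
    // Hypothetical helper: counts consecutive values whose relative change exceeds 25%.
    public static int CountBadJumps(ReadOnlySpan<byte> bytes)
    {
        int recordSize = Marshal.SizeOf<Data>(); // 8 bytes with Pack = 1
        if (bytes.Length % recordSize != 0)
            throw new ArgumentException("Length is not a whole number of records.");

        ReadOnlySpan<Data> records = MemoryMarshal.Cast<byte, Data>(bytes);
        int bad = 0;
        float prevVal = 0;
        foreach (var r in records)
        {
            // Same test as in the question.
            if (prevVal != 0 && Math.Abs(r.val - prevVal) / prevVal > 0.25)
                bad++;
            prevVal = r.val;
        }
        return bad;
    }

    static void Main()
    {
        // Three records built in memory: 1.0 -> 1.1 is a 10% step, 1.1 -> 2.0 is about 82%.
        using var ms = new MemoryStream();
        using (var w = new BinaryWriter(ms, Encoding.UTF8, leaveOpen: true))
        {
            foreach (float v in new[] { 1.0f, 1.1f, 2.0f })
            {
                w.Write(0u);
                w.Write(v);
            }
        }
        Console.WriteLine(CountBadJumps(ms.ToArray())); // one jump exceeds 25%
    }
}
```

For the real file you would pass File.ReadAllBytes("File.bin") to the helper.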
Upvotes: 2
Reputation: 109832
This is what we use (compatible with older versions of C#):
// Needs: System, System.ComponentModel, System.Diagnostics, System.IO,
// System.Runtime.InteropServices, Microsoft.Win32.SafeHandles. Windows only (kernel32).
public static T[] FastRead<T>(FileStream fs, int count) where T: struct
{
    int sizeOfT = Marshal.SizeOf(typeof(T));
    long bytesRemaining = fs.Length - fs.Position;
    long wantedBytes = (long)count * sizeOfT; // widen first so large counts cannot overflow int
    long bytesAvailable = Math.Min(bytesRemaining, wantedBytes);
    long availableValues = bytesAvailable / sizeOfT;
    long bytesToRead = (availableValues * sizeOfT);

    if ((bytesRemaining < wantedBytes) && ((bytesRemaining - bytesToRead) > 0))
    {
        Debug.WriteLine("Requested data exceeds available data and partial data remains in the file.");
    }

    T[] result = new T[availableValues];

    // Pin the array so the native ReadFile call can write straight into it.
    GCHandle gcHandle = GCHandle.Alloc(result, GCHandleType.Pinned);
    try
    {
        uint bytesRead;
        if (!ReadFile(fs.SafeFileHandle, gcHandle.AddrOfPinnedObject(), (uint)bytesToRead, out bytesRead, IntPtr.Zero))
        {
            throw new IOException("Unable to read file.", new Win32Exception(Marshal.GetLastWin32Error()));
        }
        Debug.Assert(bytesRead == bytesToRead);
    }
    finally
    {
        gcHandle.Free();
    }

    GC.KeepAlive(fs);
    return result;
}

[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Interoperability", "CA1415:DeclarePInvokesCorrectly")]
[DllImport("kernel32.dll", SetLastError=true)]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool ReadFile
(
    SafeFileHandle hFile,
    IntPtr lpBuffer,
    uint nNumberOfBytesToRead,
    out uint lpNumberOfBytesRead,
    IntPtr lpOverlapped
);
NOTE: This only works for structs that contain only blittable types, of course. And you must use [StructLayout] to ensure that the struct layout is identical to the binary format of the data in the file: either LayoutKind.Sequential with an explicit Pack, or LayoutKind.Explicit with a [FieldOffset] on every field.
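A quick way to check that a struct really matches the file format is to print its marshalled size and field offsets. This is only a sketch: the struct names Record and RecordExplicit are made up for illustration and mirror the 4-byte integer plus 4-byte float record from the question.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical record type, sequential layout with no padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Record
{
    public uint Id;
    public float Value;
}

// The same record with every offset spelled out.
[StructLayout(LayoutKind.Explicit, Size = 8)]
struct RecordExplicit
{
    [FieldOffset(0)] public uint Id;
    [FieldOffset(4)] public float Value;
}

static class LayoutCheck
{
    static void Main()
    {
        // Both variants should report a size of 8 and place Value at offset 4.
        Console.WriteLine(Marshal.SizeOf<Record>());
        Console.WriteLine(Marshal.OffsetOf<Record>(nameof(Record.Value)));
        Console.WriteLine(Marshal.SizeOf<RecordExplicit>());
        Console.WriteLine(Marshal.OffsetOf<RecordExplicit>(nameof(RecordExplicit.Value)));
    }
}
```

If the numbers differ from the record size in the file, the reinterpreted values will be shifted and wrong, so it is worth asserting this once at startup.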
For recent versions of C#, you can use Span, as mentioned by Marc in the other answer!
Upvotes: 1
Reputation: 1064134
Define a struct (possibly a readonly struct) with explicit layout ([StructLayout(LayoutKind.Explicit)]) that is precisely the same as your C++ code, then one of:
1. memory-map the file and get a pointer to the data; use unsafe code on the raw pointer, or use Unsafe.AsRef<YourStruct> on the data, and Unsafe.Add<> to iterate
2. memory-map the file and get a pointer to the data; create a span over that pointer (of your T), and iterate over the span
3. read the file into a byte[]; create a Span<byte> over the byte[], then use MemoryMarshal.Cast<,> to create a Span<YourType>, and iterate over that
4. read the file into a byte[]; use fixed to pin the byte[] and get a byte* pointer; use unsafe code to walk the pointer
5. use a Pipe that is the buffer, maybe using StreamConnection on a FileStream for filling the pipe, and a worker loop that dequeues from the pipe; complication: the buffers can be discontiguous and may split at inconvenient places; solvable, but subtle code required whenever the first span isn't at least 8 bytes
(or some combination of those concepts)
Any of those should work much like your C++ version. The 4th is simple, but for very large data you probably want to prefer memory-mapped files.
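For the memory-mapped route, something like the following might be a starting point. It is a rough sketch, not a tuned implementation: the Data struct copies the layout from the question, CountBadJumps is a name invented here, the project needs AllowUnsafeBlocks enabled, and the demo in Main writes a small temporary file on the assumption of a little-endian machine.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Data
{
    public uint dummy;
    public float val;
}

static class MappedValidator
{
    // Hypothetical helper: maps the file read-only and walks it as Data records.
    public static unsafe int CountBadJumps(string path)
    {
        long length = new FileInfo(path).Length;
        if (length == 0) return 0; // an empty file cannot be mapped

        using var mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read);
        using var view = mmf.CreateViewAccessor(0, 0, MemoryMappedFileAccess.Read);

        byte* basePtr = null;
        view.SafeMemoryMappedViewHandle.AcquirePointer(ref basePtr);
        try
        {
            // Use the file length, not view.Capacity, which may be rounded up to a page.
            long count = length / sizeof(Data);
            Data* p = (Data*)(basePtr + view.PointerOffset);

            int bad = 0;
            float prevVal = 0;
            for (long i = 0; i < count; i++)
            {
                float val = p[i].val;
                if (prevVal != 0 && Math.Abs(val - prevVal) / prevVal > 0.25)
                    bad++;
                prevVal = val;
            }
            return bad;
        }
        finally
        {
            view.SafeMemoryMappedViewHandle.ReleasePointer();
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Three records: 1.0 -> 1.1 is a 10% step, 1.1 -> 2.0 is about 82%.
            using (var w = new BinaryWriter(File.Create(path)))
            {
                foreach (float v in new[] { 1.0f, 1.1f, 2.0f })
                {
                    w.Write(0u);
                    w.Write(v);
                }
            }
            Console.WriteLine(CountBadJumps(path)); // one jump exceeds 25%
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The appeal over ReadAllBytes is that nothing is copied into a managed array, so the operating system pages the file in as the loop touches it.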
Upvotes: 4
Reputation: 28549
You are actually not using the MemoryStream at all currently. Your BinaryReader accesses the file directly. To have the BinaryReader use the MemoryStream instead:
Replace
f.Seek(0, SeekOrigin.Begin);
var r = new BinaryReader(f);
...
while (f.Position < f.Length);
with
memStream.Seek(0, SeekOrigin.Begin);
var r = new BinaryReader(memStream);
...
while (r.BaseStream.Position < r.BaseStream.Length);
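Put together, an in-memory version of the original loop could look roughly like this. It is a sketch with a few liberties: it uses File.ReadAllBytes instead of the manual f.Read call (a single Read is not guaranteed to fill the buffer), it counts bad jumps rather than printing them, and the names CountBadJumps and RecordSize are made up here.

```csharp
using System;
using System.IO;

static class InMemoryReader
{
    const int RecordSize = 8; // 4-byte unsigned integer + 4-byte float

    // Hypothetical helper: runs the question's check with BinaryReader over memory.
    public static int CountBadJumps(byte[] fileBytes)
    {
        using var memStream = new MemoryStream(fileBytes, writable: false);
        using var r = new BinaryReader(memStream);

        int bad = 0;
        float prevVal = 0;
        // Stop before a partial trailing record instead of throwing EndOfStreamException.
        while (memStream.Length - memStream.Position >= RecordSize)
        {
            r.ReadUInt32();
            float val = r.ReadSingle();
            if (prevVal != 0 && Math.Abs(val - prevVal) / prevVal > 0.25)
                bad++;
            prevVal = val;
        }
        return bad;
    }

    static void Main(string[] args)
    {
        // The path defaults to the file name used in the question.
        string path = args.Length > 0 ? args[0] : "File.bin";
        Console.WriteLine(CountBadJumps(File.ReadAllBytes(path)));
    }
}
```

This keeps BinaryReader, so it still pays a per-call cost for every field; it mainly removes the per-read trips to the file.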
Upvotes: 0