Robert Höglund

Reputation: 10090

Read binary file into a struct

I'm trying to read binary data using C#. I have all the information about the layout of the data in the files I want to read. I'm able to read the data "chunk by chunk", i.e. reading the first 40 bytes and converting them to a string, then reading the next 40 bytes, and so on.

Since there are at least three slightly different versions of the data, I would like to read the data directly into a struct. It just feels so much more right than reading it "line by line".

I have tried the following approach but to no avail:

StructType aStruct;
int count = Marshal.SizeOf(typeof(StructType));
byte[] readBuffer = new byte[count];
BinaryReader reader = new BinaryReader(stream);
readBuffer = reader.ReadBytes(count);
GCHandle handle = GCHandle.Alloc(readBuffer, GCHandleType.Pinned);
aStruct = (StructType) Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(StructType));
handle.Free();

The stream is an open FileStream that I have already begun reading from. I get an AccessViolationException when calling Marshal.PtrToStructure.

The stream contains more information than I'm trying to read since I'm not interested in data at the end of the file.

The struct is defined like:

[StructLayout(LayoutKind.Explicit)]
struct StructType
{
    [FieldOffset(0)]
    public string FileDate;
    [FieldOffset(8)]
    public string FileTime;
    [FieldOffset(16)]
    public int Id1;
    [FieldOffset(20)]
    public string Id2;
}

The example code has been changed from the original to keep this question short.

How would I read binary data from a file into a struct?

Upvotes: 61

Views: 77986

Answers (8)

Kebechet

Reputation: 2425

I had this struct:

[StructLayout(LayoutKind.Explicit, Size = 21)]
public struct RecordStruct
{
    [FieldOffset(0)]
    public double Var1;

    [FieldOffset(8)]
    public byte var2;

    [FieldOffset(9)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 12)]
    public string String1;
}

and I received "incorrectly aligned or overlapped by non-object". Based on that I found: https://social.msdn.microsoft.com/Forums/vstudio/en-US/2f9ffce5-4c64-4ea7-a994-06b372b28c39/strange-issue-with-layoutkindexplicit?forum=clr

OK. I think I understand what's going on here. It seems like the problem is related to the fact that the array type (which is an object type) must be stored at a 4-byte boundary in memory. However, what you're really trying to do is serialize the 6 bytes separately.

I think the problem is the mix between FieldOffset and serialization rules. I'm thinking that structlayout.sequential may work for you, since it doesn't actually modify the in-memory representation of the structure. I think FieldOffset is actually modifying the in-memory layout of the type. This causes problems because the .NET framework requires object references to be aligned on appropriate boundaries (it seems).

So my struct was defined as explicit with:

[StructLayout(LayoutKind.Explicit, Size = 21)]

and thus my fields had specified

[FieldOffset(<offset_number>)]

but when you change your struct to Sequential, you can get rid of those offsets and the error will disappear. Something like:

[StructLayout(LayoutKind.Sequential, Size = 21)]
public struct RecordStruct
{
    public double Var1;

    public byte var2;

    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 12)]
    public string String1;
}

Upvotes: 0

Sergey

Reputation:

Here is what I am using.
This worked successfully for me for reading Portable Executable Format.
It's a generic function, so T is your struct type.

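public static T ByteToType<T>(BinaryReader reader) where T : struct
{
    // Read exactly as many bytes as the marshaled struct occupies.
    byte[] bytes = reader.ReadBytes(Marshal.SizeOf(typeof(T)));

    // Pin the buffer so the GC cannot move it while its address is in use.
    GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
    try
    {
        return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
    }
    finally
    {
        // Release the pin even if marshaling throws.
        handle.Free();
    }
}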

Upvotes: 16

nevelis

Reputation: 748

As Ronnie said, I'd use BinaryReader and read each field individually. I can't find the link to the article with this info, but it's been observed that using BinaryReader to read each individual field can be faster than Marshal.PtrToStructure if the struct contains fewer than 30-40 or so fields. I'll post the link to the article when I find it.

The article's link is at: http://www.codeproject.com/Articles/10750/Fast-Binary-File-Reading-with-C

When marshaling an array of structs, Marshal.PtrToStructure gains the upper hand more quickly, because you can think of the field count as fields * array length.
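As a rough illustration of the array case, here is a minimal sketch that pins one buffer and converts several consecutive records from it. The `Sample` struct and the `ArrayReader.ReadArray` helper are made-up names for this example, and the struct deliberately contains only blittable fields; it makes no claim about which approach is actually faster for a given file.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical record type: 4 + 8 = 12 bytes with Pack = 1 (no padding).
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Sample
{
    public int Id;
    public double Value;
}

public static class ArrayReader
{
    // Pins the buffer once and converts 'count' consecutive records from it.
    public static T[] ReadArray<T>(byte[] buffer, int count) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        if (buffer.Length < size * count)
            throw new ArgumentException("Buffer is too small for the requested record count.");

        var result = new T[count];
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr basePtr = handle.AddrOfPinnedObject();
            for (int i = 0; i < count; i++)
            {
                // Each record starts 'size' bytes after the previous one.
                result[i] = (T)Marshal.PtrToStructure(IntPtr.Add(basePtr, i * size), typeof(T));
            }
        }
        finally
        {
            handle.Free();
        }
        return result;
    }
}
```

The buffer would typically come from a single `reader.ReadBytes(size * count)` call, so the file is read once and only the conversion is repeated per record.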

Upvotes: 7

Robert Höglund

Reputation: 10090

I had no luck using the BinaryFormatter; I guess I would need a complete struct that matches the content of the file exactly. I realised that in the end I wasn't interested in very much of the file content anyway, so I went with the solution of reading part of the stream into a byte buffer and then converting it using

Encoding.ASCII.GetString()

for strings and

BitConverter.ToInt32()

for the integers.

I will need to be able to parse more of the file later on but for this version I got away with just a couple of lines of code.
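For reference, the buffer-based approach described above might look roughly like this. It is only a sketch: it assumes the layout from the question (8 ASCII bytes for the date, 8 for the time, then a 4-byte integer), and `HeaderParser` and `ParseHeader` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

public static class HeaderParser
{
    // Hypothetical helper: reads the first 20 bytes of the stream and decodes them.
    // Assumed layout: bytes 0-7 date, bytes 8-15 time, bytes 16-19 Id1.
    public static (string FileDate, string FileTime, int Id1) ParseHeader(Stream stream)
    {
        byte[] buffer = new byte[20];
        int read = 0;
        // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
        while (read < buffer.Length)
        {
            int n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0) throw new EndOfStreamException("File is shorter than the header.");
            read += n;
        }

        string fileDate = Encoding.ASCII.GetString(buffer, 0, 8);
        string fileTime = Encoding.ASCII.GetString(buffer, 8, 8);
        // BitConverter uses the byte order of the machine it runs on.
        int id1 = BitConverter.ToInt32(buffer, 16);
        return (fileDate, fileTime, id1);
    }
}
```

Anything after the first 20 bytes is simply left unread in the stream, which fits the case where the rest of the file is not interesting.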

Upvotes: 3

Ronnie

Reputation: 8117

Reading straight into structs is evil - many a C program has fallen over because of different byte orderings, different compiler implementations of fields, packing, word size...

You are best off serialising and deserialising byte by byte. Use the built-in stuff if you want, or just get used to BinaryReader.
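A field-by-field reader along these lines could be sketched as follows. The `FileHeader` class is a hypothetical name, and the field sizes are guessed from the offsets in the question, so they would need adjusting to the real format.

```csharp
using System.IO;
using System.Text;

public sealed class FileHeader
{
    public string FileDate;
    public string FileTime;
    public int Id1;

    // Reads one field at a time, in file order; sizes are assumptions from the question.
    public static FileHeader Read(BinaryReader reader)
    {
        var header = new FileHeader();
        header.FileDate = Encoding.ASCII.GetString(reader.ReadBytes(8));
        header.FileTime = Encoding.ASCII.GetString(reader.ReadBytes(8));
        // BinaryReader.ReadInt32 reads a little-endian 4-byte integer.
        header.Id1 = reader.ReadInt32();
        return header;
    }
}
```

Because every field is decoded explicitly, packing and alignment never come into play, and a different file version can be handled with an ordinary `if` around the fields that differ.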

Upvotes: 0

Ishmaeel

Reputation: 14403

The problem is the strings in your struct. I found that marshaling types like byte/short/int is not a problem; but when you need to marshal into a complex type such as a string, you need your struct to explicitly mimic an unmanaged type. You can do this with the MarshalAs attribute.

For your example, the following should work:

[StructLayout(LayoutKind.Explicit)]
struct StructType
{
    [FieldOffset(0)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public string FileDate;

    [FieldOffset(8)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public string FileTime;

    [FieldOffset(16)]
    public int Id1;

    [FieldOffset(20)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 66)] //Or however long Id2 is.
    public string Id2;
}
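If the text fields in the file are fixed-width and not null-terminated, a variant worth trying is to marshal them as byte arrays and decode them afterwards. This is a different technique from the ByValTStr struct above, shown only as a sketch: `RawHeader` and `RawHeaderReader` are hypothetical names, Id2 is left out because its length is not given in the question, and sequential layout with `Pack = 1` is an assumption that happens to match the offsets 0, 8 and 16.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;

// Sequential layout with Pack = 1: 8 + 8 + 4 = 20 bytes, no FieldOffset needed.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RawHeader
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 8)]
    public byte[] FileDate;

    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 8)]
    public byte[] FileTime;

    public int Id1;
}

public static class RawHeaderReader
{
    public static RawHeader Read(BinaryReader reader)
    {
        int size = Marshal.SizeOf(typeof(RawHeader));
        byte[] bytes = reader.ReadBytes(size);
        if (bytes.Length < size) throw new EndOfStreamException();

        // Pin the buffer so its address stays valid during marshaling.
        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            return (RawHeader)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(RawHeader));
        }
        finally
        {
            handle.Free();
        }
    }
}
```

The text is then recovered with, for example, `Encoding.ASCII.GetString(header.FileDate)`, which keeps all eight bytes of the field regardless of whether a terminator is present.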

Upvotes: 36

lubos hasko

Reputation: 25052

I don't see any problem with your code.

Just off the top of my head: what if you try to do it manually? Does that work?

BinaryReader reader = new BinaryReader(stream);
StructType o = new StructType();
o.FileDate = Encoding.ASCII.GetString(reader.ReadBytes(8));
o.FileTime = Encoding.ASCII.GetString(reader.ReadBytes(8));
...
...
...

Also try:

StructType o = new StructType();
byte[] buffer = new byte[Marshal.SizeOf(typeof(StructType))];
GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
Marshal.StructureToPtr(o, handle.AddrOfPinnedObject(), false);
handle.Free();

Then use buffer[] in your BinaryReader instead of reading data from the FileStream, to see whether you still get the AccessViolationException.

"I had no luck using the BinaryFormatter, I guess I have to have a complete struct that matches the content of the file exactly."

That makes sense, BinaryFormatter has its own data format, completely incompatible with yours.

Upvotes: 3

urini

Reputation: 33069

Try this:

using (FileStream stream = new FileStream(fileName, FileMode.Open))
{
    BinaryFormatter formatter = new BinaryFormatter();
    StructType aStruct = (StructType)formatter.Deserialize(stream);
}

Upvotes: 0
