Reputation: 51
I am currently trying to write a C program that reads a .bin file containing airline flight data. As you can see from my code, I am clearly missing something. I have read a lot about this but am still completely stuck, and my output is not what I intended. An example of the expected output would be: YV2840 KCLT KDAB Thu Jan 16 12:44:00 2014
These are the reasons I think it could be wrong:
I am supposed to define a struct called "Human-readable date string". That is of course not possible as written, because it would generate a compiler error. Perhaps I am not supposed to take it literally; for now I have it defined as "Time Stamp".
The order and sizes of my struct members may not match the format in which the file was written.
Here is the bin file, if anyone is interested: http://www.filedropper.com/acars Here is my code:
#include <stdio.h>
#include <stdlib.h>
typedef struct MyStruct_struct {
int FlightNum[7];
char OriginAirportCode[5];
char DestAirportCode[5];
int TimeStamp;
} MyStruct;
int main() {
FILE * bin;
MyStruct myStruct;
bin = fopen("acars.bin", "rb");
while(1) {
fread(&myStruct,sizeof(MyStruct),1,bin);
if(feof(bin)!=0)
break;
printf("%d",myStruct.FlightNum);
printf("%s" ,myStruct.OriginAirportCode);
printf("%s" ,myStruct.DestAirportCode);
printf("%d", myStruct.TimeStamp);
}
fclose(bin);
return 0;
}
Upvotes: 3
Reputation: 84642
If you are going to read binary data into your program, then you need to look at what you are attempting to read. hexdump and od are great tools for looking at data:
$ hexdump -C -n 512 dat/acars.bin
00000000 59 56 32 38 32 37 00 4b 43 4c 54 00 4b 53 52 51 |YV2827.KCLT.KSRQ|
00000010 00 00 00 00 2c 83 d0 52 59 56 32 37 38 32 00 4b |....,..RYV2782.K|
00000020 43 4c 54 00 4b 53 52 51 00 00 00 00 cc 3e ed 52 |CLT.KSRQ.....>.R|
00000030 59 56 32 37 33 32 00 4b 43 4c 54 00 4b 53 52 51 |YV2732.KCLT.KSRQ|
00000040 00 00 00 00 88 f4 d5 52 59 56 32 36 37 35 00 4b |.......RYV2675.K|
00000050 43 4c 54 00 4b 53 52 51 00 00 00 00 20 57 9f 52 |CLT.KSRQ.... W.R|
00000060 59 34 39 38 34 31 00 4b 4d 43 4f 00 4d 4d 4d 58 |Y49841.KMCO.MMMX|
According to your description, each record holds the flight number, the departure airport, the destination airport and a timestamp. Looking at the data, you find the flight number YV2827 (which is null-terminated), then KCLT, the ICAO identifier for Charlotte Douglas International Airport, then KSRQ (the ICAO identifier for the Sarasota, Florida airport), then 3 bytes of padding and, finally, a 4-byte number representing the timestamp. So the data file makes sense.
Now how to read it? If your description holds, then a structure holding the elements should provide a way to read the data. You may have to work with different members and different attributes to get the padding to work out, but something close to the following should work:
typedef struct {
char flight[7];
char dept[5];
char dest[5];
unsigned tstamp;
} flight;
Next, how to read the file and store the values in memory in your code. If you don't need to store the values, then a simple read and print of the data will be all you need. Assuming you need to store it to make some actual use of the data, then without knowing how many flights are contained in acars.bin, you will need a scheme to read/allocate memory to hold the data.
A flexible approach is to read each flight into a single fixed buffer, then use malloc/calloc to allocate an array of pointers to flight, and realloc as necessary to hold the flight data. Something like:
flight buf = {{0}, {0}, {0}, 0};
flight **flts = NULL;
size_t idx = 0;
size_t nbytes = 0;
...
/* allocate MAXS pointers to flight */
flts = xcalloc (MAXS, sizeof *flts);
/* read into buf until no data read, allocate/copy to flts[i] */
while ((nbytes = fread (&buf, sizeof buf, 1, fp))) {
flts[idx] = xcalloc (1, sizeof **flts); /* checked allocation */
memcpy (flts[idx++], &buf, sizeof **flts);
if (idx == maxs) /* if pointer limit reached, realloc */
flts = (flight **)xrealloc_dp((void *)flts, &maxs);
}
Above, the code allocates an initial number of pointers to flight in flts and uses the struct buf as a buffer for reading data from the acars.bin file. fread returns the number of complete records read (here 1 or 0, despite the variable name nbytes), so on each successful read, memory is allocated for storage of the record in flts[idx] and memcpy is used to copy the data from buf to flts[idx]. (You should add validation that what is read is actually what you expect.)
A standard reallocation scheme is used: having first allocated maxs pointers to struct, when that number is reached, the number of pointers is doubled via xrealloc_dp (a simple macro that reallocates a double-pointer; you can use a simple function as well). The intent here is just to keep the body of the code clean so the logic isn't obscured by all the realloc validation code, etc.
Following the complete read of acars.bin, you then have all the values stored in flts (note the timestamp is stored as an unsigned int value, so conversion to a calendar time type and formatting the output is left for your output routine). A simple reformatting for output could be:
for (i = 0; i < 10; i++) {
time_t fdate = (time_t)flts[i]->tstamp;
printf (" flight[%4zu] %-8s %-5s %-5s %s", i, flts[i]->flight,
flts[i]->dept, flts[i]->dest, ctime (&fdate));
}
where flts[i]->tstamp is cast to time_t and then used with ctime to provide a formatted date for output along with the rest of the flight data.
Putting all the pieces together, and understanding that xcalloc and xrealloc_dp are just simple error-checking macros for calloc and realloc (written with the GNU statement-expression extension, so they need GCC or a compatible compiler), you could use something like the following. There are 2778 flights contained in acars.bin and the code below simply prints the data for the first 10 and last 10 flights:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
/* calloc with error check - exits on any allocation error */
#define xcalloc(nmemb, size) \
({ void *memptr = calloc((size_t)nmemb, (size_t)size); \
if (!memptr) { \
fprintf(stderr, "error: virtual memory exhausted.\n"); \
exit(EXIT_FAILURE); \
} \
memptr; \
})
/* realloc with error check - exits on any allocation error */
#define xrealloc_dp(ptr,nmemb) \
({ \
void **p = ptr; \
size_t *n = nmemb; \
void *tmp = realloc (p, 2 * *n * sizeof tmp); \
if (!tmp) { \
fprintf (stderr, "%s() error: virtual memory exhausted.\n", __func__); \
exit (EXIT_FAILURE); \
} \
p = tmp; \
memset (p + *n, 0, *n * sizeof tmp); /* set new pointers NULL */ \
*n *= 2; \
p; \
})
#define MAXS 256
typedef struct {
char flight[7];
char dept[5];
char dest[5];
unsigned tstamp;
} flight;
int main (int argc, char **argv) {
flight buf = {{0}, {0}, {0}, 0};
flight **flts = NULL;
size_t idx = 0;
size_t nbytes = 0;
size_t maxs = MAXS;
size_t i, index;
FILE *fp = argc > 1 ? fopen (argv[1], "rb") : stdin; /* "rb": binary mode */
if (!fp) {
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
/* allocate MAXS pointers to flight */
flts = xcalloc (MAXS, sizeof *flts);
/* read into buf until no data read, allocate/copy to flts[i] */
while ((nbytes = fread (&buf, sizeof buf, 1, fp))) {
flts[idx] = xcalloc (1, sizeof **flts); /* checked allocation */
memcpy (flts[idx++], &buf, sizeof **flts);
if (idx == maxs) /* if pointer limit reached, realloc */
flts = (flight **)xrealloc_dp((void *)flts, &maxs);
}
if (fp != stdin) fclose (fp);
printf ("\n There are '%zu' flights in acars data.\n", idx);
printf ("\n The first 10 flights are:\n\n");
for (i = 0; i < 10 && i < idx; i++) { /* don't read past the last flight */
time_t fdate = (time_t)flts[i]->tstamp;
printf (" flight[%4zu] %-8s %-5s %-5s %s", i, flts[i]->flight,
flts[i]->dept, flts[i]->dest, ctime (&fdate));
}
printf ("\n The last 10 flights are:\n\n");
index = idx > 10 ? idx - 10 : 0; /* avoid underflow with fewer than 10 flights */
for (i = index; i < idx; i++) {
time_t fdate = (time_t)flts[i]->tstamp;
printf (" flight[%4zu] %-8s %-5s %-5s %s", i, flts[i]->flight,
flts[i]->dept, flts[i]->dest, ctime (&fdate));
}
/* free memory */
for (i = 0; i < idx; i++)
free (flts[i]);
free (flts);
return 0;
}
Output
$ ./bin/readacars dat/acars.bin
There are '2778' flights in acars data.
The first 10 flights are:
flight[ 0] YV2827 KCLT KSRQ Fri Jan 10 17:33:00 2014
flight[ 1] YV2782 KCLT KSRQ Sat Feb 1 12:37:00 2014
flight[ 2] YV2732 KCLT KSRQ Tue Jan 14 20:38:00 2014
flight[ 3] YV2675 KCLT KSRQ Wed Dec 4 10:24:00 2013
flight[ 4] Y49841 KMCO MMMX Tue Jul 23 13:25:00 2013
flight[ 5] Y45981 KMCO MMMX Wed Feb 26 13:31:00 2014
flight[ 6] Y45980 MMMX KMCO Tue Mar 25 13:49:00 2014
flight[ 7] Y40981 KMCO MMMX Wed Mar 5 13:23:00 2014
flight[ 8] Y40980 MMMX KMCO Sat Mar 29 11:38:00 2014
flight[ 9] XX0671 KJFK MSLP Tue Mar 25 05:46:00 2014
The last 10 flights are:
flight[2768] 4O2993 KJFK MMMX Wed Feb 12 09:25:00 2014
flight[2769] 1L9221 KSAT KSFB Thu Jan 9 15:41:00 2014
flight[2770] 1L1761 KCID KSFB Tue Jan 14 13:11:00 2014
flight[2771] 1L1625 KABE KSFB Thu Jan 16 10:22:00 2014
flight[2772] 1L0751 KMFE KSFB Thu Jan 16 19:52:00 2014
flight[2773] 1L0697 KTYS KSFB Wed Jan 15 10:21:00 2014
flight[2774] 1L0696 KSFB KTYS Wed Jan 15 07:00:00 2014
flight[2775] 1L0655 KIAG KSFB Fri Jan 17 21:11:00 2014
flight[2776] 1L0654 KSFB KIAG Fri Jan 17 15:49:00 2014
flight[2777] 1L0641 KGFK KSFB Fri Jan 17 14:21:00 2014
Memory Error/Leak Check
In any code you write that dynamically allocates memory, it is imperative that you use a memory error checking program to ensure you haven't written beyond your allocated memory and to confirm that you have freed all the memory you have allocated. For Linux, valgrind is the normal choice. There are so many subtle ways to misuse a block of memory that can cause real problems that there is no excuse not to do it. There are similar memory checkers for every platform. They are simple to use. Just run your program through it.
$ valgrind ./bin/readacars dat/acars.bin
==12304== Memcheck, a memory error detector
==12304== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==12304== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==12304== Command: ./bin/readacars dat/acars.bin
==12304==
There are '2778' flights in acars data.
The first 10 flights are:
flight[ 0] YV2827 KCLT KSRQ Fri Jan 10 17:33:00 2014
flight[ 1] YV2782 KCLT KSRQ Sat Feb 1 12:37:00 2014
flight[ 2] YV2732 KCLT KSRQ Tue Jan 14 20:38:00 2014
<snip>
flight[2776] 1L0654 KSFB KIAG Fri Jan 17 15:49:00 2014
flight[2777] 1L0641 KGFK KSFB Fri Jan 17 14:21:00 2014
==12304==
==12304== HEAP SUMMARY:
==12304== in use at exit: 0 bytes in 0 blocks
==12304== total heap usage: 2,812 allocs, 2,812 frees, 134,011 bytes allocated
==12304==
==12304== All heap blocks were freed -- no leaks are possible
==12304==
==12304== For counts of detected and suppressed errors, rerun with: -v
==12304== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
In the heap summary, "2,812 allocs, 2,812 frees, 134,011 bytes allocated" together with "All heap blocks were freed -- no leaks are possible" confirms you are freeing all memory you allocate. "ERROR SUMMARY: 0 errors from 0 contexts" confirms there were no inadvertent writes outside the blocks of memory allocated.
Look over the code, let me know if you have any questions and I'll be happy to help further.
Upvotes: 5
Reputation: 133659
Reading binary files is not a simple operation, because they are compiler dependent, in the sense that their structure, whether for writing or reading, depends on the layout of the struct that generated the data or is used to read it.
In your binary file, the records appear to be structured this way:
0x59563238323700 (flight number 7 bytes)
0x4B434C5400 (origin airport 5 bytes)
0x4B53525100 (dest airport 5 bytes)
0x000000 (3 bytes padding)
0x2C83D052 (4 bytes timestamp)
As you can see, the first three fields are 7+5+5 = 17 bytes, but the int data type used for the timestamp required 4-byte alignment in the program that generated the binary data, so the data is padded with 0s to 20 bytes before the timestamp.
This means that you must make sure that the layout of your struct is exactly the same as that of the one that generated the binary data, or else read it field by field, taking the padding into account, after reverse-engineering the original data format.
Upvotes: 1