Reputation: 76057
I need to write a small tool that parses textual input and generates binary-encoded data. I would prefer to stay away from C and the like, in favour of a higher-level, (optionally) safer and more expressive language that is faster to develop in.
My language of choice for this kind of task is usually Python, but in this case dealing with raw binary data can be problematic if one isn't very careful about numbers being promoted to bignums, sign extension and the like.
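To make the concern concrete, here is a minimal sketch of the masking that Python's unbounded integers require. The helper names `to_unsigned` and `to_signed` are hypothetical, chosen for illustration only:

```python
def to_unsigned(value: int, bits: int) -> int:
    """Wrap an arbitrary Python int into an unsigned field `bits` wide."""
    # Python ints never overflow, so the truncation has to be explicit.
    return value & ((1 << bits) - 1)


def to_signed(value: int, bits: int) -> int:
    """Read the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    # Flipping the sign bit and subtracting it performs the sign extension.
    return (value ^ sign_bit) - sign_bit


print(to_unsigned(-1, 8))   # low 8 bits of -1
print(to_signed(0xFF, 8))   # 0xFF read as a signed byte
```

Forgetting either step is what silently produces a value wider than the field it is meant to fit in.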
Ideally I would like records with named bitfields that can be serialised portably and consistently.
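As a rough sketch of what such a record could look like in plain Python: the layout `HEADER_LAYOUT`, its field names and the helpers `pack_fields` and `unpack_fields` are all made up for illustration. Fields are packed most significant first, and the result is emitted big-endian so the byte sequence does not depend on the host machine:

```python
from typing import Dict, List, Tuple

Layout = List[Tuple[str, int]]

# Hypothetical layout: (field name, width in bits), most significant first.
HEADER_LAYOUT: Layout = [
    ("version", 4),
    ("flags", 4),
    ("length", 16),
    ("checksum", 8),
]


def total_bytes(layout: Layout) -> int:
    """Return the record size in bytes; the layout must be byte-aligned."""
    bits = sum(width for _, width in layout)
    if bits % 8 != 0:
        raise ValueError("layout must fill a whole number of bytes")
    return bits // 8


def pack_fields(layout: Layout, values: Dict[str, int]) -> bytes:
    """Serialise the named fields into bytes, rejecting out-of-range values."""
    word = 0
    for name, width in layout:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(total_bytes(layout), "big")


def unpack_fields(layout: Layout, data: bytes) -> Dict[str, int]:
    """Inverse of pack_fields: split the bytes back into named fields."""
    if len(data) != total_bytes(layout):
        raise ValueError("data length does not match the layout")
    word = int.from_bytes(data, "big")
    values: Dict[str, int] = {}
    # Peel fields off from the least significant end.
    for name, width in reversed(layout):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


header = {"version": 1, "flags": 0xA, "length": 0x1234, "checksum": 0xFF}
encoded = pack_fields(HEADER_LAYOUT, header)
print(encoded.hex())
```

The range check in `pack_fields` is the design choice that matters here: an oversized value raises an error instead of bleeding into the neighbouring field.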
(I know there's a strong argument for doing it in a language I have already mastered, even if it isn't optimal, but I think this could be a good opportunity to learn something new.)
Thanks.
Upvotes: 4
Views: 317
Reputation: 1423
Ada has great support for the kind of low-level data representation you describe, in the form of representation clauses for data types. See for example:
http://www.adaic.org/standards/05rm/html/RM-13-5-1.html
With representation clauses you can define the exact layout and alignment (if needed) of all your data in a portable fashion. It is also very easy to change the representation later, for example for performance reasons: booleans can be stored as single bits or as machine-addressable words.
Upvotes: 2
Reputation: 78575
C structs are one of the mainstays for this sort of thing. If you don't like the rest of the language, you might be able to define your data formats in C and all your access code in Python and bridge the gap with SWIG. I haven't used SWIG much so I don't know how far you will be able to make it work. If you can't do all the code in Python, you could put little bits (WriteStructToFile, etc.) in C as they can be very very small and well defined.
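If the records are made of whole-byte fields, Python's standard `struct` module may cover the same ground without a C layer. This is not the SWIG route described above but a standard-library alternative. The record layout and the names `RECORD`, `write_record` and `read_record` below are hypothetical:

```python
import struct

# Hypothetical record: uint16 id, uint8 kind, int32 offset.
# "<" selects little-endian byte order with standard sizes and no padding,
# so the layout is the same on every platform.
RECORD = struct.Struct("<HBi")


def write_record(record_id: int, kind: int, offset: int) -> bytes:
    """Serialise one record; struct.error is raised if a value does not fit."""
    return RECORD.pack(record_id, kind, offset)


def read_record(data: bytes) -> tuple:
    """Parse one record back into an (id, kind, offset) tuple."""
    return RECORD.unpack(data)


raw = write_record(0x1234, 7, -2)
print(RECORD.size, raw.hex())
```

Note that `struct` works on whole bytes only, so sub-byte bitfields would still need manual shifting and masking.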
Upvotes: 0
Reputation: 3977
IMO, it is faster to just use the language you already know, unless you want to learn a new language for the fun of it.
Upvotes: 0
Reputation: 21925
If you want to stay in Python, one option is the bitstring module, which takes away most of the pain of dealing with binary data.
It makes constructing and parsing arbitrary binary structures pretty straightforward, so it might be worth a look if Erlang doesn't work out for you!
Upvotes: 2
Reputation: 202505
I second the vote for Erlang; despite its oddities, it has excellent support for bit-level control of binary data. (As it must; it's a telecoms language.) Another language worth looking into is PADS, which is a more special-purpose language (also from the telecoms industry) designed for high-speed processing of ad hoc data. I believe PADS supports binary data, but I can't swear to it.
Upvotes: 3
Reputation: 78316
Strangely enough, I think Erlang might fit the bill. Leaving aside its parallel facilities (unless you want to use them), it has native support for manipulating strings of bits very easily. Look up "bit syntax" in the documentation.
Upvotes: 4