voodoogiant

Reputation: 2148

Good Python strategy for parsing a grammar-based file format

I've written quite a few simple importers for 3D file formats like PLY and OBJ, which have a state-based, per-line structure that makes parsing very easy. A friend asked me to implement a simple importer in Python for a file type from Mirai, and I noticed that a lot of its data is represented hierarchically, unlike the simpler line-by-line formats I've used before.

I was wondering whether I should write a full grammar for this using some Python library, use some complex regexes, or just hack up a solution using string replacement. Can anyone suggest a good approach for parsing this type of file? This particular example is an exported cube.

 filetype gx;
 GrammarVersion 2.1.0.0;
 TemplateVersion 2.1.0.0;
 HostName "ZOO-HP";
 UserName "Phil";
 TimeStamp "Mon 20-Aug-12, 9:48 pm";
 OSName "Windows NT 6.01.7601";
 ApplicationName "Mirai";
 ApplicationVersion "1.1.0.1 5629";
 include "gbf-2-1-0-0.tpl";
 include "cube_mirai.gmf";


 body Polyhedron-31 (

   vertices[] < (
coord -0.500000 -0.500000 0.500000 ;
 )
 (
coord -0.500000 0.500000 0.500000 ;
 )
 (
coord 0.500000 0.500000 0.500000 ;
 )
 (
coord 0.500000 -0.500000 0.500000 ;
 )
 (
coord 0.500000 -0.500000 -0.500000 ;
 )
 (
coord 0.500000 0.500000 -0.500000 ;
 )
 (
coord -0.500000 0.500000 -0.500000 ;
 )
 (
coord -0.500000 -0.500000 -0.500000 ;
 )
>
   faces[] < (
normal 0.000000 0.000000 1.00000 ;
      vertex-indices[] <0;1;2;3;>
      vertex-normal-indices[] <0;1;2;3;> )
 (
normal 0.000000 0.000000 -1.00000 ;
      vertex-indices[] <4;5;6;7;>
      vertex-normal-indices[] <4;5;6;7;> )
 (
normal 0.000000 1.00000 0.000000 ;
      vertex-indices[] <1;6;5;2;>
      vertex-normal-indices[] <1;6;5;2;> )
 (
normal 0.000000 -1.00000 0.000000 ;
      vertex-indices[] <7;0;3;4;>
      vertex-normal-indices[] <7;0;3;4;> )
 (
normal 1.00000 0.000000 0.000000 ;
      vertex-indices[] <3;2;5;4;>
      vertex-normal-indices[] <3;2;5;4;> )
 (
normal -1.00000 0.000000 0.000000 ;
      vertex-indices[] <7;6;1;0;>
      vertex-normal-indices[] <7;6;1;0;> )
>
   normals[] <-0.577350 -0.577350 0.577350 ;
-0.577350 0.577350 0.577350 ;
0.577350 0.577350 0.577350 ;
0.577350 -0.577350 0.577350 ;
0.577350 -0.577350 -0.577350 ;
0.577350 0.577350 -0.577350 ;
-0.577350 0.577350 -0.577350 ;
-0.577350 -0.577350 -0.577350 ;
>
 )

Upvotes: 0

Views: 269

Answers (1)

Mark Streatfield

Reputation: 3209

To parse such a large construct I would avoid hand-crafting complex regular expressions; they will become too costly to maintain and debug.

I would instead take a look at pyparsing, which comes with quite a range of examples, or PLY (Python Lex-Yacc, the parsing library, not the PLY mesh format).

Either of these will let you parse the file in a more structured way, which should be more maintainable. They will also be easier to extend beyond simple cube examples to cover the full range of the Mirai file format.
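
To give a rough idea of what "structured" means here, below is a minimal sketch that uses only the standard library instead of pyparsing or PLY: a small regex tokenizer feeding a hand-written recursive-descent parser. The grammar is inferred purely from the cube sample in the question, not from any Mirai specification, so the rules (scalar fields ending in `;`, `name[] < ... >` arrays, `( ... )` groups) are assumptions. The names `Parser`, `parse`, `to_value` and `unwrap`, and the list-of-tuples output layout, are made up for illustration. With either library, each method would become a grammar rule.

```python
import re

# Tokens: quoted strings, the "[]" array marker, punctuation, bare words/numbers.
# Assumption: this token set is guessed from the cube sample only.
TOKEN_RE = re.compile(r'"[^"]*"|\[\]|[<>();]|[^\s<>();\[\]"]+')


def to_value(tok):
    """Turn a token into an unquoted str, an int, or a float where possible."""
    if tok.startswith('"'):
        return tok[1:-1]
    for cast in (int, float):
        try:
            return cast(tok)
        except ValueError:
            pass
    return tok  # e.g. "gx", "2.1.0.0", "Polyhedron-31"


def unwrap(vals):
    """A single value stays scalar; several values stay a list."""
    return vals[0] if len(vals) == 1 else vals


class Parser:
    def __init__(self, text):
        self.toks = TOKEN_RE.findall(text)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r} (token {self.pos})")
        self.pos += 1
        return tok

    def row(self):
        """row := value* ';'"""
        vals = []
        while self.peek() != ';':
            vals.append(to_value(self.take()))
        self.take(';')
        return unwrap(vals)

    def fields(self, closer=None):
        """fields := field*, read up to `closer` (None means end of input)."""
        out = []
        while self.peek() != closer:
            out.append(self.field())
        return out

    def field(self):
        name = self.take()
        if self.peek() == '[]':  # array: name[] < (fields) ... | row ... >
            self.take('[]')
            self.take('<')
            items = []
            while self.peek() != '>':
                if self.peek() == '(':
                    self.take('(')
                    items.append(self.fields(')'))
                    self.take(')')
                else:
                    items.append(self.row())
            self.take('>')
            return name, items
        vals = []
        while self.peek() not in (';', '('):
            vals.append(to_value(self.take()))
        if self.take() == ';':  # scalar: name value* ;
            return name, unwrap(vals)
        block = self.fields(')')  # block: name value* ( fields )
        self.take(')')
        return name, {'args': vals, 'fields': block}


def parse(text):
    """Return a list of (name, value) pairs; a list keeps repeats like `include`."""
    return Parser(text).fields()


if __name__ == "__main__":
    sample = 'filetype gx;\nbody Polyhedron-31 (\n  normals[] <-0.5 0.5 0.5 ;\n>\n)'
    for name, value in parse(sample):
        print(name, value)
```

Fields come back as `(name, value)` pairs rather than a dict because the header repeats `include`. A missing `;`, `>` or `)` raises `SyntaxError` with the token index, which is the kind of feedback that is hard to get from one large regex. Characters the tokenizer does not recognise are silently skipped, so a real importer would want stricter checks.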

Upvotes: 1
