Reputation: 73
I am trying to read the contents of a Visio binary .VSD file, which contains information from a graph I have made.
I have tried oletools and olefile but cannot read the contents correctly. I can open the file with oletools, but when I dump the contents and view the dump with the 'xxd' command (in a terminal), I can't clearly see the text that I saved in the file. The dump contains a lot of extra bytes such as \x00 and \xff, and removing them only makes it worse. I've done exactly the same with a .doc file and was able to open it and read the contents clearly.
Can anyone please tell me whether I am doing something wrong, or point me towards other tools that handle this format?
Upvotes: 1
Views: 3601
Reputation: 73
Thanks for all the help.
I have found a way to extract plain text from the file, or to convert it to XHTML and parse that. The main problem is that I now lose any structure the original document may have had.
The tools are in the libvisio-tools package: https://launchpad.net/ubuntu/trusty/+package/libvisio-tools
Installing it gives you the programs vsd2xhtml, vsd2raw and vsd2text, which can be run from a terminal to convert the files.
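As a rough sketch of the "convert to XHTML and parse that" step, the converter can be driven from Python. This assumes that vsd2xhtml is on the PATH, that it takes the input file as its only argument and writes the XHTML to standard output, and that the output is well-formed XML; the helper names vsd_to_xhtml and text_fragments are made up for this example.

```python
import subprocess
import xml.etree.ElementTree as ET


def vsd_to_xhtml(vsd_path, tool="vsd2xhtml"):
    """Run the converter and return whatever it wrote to stdout, as bytes.

    Assumption: the tool is invoked as `vsd2xhtml <file>` and prints XHTML.
    """
    result = subprocess.run([tool, vsd_path], capture_output=True, check=True)
    return result.stdout


def text_fragments(xhtml):
    """Return the non-empty text fragments of the document, in document order."""
    root = ET.fromstring(xhtml)
    return [t.strip() for t in root.itertext() if t.strip()]
```

Calling text_fragments(vsd_to_xhtml("diagram.vsd")) (with a placeholder file name) should then give the text pieces in the order they appear in the converted page. That order is all that survives; which shape a piece of text belonged to is not recovered by this approach.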
Upvotes: 1
Reputation: 12245
You have really picked a strong enemy :)
Unlike the binary formats of the other Office apps, the Visio .vsd format is not really Microsoft's "compound document" format; that part is basically just a wrapper. The format itself was created by Visio Corp back in 199x and, AFAIK, was never publicly documented.
I would really recommend that you NOT go with binary .VSD if possible. Recent versions of Visio support the standard Open XML-based format (.vsdx), which is basically just a bunch of zipped XML files.
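To illustrate how little is needed for .vsdx, here is a minimal sketch using only the Python standard library. It assumes the usual package layout, with page parts named visio/pages/page1.xml, page2.xml and so on, and shape text stored in Text elements in the Visio 2012 main namespace; the function name vsdx_shape_texts is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespace of the page XML inside a .vsdx package.
VISIO_NS = "{http://schemas.microsoft.com/office/visio/2012/main}"


def vsdx_shape_texts(source):
    """Map each page part name to the list of shape texts found in it.

    `source` is a path or a file-like object, as accepted by zipfile.ZipFile.
    """
    texts = {}
    with zipfile.ZipFile(source) as archive:
        for name in sorted(archive.namelist()):
            # Skip the page index (pages.xml); keep page1.xml, page2.xml, ...
            if name == "visio/pages/pages.xml":
                continue
            if not (name.startswith("visio/pages/page") and name.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(name))
            found = []
            for node in root.iter(VISIO_NS + "Text"):
                # Text may be split around formatting markers, so join the pieces.
                text = "".join(node.itertext()).strip()
                if text:
                    found.append(text)
            texts[name] = found
    return texts
```

Because each Text element sits inside its Shape element, the same traversal can be extended to keep shape IDs or nesting, which is the structure a plain-text dump of the binary file does not give you.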
AFAIK the only known third-party library that understands binary .vsd is Aspose.Diagram, but it's not free.
Upvotes: 1