Arthur Chaparyan

Reputation: 2025

Extract Text from SWF

We currently use print2flash (http://print2flash.com) to convert user-submitted documents (Word, RTF, PowerPoint, etc.) into Flash-based documents that can be viewed online (à la docstoc and scribd).

We would like to extract the text inside these SWF files so that we can build a full-text index. Are there any tools or libraries we can use to accomplish this?

We are developing in ASP.NET / C# and have tried third-party tools such as SWFTools (http://www.swftools.org), but the results have been inconsistent and subpar.

PS: We would like to do the indexing after the original document has been converted to Flash, because that leaves us with fewer file formats to deal with.

Upvotes: 0

Views: 3389

Answers (1)

ashes999

Reputation: 10163

Your best bet is a third-party Flash parsing library. SWF is a dense, bit-packed binary format, so it's painful to parse by hand. Having said that, the format is well understood. You can find the official specification here: http://www.adobe.com/devnet/swf/
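To give a feel for what parsing involves, here is a minimal sketch in Python (not C#, and not a full extractor) of the first step: reading the SWF header and walking the tag list. It assumes the layout described in the specification: a 3-byte signature (`FWS` uncompressed, `CWS` zlib-compressed after the first 8 bytes), a bit-packed frame rectangle, then a sequence of tags with short or long record headers. The function names `iter_swf_tags` and `find_text_tags` are made up for illustration. LZMA-compressed (`ZWS`) files are not handled.

```python
import struct
import zlib

# Tag codes that carry text, as listed in the SWF specification.
TEXT_TAGS = {11: "DefineText", 33: "DefineText2", 37: "DefineEditText"}


def iter_swf_tags(data: bytes):
    """Yield (tag_code, tag_body) pairs from the raw bytes of an SWF file."""
    signature = data[:3]
    if signature == b"CWS":
        # Everything after the 8-byte header is zlib-compressed.
        body = zlib.decompress(data[8:])
    elif signature == b"FWS":
        body = data[8:]
    else:
        raise ValueError(f"unsupported SWF signature: {signature!r}")

    # Frame size is a RECT: 5 bits giving the field width, then four
    # fields of that width, padded to a whole number of bytes.
    nbits = body[0] >> 3
    rect_bytes = (5 + 4 * nbits + 7) // 8
    pos = rect_bytes + 4  # skip frame rate (2 bytes) and frame count (2 bytes)

    while pos + 2 <= len(body):
        (header,) = struct.unpack_from("<H", body, pos)
        pos += 2
        code, length = header >> 6, header & 0x3F
        if length == 0x3F:
            # Long record header: the real length follows as 4 bytes.
            (length,) = struct.unpack_from("<I", body, pos)
            pos += 4
        yield code, body[pos:pos + length]
        pos += length
        if code == 0:  # End tag
            break


def find_text_tags(data: bytes):
    """Return (tag_name, body_length) for every text-bearing tag found."""
    return [(TEXT_TAGS[code], len(body))
            for code, body in iter_swf_tags(data)
            if code in TEXT_TAGS]
```

Calling `find_text_tags(open("document.swf", "rb").read())` (the file name is a placeholder) tells you which text tags a converted document actually contains. Keep in mind this is only the easy part: `DefineText` and `DefineText2` records store glyph indices rather than characters, so recovering readable text also means decoding the matching font tags and their code tables, which is where a dedicated library earns its keep.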

Upvotes: 1
