Rick Lentz

Reputation: 503

Processing TensorFlow Records that are XML (text)

I would like to use TensorFlow to process XML strings stored as TFRecords, and I'm trying to understand how to structure the code that parses each record. A set of input rules and data type mappings is applied to each input record to produce an output TFRecord.

Example input TFRecord:

<PLANT><COMMON>Shooting Star</COMMON><BOTANICAL>Dodecatheon</BOTANICAL><ZONE>Annual</ZONE><LIGHT>Mostly Shady</LIGHT><PRICE>$8.60</PRICE><EXTREF><REF1><ID>608</ID><TYPE>LOOKUP</TYPE></REF1><REF2><ID>703</ID><TYPE>STD</TYPE></REF2></EXTREF><AVAILABILITY>051399</AVAILABILITY></PLANT>

The rules specify what needs to be parsed and how it needs to be formatted. For example: find the COMMON, PRICE, EXTREF>REF2>ID and AVAILABILITY elements and export their values as a TFRecord.

Example output TFRecord:

Shooting Star,8.60,703,51399

How do I add this logic to a graph so that, when it executes, it produces the output TFRecord? My initial thought is that I need to translate the mapping logic into a series of tf.ops...

Upvotes: 0

Views: 1317

Answers (1)

Yanfeng Liu

Reputation: 629

I believe this link will be very helpful to you. It specifies the exact format that the TFRecord needs, and it provides the code to turn your own dataset into a TFRecord file.

However, that link does not mention XML files; it only covers how to create a tf_example and turn it into a TFRecord. This link goes a step back and shows how to turn an XML file into a tf_example. Note that it will need some modification to fit your needs because it was written for the Oxford Pet Dataset.
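As a rough illustration of the XML-to-tf_example step for the plant record in the question, here is a minimal sketch. It is not taken from either link. The `RULES` table, the output field names, and the helper functions `parse_plant`, `to_tf_example` and `write_tfrecord` are all hypothetical names chosen for this example. It assumes each record is one well-formed XML string (so `REF1` and `REF2` need their closing tags, otherwise the parser rejects the input), and it does the parsing in plain Python with `xml.etree.ElementTree` before the data reaches TensorFlow, rather than as ops inside the graph.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the sample record from the question.
SAMPLE = (
    "<PLANT><COMMON>Shooting Star</COMMON><BOTANICAL>Dodecatheon</BOTANICAL>"
    "<ZONE>Annual</ZONE><LIGHT>Mostly Shady</LIGHT><PRICE>$8.60</PRICE>"
    "<EXTREF><REF1><ID>608</ID><TYPE>LOOKUP</TYPE></REF1>"
    "<REF2><ID>703</ID><TYPE>STD</TYPE></REF2></EXTREF>"
    "<AVAILABILITY>051399</AVAILABILITY></PLANT>"
)

# Hypothetical rule table: (output name, path below the root, type converter).
RULES = [
    ("common", "COMMON", str),
    ("price", "PRICE", lambda s: float(s.lstrip("$"))),
    ("ref2_id", "EXTREF/REF2/ID", int),
    ("availability", "AVAILABILITY", int),
]


def parse_plant(xml_string):
    """Apply RULES to one XML record and return a dict of typed values."""
    root = ET.fromstring(xml_string)
    return {name: convert(root.findtext(path)) for name, path, convert in RULES}


def to_tf_example(fields):
    """Wrap the parsed fields in a tf.train.Example (requires TensorFlow)."""
    import tensorflow as tf  # imported lazily so parsing works without it

    feature = {}
    for name, value in fields.items():
        if isinstance(value, str):
            feature[name] = tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[value.encode("utf-8")]))
        elif isinstance(value, float):
            feature[name] = tf.train.Feature(
                float_list=tf.train.FloatList(value=[value]))
        else:
            feature[name] = tf.train.Feature(
                int64_list=tf.train.Int64List(value=[value]))
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_tfrecord(xml_strings, out_path):
    """Parse each XML string and write one serialized Example per record."""
    import tensorflow as tf

    with tf.io.TFRecordWriter(out_path) as writer:
        for xml_string in xml_strings:
            example = to_tf_example(parse_plant(xml_string))
            writer.write(example.SerializeToString())
```

Calling `parse_plant(SAMPLE)` yields the four values from the question's expected output, with the leading zero of AVAILABILITY dropped by the integer conversion. `write_tfrecord([SAMPLE], "plants.tfrecord")` would then store them as a single record; the file name is only a placeholder.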

Upvotes: 1
