LOST

Reputation: 3250

How to feed sequences to a TensorFlow Keras model?

I'd like to train a model that assigns a score to a variable-sized sequence of events. Each sequence is in its own file, and I start with a list of (file name, target score) pairs.

So I do something like this:

dataset = fileNames.map((fileName, score) => (new CsvDataset(fileName), score));

What I get is: NotImplementedError: The Dataset.map() transformation does not currently support nested datasets as outputs

I am using TensorFlow 1.10.

The question is: how do I load and feed (sequence, training score) pairs to a model? Is tf.data even a viable approach?

Upvotes: 1

Views: 395

Answers (1)

Sharky

Reputation: 4533

You need to create a dataset object before using any map function. The Dataset API is a perfectly viable option.

# batch_size is a required argument; 32 is an arbitrary example value
dataset = tf.contrib.data.make_csv_dataset(filenames, batch_size=32)

This function is available in TensorFlow 1.10. You can then use tf.data.Dataset.zip((dataset, labels)) to attach labels (labels must itself be a dataset), or map a parse function over the elements with dataset.map().
More on this:
https://www.tensorflow.org/api_docs/python/tf/data/Dataset
https://www.tensorflow.org/versions/r1.10/api_docs/python/tf/contrib/data/make_csv_dataset
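
As a minimal sketch of the zip step, assuming the features and the labels are each already available as a dataset: the tensors below are made-up stand-ins for parsed CSV rows and scores, not data from the question. Note that zip is called on the Dataset class and takes a tuple of datasets.

```python
import tensorflow as tf

# Hypothetical stand-ins for parsed CSV rows and their target scores.
features = tf.data.Dataset.from_tensor_slices([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
labels = tf.data.Dataset.from_tensor_slices([0.1, 0.5, 0.9])

# Dataset.zip is a static method that takes a tuple of datasets and
# pairs them up element by element as (feature_row, label).
dataset = tf.data.Dataset.zip((features, labels))
```

Both datasets should have the same number of elements; zip stops as soon as the shorter one is exhausted.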

EDIT 1:

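If you need to parse file by file, you can do:

import tensorflow as tf

x = ['1.csv', '2.csv']
y = [label_1, label_2]  # one target score per file

def parse_csv_func(filename, label):
    # Read the file, split it into lines and parse each line as one event.
    # Assumes no header row and only float columns.
    lines = tf.string_split([tf.read_file(filename)], '\n').values
    columns = tf.decode_csv(lines, [[0.0]] * number_of_columns)
    # Stack the columns into a [sequence_length, number_of_columns] tensor.
    return tf.stack(columns, axis=1), label

dataset = tf.data.Dataset.from_tensor_slices((x, y))
dataset = dataset.map(parse_csv_func)

Before the map step, each element looks like: b'1.csv' label_1

At that point the dataset contains the path to each CSV file and the corresponding label, so you can apply whatever parse function you want to each file separately. If your parse function returns a dataset rather than tensors, use dataset.flat_map() instead of dataset.map() to flatten the result.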
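
Because the sequences have different lengths, batching them requires padding. One alternative to parsing inside map, sketched here under assumptions rather than taken from the answer, is to read each file in plain Python, wrap the reader with Dataset.from_generator, and batch with padded_batch. The helper names load_sequence and make_dataset, the two-column layout, and the sample files are all hypothetical; the CSV files are assumed to have no header row and only numeric columns.

```python
import csv
import os
import tempfile

import tensorflow as tf

NUM_COLUMNS = 2  # assumption: every event row has two numeric columns


def load_sequence(path):
    """Read one CSV file into a list of float rows, one row per event."""
    with open(path, newline="") as handle:
        return [[float(v) for v in row] for row in csv.reader(handle) if row]


def make_dataset(file_names, scores, batch_size):
    """Build a dataset of padded (sequences, scores) batches."""
    def generate():
        for file_name, score in zip(file_names, scores):
            yield load_sequence(file_name), score

    shapes = (tf.TensorShape([None, NUM_COLUMNS]), tf.TensorShape([]))
    dataset = tf.data.Dataset.from_generator(
        generate, output_types=(tf.float32, tf.float32), output_shapes=shapes)
    # Pad each sequence in a batch with zeros up to the longest one in that batch.
    return dataset.padded_batch(batch_size, padded_shapes=shapes)


# Hypothetical sample data: two files holding sequences of different length.
tmp_dir = tempfile.mkdtemp()
paths = [os.path.join(tmp_dir, name) for name in ("1.csv", "2.csv")]
with open(paths[0], "w") as handle:
    handle.write("1.0,2.0\n3.0,4.0\n")
with open(paths[1], "w") as handle:
    handle.write("5.0,6.0\n")

dataset = make_dataset(paths, [0.25, 0.75], batch_size=2)
```

Padding uses zeros by default, so the model needs some way, such as masking, to ignore the padded steps.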

Upvotes: 1
