Amily

Reputation: 335

tensorflow, decode_csv with variable data length

I want to write a sequence-to-sequence model using TensorFlow. Each row of my input data has the shape

[input_length, target_length, input, target]

and the rows all have different lengths. How can I use tf.decode_csv? I tried building record_defaults using the maximum input length, but I get "All shapes must be fully defined in record_defaults". I can't figure this out.

    csv_file = tf.train.string_input_producer([file_name], name='file_name')
    reader = tf.TextLineReader()
    _, line = reader.read(csv_file)
    # One default per column; every row must have exactly this many fields.
    record_defaults = [[0] for _ in range(20)]
    # decode_csv returns a list of scalar tensors; stack them into one 1-D tensor
    # so that tf.slice can be applied.
    data = tf.stack(tf.decode_csv(line, record_defaults=record_defaults, field_delim=','))
    len_input = tf.slice(data, [0], [1])
    len_target = tf.slice(data, [1], [1])
    input_seq = tf.slice(data, [2], len_input)
    target = tf.slice(data, 2 + len_input, len_target)

Upvotes: 0

Views: 598

Answers (1)

Peter Hawkins

Reputation: 3211

Yes, tf.decode_csv does require all rows to have the same number of fields. If this does not work for you, consider filing a feature request on GitHub.

You could also preprocess your CSV file to pad all of the rows out to the same number of columns; the record_defaults argument to tf.decode_csv then supplies default values for the padded, empty fields.
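A minimal sketch of that preprocessing step, using only the Python standard library. The helper name pad_csv_rows and the pad value are my own inventions, not part of any TensorFlow API; once every row has the same width, a fixed-length record_defaults list works with tf.decode_csv:

```python
import csv
import io

def pad_csv_rows(text, pad="0"):
    """Pad every CSV row to the width of the longest row.

    Returns new CSV text in which shorter rows are extended with
    the `pad` value, so a fixed-length record_defaults can be used.
    """
    rows = list(csv.reader(io.StringIO(text)))
    width = max(len(row) for row in rows)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow(row + [pad] * (width - len(row)))
    return out.getvalue()

# Example: two variable-length rows become equal-width rows.
padded = pad_csv_rows("2,1,5,7,9\n1,1,3,4\n")
```

After padding, record_defaults = [[0]] * width (one default per column) satisfies the fully-defined-shape requirement, and the length fields in columns 0 and 1 still tell you how much of each row is real data.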

Upvotes: 1

Related Questions