siby

Reputation: 777

How to flatten a tensorflow dataset along feature columns when using data.make_csv_dataset?

I am using tf.contrib.data.make_csv_dataset to read CSV files having differing numbers of feature columns.

After reading each file I want to concatenate all the feature columns.

import tensorflow as tf  # TensorFlow 1.x; tf.contrib was removed in 2.x

# file_names is a list of CSV paths defined elsewhere.
dataset = tf.contrib.data.make_csv_dataset(
    file_names[0],
    48,
    select_columns=['Load_residential_multi_0', 'Load_residential_multi_1'],
    shuffle=False)
dataset = dataset.batch(2)
get_batch = dataset.make_one_shot_iterator()
get_batch = get_batch.get_next()
with tf.Session() as sess:
    power_data = sess.run(get_batch)
print(power_data.keys())

The code above returns an ordered dictionary with two keys:

odict_keys(['Load_residential_multi_0', 'Load_residential_multi_1'])

I can access individual features by name. For example, power_data['Load_residential_multi_0'] gives:

array([[0.075 , 0.1225, 0.0775, 0.12  ],
       [0.0875, 0.1125, 0.095 , 0.1025]], dtype=float32)

However, I want the two feature columns, 'Load_residential_multi_0' and 'Load_residential_multi_1', to be concatenated.

I think I can do this using dataset.flat_map(map_func), but I am not sure what to pass as the argument to flat_map().

Upvotes: 0

Views: 732

Answers (1)

Vijay Mariappan

Reputation: 17191

Use dataset.map to stack the dictionary values into a single tensor:

dataset = dataset.map(lambda x: tf.stack(list(x.values())))
get_batch = dataset.make_one_shot_iterator()
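
To see what shape the combined tensor ends up with, here is a small NumPy sketch rather than TensorFlow code. It assumes np.stack and np.concatenate treat the axis argument the same way tf.stack and tf.concat do. The batch mimics the (2, 4) arrays from the question, and the values for the second column are placeholders, not real data.

```python
from collections import OrderedDict

import numpy as np

# One batch shaped like the question's output: each feature is (2, 4).
# The second column is filled with zeros purely for illustration.
batch = OrderedDict([
    ("Load_residential_multi_0",
     np.array([[0.075, 0.1225, 0.0775, 0.12],
               [0.0875, 0.1125, 0.095, 0.1025]], dtype=np.float32)),
    ("Load_residential_multi_1", np.zeros((2, 4), dtype=np.float32)),
])

values = list(batch.values())

# Stacking adds a new axis; with the default axis=0 the features come first.
stacked = np.stack(values)

# Stacking on the last axis keeps the batch layout and puts features last.
stacked_last = np.stack(values, axis=-1)

# Concatenating joins the columns end to end without adding an axis.
concatenated = np.concatenate(values, axis=-1)

print(stacked.shape, stacked_last.shape, concatenated.shape)
```

If the goal is a wider feature dimension rather than an extra axis, the concatenate form is probably closer to what the question asks for. In the dataset pipeline that would mean using tf.concat with an explicit axis inside the map function in place of tf.stack.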

Upvotes: 1
