Reputation: 1263
I am trying to implement a TensorFlow DNNRegressor that uses a tensor with multiple labels, but it keeps failing with an error that I don't understand. I did 95% of the tests on TensorFlow 1.4.1 and just switched to 1.5.0 / CUDA 9, but it's still failing (you know, I was just hoping :))
As a reference, I used the Boston example and the pandas input fn source code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/input_fn/boston.py and https://github.com/tensorflow/tensorflow/blob/r1.5/tensorflow/python/estimator/inputs/pandas_io.py
At the following gist you can find the full Python code, the produced output, the training data and the (currently unused) test data. The training data and the test data are very small; they are just enough to build the code. https://gist.github.com/anonymous/c3e9fbe5f5faf373fa230909347318cd
The error message is the following (the stack trace is in the gist; I didn't post it here to avoid polluting the post):
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [labels shape must be [batch_size, 20]] [Condition x == y did not hold element-wise:] [x (dnn/head/labels/assert_equal/x:0) = ] [20] [y (dnn/head/labels/strided_slice:0) = ] [3] [[Node: dnn/head/labels/assert_equal/Assert/Assert = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT32, DT_STRING, DT_INT32], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](dnn/head/labels/assert_equal/All/_151, dnn/head/labels/assert_equal/Assert/Assert/data_0, dnn/head/labels/assert_equal/Assert/Assert/data_1, dnn/head/labels/assert_equal/Assert/Assert/data_2, dnn/head/logits/assert_equal/x/_153, dnn/head/labels/assert_equal/Assert/Assert/data_4, dnn/head/labels/strided_slice/_155)]]
The input_fn is the following
from tensorflow.python.estimator.inputs.queues import feeding_functions

def get_input_fn(dataset,
                 model_labels=None,
                 batch_size=128,
                 num_epochs=1,
                 shuffle=None,
                 queue_capacity=1000,
                 num_threads=1):

    dataset = dataset.copy()

    if queue_capacity is None:
        if shuffle:
            queue_capacity = 4 * len(dataset)
        else:
            queue_capacity = len(dataset)

    min_after_dequeue = max(queue_capacity / 4, 1)

    def input_fn():
        queue = feeding_functions._enqueue_data(
            dataset,
            queue_capacity,
            shuffle=shuffle,
            min_after_dequeue=min_after_dequeue,
            num_threads=num_threads,
            enqueue_size=batch_size,
            num_epochs=num_epochs)

        if num_epochs is None:
            features = queue.dequeue_many(batch_size)
        else:
            features = queue.dequeue_up_to(batch_size)

        assert len(features) == len(dataset.columns) + 1, ('Features should have one '
                                                           'extra element for the index.')

        features = features[1:]
        features = dict(zip(list(dataset.columns), features))

        if model_labels is not None:
            #labels = tf.stack([features.pop(model_label) for model_label in model_labels], 0);
            labels = [features.pop(model_label) for model_label in model_labels]
            return features, labels

        return features

    return input_fn
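For reference, here is a minimal standalone shape check (not from the gist; the sizes are made up) of how tf.stack arranges a list of label tensors, since the assertion above mentions a [batch_size, 20] shape:

import tensorflow as tf

# Illustrative sizes only: 20 label columns and a batch of 3 rows
num_labels, batch = 20, 3
label_columns = [tf.zeros([batch]) for _ in range(num_labels)]

stacked_axis0 = tf.stack(label_columns, axis=0)  # shape [20, 3]
stacked_axis1 = tf.stack(label_columns, axis=1)  # shape [3, 20]

with tf.Session() as sess:
    print(sess.run(tf.shape(stacked_axis0)))  # [20  3]
    print(sess.run(tf.shape(stacked_axis1)))  # [ 3 20]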
I am able to train and predict with the following input fn, but it doesn't look fit to handle the amount of data I want to use for training later. In addition, it gets stuck when I use it with the evaluate method.
def get_input_fn(dataset,
                 model_labels=None):

    def input_fn():
        # Feed each feature column as a [num_rows, 1] constant tensor
        features = {k: tf.constant(dataset[k].values, shape=[dataset[k].size, 1]) for k in model_features}
        if model_labels is not None:
            labels_data = []
            for i in range(0, len(dataset)):
                temp = []
                for label in model_labels:
                    temp.append(dataset[label].values[i])
                labels_data.append(temp)

            labels = tf.constant(labels_data, shape=[len(dataset), len(model_labels)])

            return features, labels
        else:
            return features

    return input_fn
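Just for context, the wiring into the estimator looks roughly like this (model_features, model_labels and training_set stand for the structures built from the seed data in the gist; the step count is a placeholder):

import tensorflow as tf

# Illustrative wiring only: `model_features`, `model_labels` and `training_set`
# stand for the structures built from the seed data in the gist.
feature_cols = [tf.feature_column.numeric_column(k) for k in model_features]
regressor = tf.estimator.DNNRegressor(feature_columns=feature_cols,
                                      label_dimension=len(model_labels),
                                      hidden_units=[4096, 2048, 1024, 512])

regressor.train(input_fn=get_input_fn(training_set, model_labels=model_labels),
                steps=1000)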
Thanks!
Note: if you check the full code in the gist, you will notice that the number of features and labels depends on the number of categories; they are built dynamically from the seed data. I could probably switch to an RNN and map each epoch to a category instead of building that huge matrix, but currently I am focused on getting this test working.
Upvotes: 1
Views: 763
Reputation: 1263
In the end I slightly changed my approach: the test code has been split into prepare.py and train.py. prepare.py writes the data into some CSVs (the input data and the categories), and in train.py I replaced the input fn with one that loads those CSVs, builds a dataset, and parses the dataset lines using tf.decode_csv (plus some additional stuff).
csv_field_defaults = [[0]] * (1 + len(model_features) + len(model_labels))

def _parse_line(line):
    fields = tf.decode_csv(line, csv_field_defaults)

    # Remove the user id
    fields.pop(0)

    features = dict(zip(model_features + model_labels, fields))
    labels = tf.stack([features.pop(model_label) for model_label in model_labels])

    return features, labels

def csv_input_fn(csv_path, batch_size):
    dataset = tf.data.TextLineDataset(csv_path).skip(1)
    dataset = dataset.map(_parse_line)
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
    return dataset.make_one_shot_iterator().get_next()

# Initialize tensor flow
tf.logging.set_verbosity(tf.logging.INFO)

# Initialize the neural network
feature_cols = [tf.feature_column.numeric_column(k) for k in model_features]
regressor = tf.estimator.DNNRegressor(feature_columns=feature_cols,
                                      label_dimension=len(model_labels),
                                      hidden_units=[4096, 2048, 1024, 512],
                                      model_dir="tf_model")
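For completeness, training and evaluation then just pass csv_input_fn through a lambda (the file names, batch size and step counts below are placeholders):

batch_size = 100

# Placeholder file names and step counts, just to show how csv_input_fn is used
regressor.train(input_fn=lambda: csv_input_fn("training_data.csv", batch_size),
                steps=1000)

eval_result = regressor.evaluate(input_fn=lambda: csv_input_fn("test_data.csv", batch_size),
                                 steps=100)
print(eval_result)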
I am currently able to handle 10,000 records, but I will need to parse way more data; I hope this implementation performs better.
The csv_input_fn is from the TensorFlow examples as-is, while I modified _parse_line to handle the features and the labels as needed.
Upvotes: 1