Reputation: 30767

Merge string tensors in TensorFlow

I work with a lot of dtype="str" data. I've been trying to build a simple graph as in https://www.tensorflow.org/versions/master/api_docs/python/train.html#SummaryWriter.

For a simple operation, I wanted to concatenate strings together using a placeholder, as in "How to feed a placeholder?".

Does anyone know how to merge string tensors together?

import tensorflow as tf
sess = tf.InteractiveSession()

with tf.name_scope("StringSequence") as scope:
    left = tf.constant("aaa", name="LEFT")
    middle = tf.placeholder(dtype=tf.string, name="MIDDLE")
    right = tf.constant("ccc", name="RIGHT")
    complete = tf.add_n([left, middle, right], name="COMPLETE")  # fails here
sess.run(complete, feed_dict={middle: "BBB"})
#writer = tf.train.SummaryWriter("/users/mu/test_out/", sess.graph_def)

Upvotes: 7

Answers (4)

mrry

Reputation: 126194

Thanks to your question, we prioritized adding support for string concatenation in TensorFlow, and added it in this commit. String concatenation is implemented using the existing tf.add() operator, to match the behavior of NumPy's add operator (including broadcasting).

To implement your example, you can write:

complete = left + middle + right

…or, equivalently, if you want to name the resulting tensor:

complete = tf.add(tf.add(left, middle), right, name="COMPLETE")

We have not yet added support for strings in tf.add_n() (or related ops like tf.reduce_sum()) but will consider this if there are use cases for it.
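
Until then, one possible workaround for a list of pieces is to fold the list with the `+` operator, which is what `left + middle + right` does by hand. The sketch below is only an illustration: `concat_all` is a hypothetical helper name, not a TensorFlow function, and it assumes every element supports `+` (as string tensors do once `tf.add()` accepts strings). Plain Python strings stand in for tensors so the snippet runs without TensorFlow installed.

```python
from functools import reduce
import operator


def concat_all(parts):
    """Fold a non-empty sequence with `+`, left to right.

    Hypothetical helper, not part of TensorFlow. It relies only on the
    elements overloading `+`, so it should apply equally to string tensors
    (assumption: `+` maps to tf.add for them, as described above).
    """
    parts = list(parts)
    if not parts:
        raise ValueError("concat_all() needs at least one element")
    return reduce(operator.add, parts)


# Plain strings are used here as stand-ins for string tensors.
print(concat_all(["aaa", "BBB", "ccc"]))  # aaaBBBccc
```

With tensors, the call would presumably look like `complete = concat_all([left, middle, right])`. Note that the result of a fold like this carries no explicit name, so the nested `tf.add(..., name="COMPLETE")` form is still the one to use when the name matters.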

NOTE: To use this functionality immediately, you will need to build TensorFlow from source. The new op will be available in the next release of TensorFlow (0.7.0).

Upvotes: 18

Guy Coder

Reputation: 24996

I know this is not an immediate answer, but I don't want it to remain hidden in the comments.

If you'd like to incorporate an operation that isn't covered by the existing library, you can create a custom Op. To incorporate your custom Op, you'll need to:

1. Register the new Op in a C++ file. The Op registration is independent of the implementation, and describes the semantics of how the Op is invoked. For example, it defines the Op name, and specifies its inputs and outputs.
2. Implement the Op in C++. This implementation is called a "kernel", and there can be multiple kernels for different architectures (e.g. CPUs, GPUs) or input / output types.
3. Create a Python wrapper. This wrapper is the public API to create the Op. A default wrapper is generated from the Op registration, which can be used directly or added to.
4. Optionally, write a function to compute gradients for the Op.
5. Optionally, write a function that describes the input and output shapes for the Op. This allows shape inference to work with your Op.
6. Test the Op, typically in Python. If you define gradients, you can verify them with the Python GradientChecker.

What you asked is very relevant and will probably become one of the higher Google search results for using the string type with TensorFlow. As such, this avenue to a solution needs to be made available so that others are aware it exists.

Upvotes: 3