Reputation: 198
I'm trying to get generated text from the TFGPT2Model in the Transformers library. I can see the output tensor, but I'm not able to decode it. Is the tokenizer not compatible with the TF model for decoding?
Code is:
import tensorflow as tf
from transformers import (
    TFGPT2Model,
    GPT2Tokenizer,
    GPT2Config,
)
model_name = "gpt2-medium"
config = GPT2Config.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = TFGPT2Model.from_pretrained(model_name, config=config)
input_ids = tf.constant(tokenizer.encode("Hello, my dog is cute",
                                         add_special_tokens=True))[None, :]  # Batch size 1
outputs = model(input_ids)
print(outputs[0])
result = tokenizer.decode(outputs[0])
print(result)
The resulting output is:
$ python run_tf_gpt2.py
2020-04-16 23:43:11.753181: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-04-16 23:43:11.777487: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2020-04-16 23:43:27.617982: W tensorflow/python/util/util.cc:319] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
2020-04-16 23:43:27.693316: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-04-16 23:43:27.824075: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
...
...
2020-04-16 23:43:38.149860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10565 MB memory) -> physical GPU (device: 1, name: Tesla K80, pci bus id: 0000:25:00.0, compute capability: 3.7)
2020-04-16 23:43:38.150217: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-16 23:43:38.150913: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10565 MB memory) -> physical GPU (device: 2, name: Tesla K80, pci bus id: 0000:26:00.0, compute capability: 3.7)
2020-04-16 23:43:44.438587: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
tf.Tensor(
[[[ 0.671073 0.60760975 -0.10744217 ... -0.51132596 -0.3369941
0.23458953]
[ 0.6403012 0.00396247 0.7443729 ... 0.2058892 -0.43869907
0.2180479 ]
[ 0.5131284 -0.35192695 0.12285632 ... -0.30060387 -1.0279727
0.13515341]
[ 0.3083361 -0.05588413 1.0543617 ... -0.11589152 -1.0487361
0.05204075]
[ 0.70787597 -0.40516227 0.4160383 ... 0.44217822 -0.34975922
0.02535546]
[-0.03940453 -0.1243843 0.40204537 ... 0.04586177 -0.48230025
0.5768887 ]]], shape=(1, 6, 1024), dtype=float32)
Traceback (most recent call last):
  File "run_tf_gpt2.py", line 19, in <module>
    result = tokenizer.decode(outputs[0])
  File "/home/.../transformers/src/transformers/tokenization_utils.py", line 1605, in decode
    filtered_tokens = self.convert_ids_to_tokens(token_ids, skip_special_tokens=skip_special_tokens)
  File "/home/.../transformers/src/transformers/tokenization_utils.py", line 1575, in convert_ids_to_tokens
    index = int(index)
  File "/home/.../venv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 853, in __int__
    return int(self._numpy())
TypeError: only size-1 arrays can be converted to Python scalars
(I trimmed most of the TF log messages and redacted the paths from my environment.)
Upvotes: 2
Views: 2279
Reputation: 36
You are using the wrong GPT-2 class. TFGPT2Model is the bare transformer and returns hidden states (the float tensor of shape (1, 6, 1024) in your output), not token ids, so there is nothing for the tokenizer to decode. I tried your example with TFGPT2LMHeadModel, which is the same transformer with a language modeling head on top; its forward pass returns prediction_scores. To get token ids that can be decoded, you need to call model.generate(input_ids). By default, a greedy search is performed.
import tensorflow as tf
from transformers import (
    TFGPT2LMHeadModel,
    GPT2Tokenizer,
    GPT2Config,
)
model_name = "gpt2-medium"
config = GPT2Config.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = TFGPT2LMHeadModel.from_pretrained(model_name, config=config)
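# Encode the prompt and add a leading batch dimension (batch size 1)
input_ids = tf.constant(tokenizer.encode("Hello, my dog is cute", add_special_tokens=True))[None, :]
# generate() returns integer token ids (prompt plus continuation), not hidden states
outputs = model.generate(input_ids=input_ids)
print(outputs[0])
# Decode the first (and only) sequence in the batch back to text
result = tokenizer.decode(outputs[0])
print(result)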
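To make the difference between scores and decodable ids concrete, here is a minimal pure-Python sketch of what a greedy search does. Everything in it is hypothetical and only for illustration: the five-word VOCAB, the toy_scores function, and the greedy_generate and decode helpers are stand-ins I made up, not the Transformers API. The point is just that generation repeatedly picks the highest-scoring next token, and that decoding only works on the resulting integer ids.

```python
# Toy illustration only: a hypothetical 5-token vocabulary and a made-up
# scoring rule stand in for the real tokenizer and language model.
VOCAB = ["<eos>", "hello", "my", "dog", "barks"]


def toy_scores(token_ids):
    """Return one score per vocabulary entry for the next token (made-up rule)."""
    next_id = (token_ids[-1] + 1) % len(VOCAB)
    return [1.0 if i == next_id else 0.0 for i in range(len(VOCAB))]


def greedy_generate(token_ids, max_new_tokens=10, eos_id=0):
    """Repeatedly append the highest-scoring next token (greedy search)."""
    ids = list(token_ids)
    for _ in range(max_new_tokens):
        scores = toy_scores(ids)
        # argmax over the vocabulary scores
        next_id = max(range(len(scores)), key=scores.__getitem__)
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids


def decode(token_ids):
    """Map integer ids back to text; floats (scores, hidden states) would not work here."""
    return " ".join(VOCAB[i] for i in token_ids)


generated = greedy_generate([1])  # start from the id of "hello"
print(generated)
print(decode(generated))
```

In the real library the scores come from the language modeling head and the loop lives inside model.generate; the hidden states returned by the bare model never go through the argmax step, which is why they cannot be passed to tokenizer.decode.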
Upvotes: 2