yiqiang. zhao

Reputation: 21

How to use tensor2tensor to classify text?

I want to do binary text classification with tensor2tensor using attention only, with no LSTM or CNN preprocessing layers. I think the transformer_encoder model is the best fit, but I can't find a predefined Problem or HParams set for it. Can anyone share a text classification example using tensor2tensor, or offer other advice?

Upvotes: 2

Views: 1959

Answers (2)

Yuwen Yan

Reputation: 4935

Try this:

PROBLEM=sentiment_imdb
MODEL=transformer_encoder
HPARAMS=transformer_tiny

DATA_DIR=$HOME/t2t_data
TMP_DIR=/tmp/t2t_datagen
TRAIN_DIR=$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS

mkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR

# Generate data
t2t-datagen \
  --data_dir=$DATA_DIR \
  --tmp_dir=$TMP_DIR \
  --problem=$PROBLEM

# Train
# If you run out of memory, add --hparams='batch_size=1024'.
t2t-trainer \
  --data_dir=$DATA_DIR \
  --problem=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR
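
Before starting a long run, it may help to confirm that the variables expand the way you expect. The following is only a sketch of such a check, using the same names and values as above; the `${VAR:?message}` guards are plain POSIX shell and not part of tensor2tensor.

```shell
# Shell assignments must not have spaces around "=". Writing
# "PROBLEM= sentiment_imdb" would leave PROBLEM empty and try to run
# sentiment_imdb as a command.
PROBLEM=sentiment_imdb
MODEL=transformer_encoder
HPARAMS=transformer_tiny

# Abort with a message if any of the three is unset or empty.
: "${PROBLEM:?PROBLEM is empty}"
: "${MODEL:?MODEL is empty}"
: "${HPARAMS:?HPARAMS is empty}"

TRAIN_DIR="$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS"

# Print the resulting output directory so it can be inspected by eye.
echo "Checkpoints will be written to: $TRAIN_DIR"
```

If the printed path ends in `sentiment_imdb/transformer_encoder-transformer_tiny`, the datagen and trainer commands should receive the intended values.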

Upvotes: 0

I would recommend following their sentiment_imdb problem, since sentiment analysis is a text-classification problem:

https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/imdb.py

They also have a brief section about training a transformer_encoder for this problem on the main page:

https://github.com/tensorflow/tensor2tensor#sentiment-analysis
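
If you want to classify your own data rather than IMDB, one possible route is to copy the shape of the sentiment_imdb problem into a small user directory. The sketch below writes such a directory from the shell. Everything named here is an assumption made for illustration: the directory `$HOME/t2t_usr`, the class `MyBinaryTextClass`, the label names, and the input files `train.tsv` and `dev.tsv` (one `label<TAB>text` pair per line, placed in the tmp dir). It is modelled on imdb.py, not taken from it, so check it against the version of tensor2tensor you have installed.

```shell
# Hypothetical location for user-defined problems.
USR_DIR="${USR_DIR:-$HOME/t2t_usr}"
mkdir -p "$USR_DIR"

# __init__.py imports the module so the problem gets registered.
cat > "$USR_DIR/__init__.py" <<'EOF'
from . import my_problem
EOF

cat > "$USR_DIR/my_problem.py" <<'EOF'
import io
import os

from tensor2tensor.data_generators import problem
from tensor2tensor.data_generators import text_problems
from tensor2tensor.utils import registry


@registry.register_problem
class MyBinaryTextClass(text_problems.Text2ClassProblem):
  """Binary text classification; registered as my_binary_text_class."""

  @property
  def is_generate_per_split(self):
    # generate_samples is called once per split listed below.
    return True

  @property
  def dataset_splits(self):
    return [
        {"split": problem.DatasetSplit.TRAIN, "shards": 10},
        {"split": problem.DatasetSplit.EVAL, "shards": 1},
    ]

  @property
  def approx_vocab_size(self):
    return 2**13  # subword vocabulary of roughly 8k tokens

  @property
  def num_classes(self):
    return 2

  def class_labels(self, data_dir):
    del data_dir
    return ["neg", "pos"]  # placeholder label names

  def generate_samples(self, data_dir, tmp_dir, dataset_split):
    # Assumed input files: train.tsv and dev.tsv in tmp_dir.
    is_train = dataset_split == problem.DatasetSplit.TRAIN
    path = os.path.join(tmp_dir, "train.tsv" if is_train else "dev.tsv")
    with io.open(path, encoding="utf-8") as f:
      for line in f:
        label, text = line.rstrip("\n").split("\t", 1)
        yield {"inputs": text, "label": int(label)}
EOF

echo "Wrote $USR_DIR/__init__.py and $USR_DIR/my_problem.py"
```

With those files in place, passing `--t2t_usr_dir=$USR_DIR` and `--problem=my_binary_text_class` to both `t2t-datagen` and `t2t-trainer` should pick up the new problem, while `--model=transformer_encoder` stays the same.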

Upvotes: 4
