salam thabit

Reputation: 1

Using a pre-trained transformer with Keras

I want to use this pre-trained model: Hate-speech-CNERG/dehatebert-mono-arabic

I use this code to build the model with Keras (the library I generally use):

import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def build_model(transformer, max_len=512):
    """
    Function for building the model.
    """
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    sequence_output = transformer(input_word_ids)[0]
    cls_token = sequence_output[:, 0, :]  # take the [CLS] token embedding
    out = Dense(1, activation='sigmoid')(cls_token)

    model = Model(inputs=input_word_ids, outputs=out)
    # learning rate changed from 1e-5 to 3e-5
    model.compile(Adam(lr=3e-5), loss='binary_crossentropy',
                  metrics=[tf.keras.metrics.AUC()])
    return model

import transformers

with strategy.scope():
    model_name = "Hate-speech-CNERG/dehatebert-mono-arabic"
    transformer_layer = (
        transformers.AutoModel.from_pretrained(model_name)
    )
    model = build_model(transformer_layer, max_len=MAX_LEN)

The following error happens:

AttributeError                            Traceback (most recent call last)
<ipython-input-19-26bbcd63ea51> in <module>()
      5         # .TFAutoModel.from_pretrained('jplu/tf-xlm-roberta-large')
      6     )
----> 7     model = build_model(transformer_layer, max_len=MAX_LEN)

/usr/local/lib/python3.7/dist-packages/transformers/models/bert/modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict)
    922             raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time")
    923         elif input_ids is not None:
--> 924             input_shape = input_ids.size()
    925             batch_size, seq_length = input_shape
    926         elif inputs_embeds is not None:
AttributeError: 'KerasTensor' object has no attribute 'size'

Upvotes: 0

Views: 1734

Answers (1)

ygorg

Reputation: 770

Models from the Hugging Face Hub can be used out of the box with the transformers library, and they work with different backends (TensorFlow, PyTorch).

Using Hugging Face transformers with Keras is shown here (in the "create_model" function); a sketch along those lines is below.
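For the error in the question specifically, the PyTorch model class has to be swapped for its TensorFlow counterpart before it can consume Keras tensors. A minimal sketch, assuming this checkpoint only ships PyTorch weights (hence from_pt=True; it can be dropped if TensorFlow weights exist on the Hub):

import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model
from transformers import TFAutoModel

model_name = "Hate-speech-CNERG/dehatebert-mono-arabic"
# from_pt=True converts the PyTorch checkpoint to TensorFlow weights on load
transformer = TFAutoModel.from_pretrained(model_name, from_pt=True)

input_word_ids = Input(shape=(512,), dtype=tf.int32, name="input_word_ids")
sequence_output = transformer(input_word_ids)[0]  # last hidden states
cls_token = sequence_output[:, 0, :]              # [CLS] token embedding
out = Dense(1, activation='sigmoid')(cls_token)

model = Model(inputs=input_word_ids, outputs=out)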

Generally speaking, you can load a Hugging Face transformer using the example code on the model card (the "use in transformers" button):

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Hate-speech-CNERG/dehatebert-mono-arabic")
model = AutoModelForSequenceClassification.from_pretrained("Hate-speech-CNERG/dehatebert-mono-arabic")

Then, for inference, the docs show a way to get the output from a loaded transformer model:

import torch

inputs = tokenizer("صباح الخير", return_tensors="pt")
# We're not interested in labels here, just the model's inference
# labels = torch.tensor([1]).unsqueeze(0)  # Batch size 1
outputs = model(**inputs)  # , labels=labels)

# The model returns logits, but we want probabilities, so we apply softmax
probs = torch.softmax(outputs['logits'], dim=1)
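The same inference can be done in TensorFlow. A sketch, again assuming from_pt=True is needed to convert the PyTorch-only checkpoint:

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

model_name = "Hate-speech-CNERG/dehatebert-mono-arabic"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tf_model = TFAutoModelForSequenceClassification.from_pretrained(model_name, from_pt=True)

inputs = tokenizer("صباح الخير", return_tensors="tf")
outputs = tf_model(**inputs)
# tf.nn.softmax turns the logits into probabilities
probs = tf.nn.softmax(outputs.logits, axis=1)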

Upvotes: 0
