Reputation: 26
I am trying to adapt Llama 2 to solve a regression task by utilizing the last hidden state of the model given the entire input sequence.
For example, if the question "What is the answer to 2+2?" is asked, it should answer 4 (a dummy problem, just to explain the issue).
To that end, I use it in a PyTorch model like so:
import torch
import torch.nn as nn
from transformers import LlamaModel, LlamaTokenizer

class TransformerModel(nn.Module):
    def __init__(self, model_name: str, additional_layer_size: int = 1):
        super(TransformerModel, self).__init__()
        self.transformer = LlamaModel.from_pretrained(model_name, torch_dtype=torch.float32, cache_dir="hugginface_cache/models")
        self.tokenizer = LlamaTokenizer.from_pretrained(model_name, cache_dir="hugginface_cache/tokenizer")
        # Add an additional layer with one output
        self.additional_layer = nn.Linear(self.transformer.config.hidden_size, additional_layer_size)

    def forward(self, input_text):
        # Tokenize input text
        input_ids = self.tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
        print("input_ids:", input_ids)
        # Get the outputs from the transformer
        outputs = self.transformer(input_ids)
        # Use the entire last hidden state as input to the additional layer
        last_hidden_state = outputs.last_hidden_state
        print("last_hidden_state shape:", last_hidden_state.size())
        # Apply the additional layer
        additional_output = self.additional_layer(last_hidden_state)
        return additional_output

model_url = "meta-llama/Llama-2-7b-hf"
model = TransformerModel(model_url)
However, for the input "Hello world!" the output is a tensor of size (1, 4, 1).
I can verify that the tokenizer splits the string into 4 tokens, which I expect is what causes this. However, I am not certain how to fix it.
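For example, roughly the following check prints the 4 tokens for the input above (the exact tokens may differ with the tokenizer version):

from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf", cache_dir="hugginface_cache/tokenizer")
ids = tokenizer("Hello world!", return_tensors="pt").input_ids
print(ids.shape)                                # torch.Size([1, 4])
print(tokenizer.convert_ids_to_tokens(ids[0]))  # BOS token + 3 text tokens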
Upvotes: 0
Views: 743
Reputation: 26
Below is the code that appears to be working for me. Note that one still has to handle the padding introduced by batching (see the sketch after the code).
import torch
import torch.nn as nn
from transformers import LlamaModel

class TokenAverageHead(nn.Module):
    def __init__(self, additional_layer_size, hidden_size, id_for_eos_token):
        super(TokenAverageHead, self).__init__()
        self.average_pool_layer = nn.AdaptiveAvgPool1d(1)
        self.additional_layer = nn.Linear(hidden_size, additional_layer_size, dtype=torch.bfloat16)
        # Kept so padding/EOS positions can later be excluded when batching
        self.id_for_eos_token = id_for_eos_token

    def forward(self, last_hidden_state):
        # Average the last hidden state over the tokens:
        # (batch, seq_len, hidden) -> (batch, hidden, seq_len) -> (batch, hidden, 1) -> (batch, hidden)
        last_hidden_state = last_hidden_state.transpose(1, 2)
        average_last_hidden_state = self.average_pool_layer(last_hidden_state)
        average_last_hidden_state = average_last_hidden_state.squeeze(2)
        # Apply the additional layer
        additional_output = self.additional_layer(average_last_hidden_state)
        return additional_output


class AverageTokenTransformer(nn.Module):
    def __init__(self, model_name: str, id_for_eos_token: int, additional_layer_size: int = 1):
        super(AverageTokenTransformer, self).__init__()
        self.id_for_eos_token = id_for_eos_token
        self.transformer = LlamaModel.from_pretrained(model_name, torch_dtype=torch.bfloat16, cache_dir="hugginface_cache/models")
        # Freeze the base model; only the head is trained
        for param in self.transformer.parameters():
            param.requires_grad = False
        # Add an additional head with one output
        self.additional_head = TokenAverageHead(additional_layer_size, self.transformer.config.hidden_size, id_for_eos_token)

    def forward(self, input_ids):
        outputs = self.transformer(input_ids)
        # Apply the additional head to the last hidden state
        output = self.additional_head(outputs.last_hidden_state)
        return output


class LastTokenHead(nn.Module):
    def __init__(self, additional_layer_size: int, hidden_size: int, id_for_eos_token: int):
        super(LastTokenHead, self).__init__()
        self.id_for_eos_token = id_for_eos_token
        self.linear_layer = nn.Linear(hidden_size, additional_layer_size, dtype=torch.bfloat16)

    def forward(self, last_hidden_state):
        # Use only the hidden state of the last token as the sequence representation
        last_token_hidden_state = last_hidden_state[:, -1, :]
        output = self.linear_layer(last_token_hidden_state)
        return output


class LastTokenTransformer(nn.Module):
    def __init__(self, model_name: str, id_for_eos_token: int, additional_layer_size: int = 1):
        super(LastTokenTransformer, self).__init__()
        self.transformer = LlamaModel.from_pretrained(model_name, torch_dtype=torch.bfloat16, cache_dir="hugginface_cache/models")
        # Freeze the base model; only the head is trained
        for param in self.transformer.parameters():
            param.requires_grad = False
        # Add an additional head with one output
        self.additional_head = LastTokenHead(additional_layer_size, self.transformer.config.hidden_size, id_for_eos_token)

    def forward(self, input_ids):
        outputs = self.transformer(input_ids)
        # Apply the additional head to the last hidden state
        additional_output = self.additional_head(outputs.last_hidden_state)
        return additional_output
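As a rough sketch of the padding handling mentioned above (this is an assumption, not part of the classes themselves: the Llama tokenizer has no pad token by default, so the EOS token is reused for padding and masked out before pooling, which replaces the plain average in TokenAverageHead):

from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf", cache_dir="hugginface_cache/tokenizer")
tokenizer.pad_token = tokenizer.eos_token  # assumption: reuse EOS as pad token

batch = tokenizer(["What is the answer to 2+2?", "Hello world!"],
                  padding=True, return_tensors="pt")
input_ids = batch.input_ids            # (batch, seq_len)
attention_mask = batch.attention_mask  # 1 for real tokens, 0 for padding

model = AverageTokenTransformer("meta-llama/Llama-2-7b-hf", tokenizer.eos_token_id)
outputs = model.transformer(input_ids, attention_mask=attention_mask)
hidden = outputs.last_hidden_state     # (batch, seq_len, hidden)

# Padding-aware mean over tokens instead of the plain AdaptiveAvgPool1d
mask = attention_mask.unsqueeze(-1).to(hidden.dtype)           # (batch, seq_len, 1)
pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)          # (batch, hidden)
predictions = model.additional_head.additional_layer(pooled)   # (batch, 1)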
Upvotes: 0