Matei Neagu

Reputation: 101

Fine-tuning GPT-2 for generative question answering

I am trying to fine-tune GPT-2 for a generative question answering task.

Basically I have my data in a format similar to:

Context : Matt wrecked his car today. Question: How was Matt's day? Answer: Bad

I was looking at the Hugging Face documentation to find out how I can fine-tune GPT-2 on a custom dataset, and I did find the fine-tuning instructions at this address: https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling

The issue is that they do not provide any guidance on how your data should be prepared so that the model can learn from it. They list the datasets they have available, but none is in a format that fits my task well.

I would really appreciate if someone with more experience could help me.

Have a nice day!

Upvotes: 7

Views: 8265

Answers (2)

Jay Mody

Reputation: 4073

Your task right now is ambiguous; it could be any of:

  • QnA via Classification (answer is categorical)
  • QnA via Extraction (answer is in the text)
  • QnA via Language Modeling (answer can be anything)

Classification


If all your examples have Answer: X, where X is categorical (i.e. always "Good", "Bad", etc.), you can do classification.

In this setup, you'd have text-label pairs:

Text

Context: Matt wrecked his car today.

Question: How was Matt's day?

Label

Bad

For classification, you're probably better off just fine-tuning a BERT-style model (something like RoBERTa).
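For illustration, here's a minimal sketch of that setup, assuming a small fixed label set and the roberta-base checkpoint (the label names here are just placeholders for whatever your categories are):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ["Bad", "Good"]  # hypothetical label set for this example

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=len(labels))

text = "Context: Matt wrecked his car today. Question: How was Matt's day?"
inputs = tokenizer(text, return_tensors="pt")
target = torch.tensor([labels.index("Bad")])

# One training step; in practice you'd loop over your dataset with an optimizer or the Trainer API
loss = model(**inputs, labels=target).loss
loss.backward()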

Extraction


If all your examples have Answer: X, where X is a word (or consecutive words) in the text, then it's probably best to do SQuAD-style fine-tuning with a BERT-style model. In this setup, your input is (basically) text, start_pos, end_pos triplets:

Text

Context: In early 2012, NFL Commissioner Roger Goodell stated that the league planned to make the 50th Super Bowl "spectacular" and that it would be "an important game for us as a league".

Question: Who was the NFL Commissioner in early 2012?

Start Position, End Position

6, 8

Note: The start/end position values are, of course, positions of tokens, so they will depend on how you tokenize your inputs (see the sketch below).
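As a small sketch of that point, a fast tokenizer's offset mapping can turn a character-level answer span into token positions (bert-base-cased here is just an example checkpoint):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

context = "In early 2012, NFL Commissioner Roger Goodell stated that the league planned to make the 50th Super Bowl \"spectacular\"."
answer = "Roger Goodell"
char_start = context.index(answer)
char_end = char_start + len(answer)

# Each entry of offset_mapping is the (start, end) character span of a token
enc = tokenizer(context, return_offsets_mapping=True)
start_pos = next(i for i, (s, e) in enumerate(enc["offset_mapping"]) if s <= char_start < e)
end_pos = next(i for i, (s, e) in enumerate(enc["offset_mapping"]) if s < char_end <= e)
print(start_pos, end_pos)  # token indices, which depend on the tokenizer used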

In this setup, you're also better off using a BERT-style model. In fact, there are already models on the Hugging Face Hub trained on SQuAD (and similar datasets). They should already be good at this task out of the box (but you can always fine-tune on top of them).
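For example, one such checkpoint (distilbert-base-cased-distilled-squad, one of several SQuAD-trained options on the Hub) can be used directly through the question-answering pipeline:

from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

result = qa(
    question="Who was the NFL Commissioner in early 2012?",
    context="In early 2012, NFL Commissioner Roger Goodell stated that the league planned to make the 50th Super Bowl \"spectacular\".",
)
print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': 'Roger Goodell'}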

Language Modeling


If all your examples have Answer: X, where X can basically be anything (it need not be contained in the text, and is not categorical), then you'd need to do language modeling.

In this setup, you have to use a GPT-style model, and your input would just be the whole text as is:

Context: Matt wrecked his car today.

Question: How was Matt's day?

Answer: Bad

There is no need for labels, since the text itself is the label (we're asking the model to predict the next word, for each word).

Larger models like GPT-3 should be good at these tasks without any fine-tuning (if you give them the right prompt + examples), but of course, these are accessed behind APIs. https://cohere.com also has very strong LLMs and lets you fine-tune models (via language modeling), so you don't need to run any code yourself. I'm not sure how much mileage you'll get out of fine-tuning a smaller model like GPT-2. If this project is for learning, then yeah, definitely go ahead and fine-tune a GPT-2 model! But if performance is key, I highly recommend using a solution like https://cohere.com, which will just work out of the box.
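If you do go the GPT-2 route, here's a rough sketch of what one training step and generation could look like (just a sketch: in practice you'd wrap this in a proper training loop over your dataset, or use the run_clm.py script linked in the question):

from transformers import GPT2TokenizerFast, GPT2LMHeadModel

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Each training example is just the whole text; the text itself is the label
example = "Context: Matt wrecked his car today. Question: How was Matt's day? Answer: Bad" + tokenizer.eos_token
inputs = tokenizer(example, return_tensors="pt")
loss = model(**inputs, labels=inputs["input_ids"]).loss
loss.backward()

# At inference time, stop the prompt at "Answer:" and let the model generate the rest
prompt = tokenizer("Context: Matt wrecked his car today. Question: How was Matt's day? Answer:", return_tensors="pt")
out = model.generate(**prompt, max_new_tokens=10, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0]))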

Upvotes: 12

hossein madadi

Reputation: 1

Fine-tune GPT-2

I am doing something similar to your task, and I think it's better to use the SQuAD format to fine-tune.

Upvotes: -1
