mobeen

Reputation: 188

How to fine-tune GPT-3.5 on conversational data?

My app provides emotional support to our end users. We have listeners who talk with the end users via chat. I have the chat threads of one particular listener, let's say A, who has talked with 1000 users. I want to fine-tune my own model using GPT-3 so that it responds to users the way listener A does.

I have prepared this chat data thread-wise, meaning there are 1000 examples/threads if the listener talked with 1000 users. The problem is that fine-tuning allows only 4096 tokens per example, so how can I fit a very long thread into that 4096-token limit?

And what is the best way to train on this conversational data if my approach is wrong? I am adding a sample of the data I am going to use.


{
  "messages": [
    {
      "role": "system",
      "content": "Marv is a sympathetic human being who provides emoational support"
    },
    {
      "role": "user",
      "content": "Hello"
    },
    {
      "role": "assistant",
      "content": "Hey man, how are ya?   "
    },
    {
      "role": "user",
      "content": "I’m not really good right now "
    },
    {
      "role": "assistant",
      "content": "I’m sorry to hear that.  What’s going on? "
    }
  ]
}
{
  "messages": [
    {
      "role": "system",
      "content": "Marv is a sympathetic human being who provides emoational support"
    },
    {
      "role": "user",
      "content": "I am feeling sad today"
    },
    {
      "role": "assistant",
      "content": "I'm sorry to hear that you're feeling sad. What happened ?"
    },
    {
      "role": "user",
      "content": "My mother passed away today"
    },
    {
      "role": "assistant",
      "content": "Oh... really shoking to hear this. How you holding up now ?"
    }
  ]
}

Upvotes: 3

Views: 2057

Answers (1)

Zee

Reputation: 148

Are you trying to fine-tune GPT-3 or GPT-3.5 Turbo? GPT-3.5 Turbo is fine-tuned with the chat format your data is already in, whereas to fine-tune GPT-3 you would use the following format:

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

For your use case, it looks like you should be fine-tuning GPT-3.5 instead of GPT-3.
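
If it helps, preparing the training file is just a matter of writing one {"messages": [...]} object per line of a JSONL file. A minimal sketch in Python, assuming `threads` is a list of message lists shaped like the samples in the question (the function name and file name are my own, not part of any API):

# Minimal sketch: write chat examples to the JSONL layout that
# GPT-3.5 Turbo fine-tuning expects, one JSON object per line.
import json

def write_jsonl(threads, path="train.jsonl"):
    with open(path, "w", encoding="utf-8") as f:
        for messages in threads:
            f.write(json.dumps({"messages": messages}, ensure_ascii=False) + "\n")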

As of today, you will not be able to use examples longer than 4096 tokens as input. Longer examples are simply truncated to 4096 tokens during fine-tuning. This limit counts every token in the example, input and output alike (chat + completion).

OpenAI may release support for fine-tuning with up to 16k tokens later in the year. For now, I would recommend using examples that fit within the token limit during fine-tuning and not letting any examples get truncated. This also means that at inference time the model expects your input + output to stay within 4096 tokens.
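
To stay inside the limit, and to deal with the long threads mentioned in the question, one option is to count tokens per example and split a long thread into several shorter examples on turn boundaries. A rough sketch, assuming the tiktoken package; the per-message overhead of ~4 tokens is an approximation based on OpenAI's published guidance for chat models, and split_thread is a hypothetical helper:

# Rough sketch: count tokens per chat example and split threads that
# exceed the 4096-token fine-tuning limit into multiple examples.
import tiktoken

MAX_TOKENS = 4096
enc = tiktoken.get_encoding("cl100k_base")  # encoding used by gpt-3.5-turbo

def num_tokens(messages):
    # Approximate count: ~4 tokens of overhead per message (role, separators)
    # plus ~3 tokens priming the reply, following OpenAI's guidance.
    n = 3
    for m in messages:
        n += 4 + len(enc.encode(m["content"]))
    return n

def split_thread(system_msg, turns, max_tokens=MAX_TOKENS):
    # `turns` is the alternating user/assistant list (system message excluded).
    # Cuts only on (user, assistant) pair boundaries so no assistant reply
    # is separated from the user message it answers.
    examples, chunk = [], []
    for i in range(0, len(turns), 2):
        pair = turns[i:i + 2]
        if chunk and num_tokens([system_msg] + chunk + pair) > max_tokens:
            examples.append({"messages": [system_msg] + chunk})
            chunk = []
        chunk.extend(pair)
    if chunk:
        examples.append({"messages": [system_msg] + chunk})
    return examples

Note that each chunk loses the earlier context of the conversation; if that matters for your use case, you can carry the last few turns of the previous chunk over as overlap.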

You can refer to Fine-tune GPT3.5 for more information.
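
For completeness, starting the job itself looks roughly like this with the openai Python package (the pre-1.0 API that was current when GPT-3.5 Turbo fine-tuning launched); the training file name is carried over from the sketch above:

# Sketch: upload the training file, then start a fine-tuning job.
import openai

training_file = openai.File.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)
job = openai.FineTuningJob.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id)  # poll openai.FineTuningJob.retrieve(job.id) for status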

Upvotes: 0
