Reputation: 1
I am trying to create a chatbot to help with introspection and journaling for a school project. Essentially, I want it to summarize a response and ask questions back that draw on information from that response, and also to prompt questions that help tie an emotion to the experience. For example, if someone is talking about their day/problems/feelings and states "I am feeling super nervous and my stomach always hurts and I'm always worried", the chatbot would say something like "Hm, symptoms a, b, and c often show up alongside those in anxiety. This is what anxiety is; would you say that accurately describes how you feel?". Stuff like that, but emotion detection would be limited to around 4 emotions.
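To make what I'm after concrete, here's a toy sketch of just the emotion-flagging step (the cue lists and the four-emotion set are placeholders I made up; ideally the model itself would handle the detection):

```python
# Toy sketch: flag one of four emotions from keyword cues in a journal entry.
# The cue lists below are illustrative only; a real version would rely on
# the model rather than hard-coded keywords.
EMOTION_CUES = {
    "anxiety": ["nervous", "worried", "stomach hurts"],
    "sadness": ["down", "hopeless", "crying"],
    "anger": ["furious", "irritated", "snapped"],
    "joy": ["excited", "grateful", "proud"],
}

def detect_emotion(entry: str) -> str | None:
    """Return the first emotion whose cue words appear in the entry."""
    lowered = entry.lower()
    for emotion, cues in EMOTION_CUES.items():
        if any(cue in lowered for cue in cues):
            return emotion
    return None

entry = "I am feeling super nervous and my stomach always hurts and I'm always worried"
print(detect_emotion(entry))  # -> "anxiety"
```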
Anyway, I'm trying to figure out a starting point: should I use a general LLM, or take a fine-tuned one from Hugging Face and apply my own fine-tuning on top of it? I have used some models from Hugging Face, but they give nonsensical responses to my prompts. Is this typical for a model with 123M parameters? I also tried one with ~6.7B parameters, which produced coherent sentences, but they didn't quite make sense as answers to my statements. Does anyone know if this is typical, or have recommendations on the route I should take next?
I have run the LLMs directly on my computer through VS Code, but only the smaller models (up to ~150 million parameters); the 6.7B-parameter model I ran through an API from Hugging Face.
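For reference, my local runs look roughly like this (gpt2 here is just a stand-in for the small ~125M-parameter models I tried):

```python
from transformers import pipeline

# Small base model run locally; models this size are usually not
# instruction-tuned, which may be why the replies look nonsensical.
generator = pipeline("text-generation", model="gpt2")

result = generator(
    "I am feeling super nervous and my stomach always hurts.",
    max_new_tokens=60,
)
print(result[0]["generated_text"])
```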
Upvotes: 0
Views: 16
Reputation: 1
I'm going to recommend starting with a larger pre-trained model. Since you've already experimented with a model in the 6.7B range, stick with something at or above that parameter count. It should handle complex prompts more robustly, even though it isn't tuned to your exact needs.
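As a minimal sketch of that route, assuming the transformers library and an open instruction-tuned ~7B chat model (zephyr-7b-beta is just one example, and the system prompt is something you'd iterate on yourself):

```python
from transformers import pipeline

# Instruction-tuned chat model; needs a GPU with enough memory, or swap in
# the hosted Hugging Face Inference API you already used for the 6.7B model.
chat = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-beta",
    device_map="auto",  # requires the accelerate package
)

messages = [
    {
        "role": "system",
        "content": (
            "You are a gentle journaling assistant. Summarize what the user "
            "shares, ask one reflective follow-up question, and when the "
            "entry suggests one of four emotions (anxiety, sadness, anger, "
            "joy), briefly describe that emotion and ask whether it fits."
        ),
    },
    {
        "role": "user",
        "content": (
            "I am feeling super nervous and my stomach always hurts "
            "and I'm always worried."
        ),
    },
]

out = chat(messages, max_new_tokens=200)
# The pipeline returns the full conversation; the last message is the reply.
print(out[0]["generated_text"][-1]["content"])
```

With a model like this, a well-written system prompt can often get you the summarize-then-ask behavior without any fine-tuning of your own.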
Upvotes: 0