Reputation: 555
I downloaded the shards of the gemma2
model from Hugging Face and then converted them into GGUF
format via the conversion script from the llama.cpp
repository.
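For reference, the conversion step looked roughly like this (the snapshot path is shortened here, and I'm going from memory on the exact invocation; convert_hf_to_gguf.py and its --outtype/--outfile options come from the llama.cpp repository):

python convert_hf_to_gguf.py /path/to/gemma-2-2b-it-snapshot --outtype bf16 --outfile gemma-2-2b-it-BF16.gguf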
Then I tried to run my local gemma2
via the llama-cpp-python bindings
in the following way:
from llama_cpp import Llama

llm = Llama(
    model_path="/home/s1ngle/.cache/huggingface/hub/models--google--gemma-2-2b-it/snapshots/299a8560bedf22ed1c72a8a11e7dce4a7f9f51f8/299a8560bedf22ed1c72a8a11e7dce4a7f9f51f8-2.6B-299a8560bedf22ed1c72a8a11e7dce4a7f9f51f8-BF16.gguf",
    n_gpu_layers=0,
    n_threads=8,
    n_batch=8,
    n_ctx=8192,
    seed=-1,
    f16_kv=True,
    verbose=False,
    cache=False,
    last_n_tokens_size=64,
)

output = llm(
    "Hi! I'm Bob",
    max_tokens=128,
    echo=False,
    temperature=0,
    top_k=10,
    top_p=0.95,
)

print(output)
and I got the following result in the console:
{'id': 'cmpl-21b4cd4a-58e5-4e76-9782-809e3ef0a731', 'object': 'text_completion', 'created': 1726382333, 'model': '/home/s1ngle/.cache/huggingface/hub/models--google--gemma-2-2b-it/snapshots/299a8560bedf22ed1c72a8a11e7dce4a7f9f51f8/299a8560bedf22ed1c72a8a11e7dce4a7f9f51f8-2.6B-299a8560bedf22ed1c72a8a11e7dce4a7f9f51f8-BF16.gguf', 'choices': [{'text': ', a friendly AI assistant. 👋 \n\nHow can I help you today? 😊 \n', 'index': 0, 'logprobs': None, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 7, 'completion_tokens': 19, 'total_tokens': 26}}
As you can see, the phrase ", a friendly AI assistant."
has been appended to my prompt, and for some reason this appended phrase is printed out in the LLM's output.
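To confirm the phrase really is part of the generated completion rather than some console formatting, I can print just the text field of the first choice in the returned dict:

print(output["choices"][0]["text"])
# , a friendly AI assistant. 👋 \n\nHow can I help you today? 😊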
Why? How can I prevent it?
Upvotes: 0
Views: 64