Arash Vahabpour

Reputation: 1

Recency-aware finetuning for question answering

I am looking to fine-tune LLaMA 3 for a question-answering task using specific data that I have collected. Each question-answer pair in my dataset includes a timestamp (in the format MM/DD/YY) ranging from as far back as 2005 to as recent as 2024. While I intend to utilize all the data available, I would like the model to place greater emphasis on the most recent information.

Could you recommend a good strategy to achieve this?

I don't have many ideas beyond augmenting the questions with the information from the timestamp.
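For example, something like this is what I have in mind for the augmentation (just a sketch; the field names are placeholders for my dataset's schema):

from datetime import datetime

# sketch of the augmentation idea: prepend each QA pair's date to the question
# before building the fine-tuning prompts
def augment_with_timestamp(example):
    date = datetime.strptime(example["timestamp"], "%m/%d/%y")  # MM/DD/YY
    prefix = f"[As of {date.strftime('%B %Y')}] "
    return {
        "question": prefix + example["question"],
        "answer": example["answer"],
    }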

Upvotes: 0

Views: 105

Answers (1)

D.lola

Reputation: 2274

I would suggest two strategies (as you call them):

  1. What you are asking for is more like providing context to your LLaMA3, and not just any context but the latest information (which I don't know where you are getting from), retrieved before LLaMA3 makes a decision or answers a question. In this case crew-ai (here_is_crew-ai_docs) seems to be a good choice, but you should be comfortable with Agents before taking it up. Having said that, you can wire up your LLaMA3 so that it calls a custom agent first before it makes a decision: you designate a task to a retrieve-latest-info agent and it supplies the latest information.

Here is a code snippet to guide you.

from crewai import Agent, Task, Crew, Process
from langchain_community.llms import Ollama

# instantiate your model; you can use different models
your_llm = Ollama(model="llama3", temperature=0.6, verbose=True)

# tool/function that retrieves the latest information
# (placeholder -- plug in your own implementation here)
search_tool = FunctionThatRetrievestheLatestInfo()


# Agent roles are defined and instantiated here
class AgentRoles:
    def __init__(self):
        self.researcher = Agent(
            role="AgentMars",
            goal="Information supplier",
            backstory="You are an agent that provides new information...",
            verbose=True,
            allow_delegation=False,
            tools=[search_tool],
            llm=your_llm,
        )

        self.writer = Agent(
            role="AgentVenus",
            goal="Write the answer",
            backstory="You are an agent that writes answers",
            verbose=True,
            allow_delegation=False,
            llm=your_llm,
        )


class Tasks:
    def __init__(self, roles: AgentRoles):
        self.task1 = Task(
            description="retrieve the latest information",
            agent=roles.researcher,
        )

        self.task2 = Task(
            description="write the answer",
            agent=roles.writer,
        )


class CrewAI:
    def __init__(self):
        roles = AgentRoles()
        tasks = Tasks(roles)
        self.crew = Crew(
            name="AI Research Crew",
            agents=[roles.researcher, roles.writer],
            tasks=[tasks.task1, tasks.task2],
            verbose=2,
            process=Process.sequential,
        )

    def start_simulation(self):
        return self.crew.kickoff()
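Then, assuming the classes above, you would start the whole thing with something like:

# run the crew: the researcher retrieves the latest info, the writer answers
result = CrewAI().start_simulation()
print(result)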
  2. If I were to do something like this and were new to it, I would check whether I am able to intercept the execution of LLaMA3. If you are using any langchain library you are in luck; all you need to do is:
    -2a. Create a custom agent tool (see langchain-agent-docs). This will then bind to your LLaMA3.

    -2b. You can also supply it (LLaMA3) with different tools, and LLaMA3 can use those tools if you provide a prompt_template that says it needs these tools (info-retriever) to perform the task (or answer a question), that is, if you are able to intercept the chain of the LLaMA3 (you can look for background info on AgentAction, etc.).

For the second approach, I have provided a code snippet to guide you.

from typing import List

from langchain.agents.tools import Tool


def get_function_tools() -> List[Tool]:
    tools = [
        Tool(
            name="SearchTool",  # search the internet (or anywhere else) if the model can't get the info
            func=SearchTool.run,  # placeholder -- your own search implementation
            description="useful for when you need to answer questions on recent events",
        ),
        Tool(
            name="Retrieve-Tool-Agent",
            func=RetrieveTool().__call__,  # placeholder -- your own retrieval implementation
            description="useful for when you need to know new things...",
            handle_tool_error=True,
        ),
    ]
    return tools
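To bind these tools to your LLaMA3, a rough sketch using the (older) langchain agent API could look like this; ZERO_SHOT_REACT_DESCRIPTION is just one choice of agent type, and the question string is a placeholder:

from langchain.agents import initialize_agent, AgentType
from langchain_community.llms import Ollama

llm = Ollama(model="llama3", temperature=0.6)

# bind the tools from get_function_tools() to the model; the agent decides
# when to call the info-retrieving tools before it writes an answer
agent = initialize_agent(
    tools=get_function_tools(),
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
)

print(agent.run("your question here"))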

You would have to see which strategy works best...

Upvotes: 0
