Working with LLMs#

Large Language Models (LLMs) are the intelligence behind agents. Malevich Brain provides a uniform interface for interacting with different LLM providers.

Available LLM Implementations#

Malevich Brain currently includes the following LLM implementations:

  • OpenAI - Provides access to OpenAI’s models (GPT-4, GPT-3.5, etc.)

Using the OpenAI LLM#

Here’s how to use the OpenAI implementation:

from brain.agents.llm.openai import OpenAIBaseLLM

# Create an OpenAI LLM instance
llm = OpenAIBaseLLM(
    api_key="your-api-key",  # Your OpenAI API key
    base_url=None,  # Optional custom base URL
    default_model="gpt-4o-mini"  # Default model to use
)
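
In practice, you will usually want to read the key from the environment rather than hard-coding it. The sketch below does this with the standard-library os module; the OPENAI_API_KEY variable name is a common convention, not something the library reads automatically:

import os

from brain.agents.llm.openai import OpenAIBaseLLM

# Read the key from an environment variable instead of embedding it in source
llm = OpenAIBaseLLM(
    api_key=os.environ["OPENAI_API_KEY"],
    default_model="gpt-4o-mini"
)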

Customizing LLM Parameters#

When calling the LLM's generate_messages method, you can customize various generation parameters:

response = await llm.generate_messages(
    # Message history
    messages=messages,

    # Generation parameters
    max_tokens=None,  # Maximum tokens to generate (None uses the provider default)
    temperature=0.7,  # Controls randomness (0-1)
    top_p=0.9,  # Nucleus sampling parameter

    # Other options
    model="gpt-4o",  # Override the default model
    stream=True,  # Enable streaming responses

    # Tools and files
    tools=[my_tool1, my_tool2],  # Tools available to the LLM
    files=[file1, file2]  # Files provided for context
)

With stream=True, generate_messages returns an async generator instead of a list of messages (see Streaming Responses below).

Streaming Responses#

Malevich Brain supports streaming responses from LLMs, which is useful for showing real-time progress to users:

from brain.agents.agent import Agent
from brain.agents.callback import callback

# Create a callback that handles message streaming
@callback("message_stream.assistant")
async def stream_handler(agent, event, stream):
    async for chunk in stream:
        # Print each chunk as it arrives
        if hasattr(chunk, "chunk"):
            print(chunk.chunk, end="", flush=True)

# Create an agent with the streaming callback
agent = Agent(
    llm=llm,
    tools=[],
    callbacks=[stream_handler]
)

# Run the agent
result = await agent.run("Generate a story about a robot")
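
You can also consume a stream directly from the LLM, without going through an agent. This is a minimal sketch assuming that generate_messages returns an async generator when stream=True (as the BaseLLM skeleton below indicates) and that each chunk exposes a chunk attribute, as in the callback above:

from brain.agents.models import Message

# Request a streamed response directly from the LLM
stream = await llm.generate_messages(
    messages=[Message(role="user", content="Generate a story about a robot")],
    stream=True
)

# Print each chunk as it arrives
async for chunk in stream:
    if hasattr(chunk, "chunk"):
        print(chunk.chunk, end="", flush=True)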

Implementing a Custom LLM#

You can implement custom LLM providers by extending the BaseLLM class:

from brain.agents.llm.base import BaseLLM
from brain.agents.models import Message, TextMessage

class MyCustomLLM(BaseLLM):
    # Specify supported features
    support_message_streaming = True
    support_file_input = False

    def __init__(self, api_key, **kwargs):
        self.api_key = api_key
        # Initialize your custom LLM client

    def extract_tool_call(self, message):
        # Extract a tool call from the message
        # Return (tool_call_id, tool_name, tool_input) or None
        ...

    def make_tool_response(self, tool_call_id, tool_output):
        # Create a tool response in the format expected by your LLM
        ...

    async def generate_messages(
        self,
        messages,
        max_tokens=None,
        temperature=None,
        top_p=None,
        top_k=None,
        stop=None,
        model=None,
        files=None,
        stream=False,
        tools=None,
        **kwargs
    ):
        # Implement the message generation logic
        # If stream=True, return an async generator;
        # otherwise, return a list of messages
        ...
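
Once implemented, a custom LLM plugs into an agent exactly like the built-in one:

agent = Agent(
    llm=MyCustomLLM(api_key="your-api-key"),
    tools=[],
    callbacks=[]
)

result = await agent.run("Generate a story about a robot")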

Working with Files#

Some LLMs support file inputs, which can be useful for tasks like document analysis:

from brain.agents.models import LocalFile, Message

# Create a file reference
file = LocalFile(name="path/to/document.pdf")

# Use the file in message generation
messages = await llm.generate_messages(
    messages=[Message(role="user", content="Analyze this document")],
    files=[file]
)
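
Not every provider accepts files. The support_file_input flag shown in the custom-LLM skeleton above indicates whether an implementation does; assuming the flag is readable on any LLM instance, a defensive sketch might check it before attaching files:

from brain.agents.models import LocalFile, Message

file = LocalFile(name="path/to/document.pdf")
user_message = Message(role="user", content="Analyze this document")

# Only pass files to providers that advertise file support
if llm.support_file_input:
    messages = await llm.generate_messages(messages=[user_message], files=[file])
else:
    messages = await llm.generate_messages(messages=[user_message])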