Working with LLMs#
Large language models (LLMs) are the intelligence behind agents. Malevich Brain provides a uniform interface for interacting with different LLM providers.
Available LLM Implementations#
Malevich Brain currently includes the following LLM implementations:
OpenAI - Provides access to OpenAI’s models (GPT-4, GPT-3.5, etc.)
Using the OpenAI LLM#
Here’s how to use the OpenAI implementation:
from brain.agents.llm.openai import OpenAIBaseLLM

# Create an OpenAI LLM instance
llm = OpenAIBaseLLM(
    api_key="your-api-key",       # Your OpenAI API key
    base_url=None,                # Optional custom base URL
    default_model="gpt-4o-mini",  # Default model to use
)
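In practice the API key is usually read from the environment rather than hard-coded. A minimal sketch, assuming the key lives in an OPENAI_API_KEY environment variable and that generated messages expose a content attribute (both are assumptions, not confirmed by the API above):

import os

from brain.agents.llm.openai import OpenAIBaseLLM
from brain.agents.models import Message

llm = OpenAIBaseLLM(
    api_key=os.environ["OPENAI_API_KEY"],  # assumed environment variable
    default_model="gpt-4o-mini",
)

# Non-streaming call: generate_messages returns a list of messages
response = await llm.generate_messages(
    messages=[Message(role="user", content="Say hello")],
)
for message in response:
    print(getattr(message, "content", message))  # content attribute is an assumption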
Customizing LLM Parameters#
When calling generate_messages, you can customize various generation parameters:

response = await llm.generate_messages(
    # Message history
    messages,
    # Generation parameters
    max_tokens=None,  # Maximum tokens to generate
    temperature=0.7,  # Controls randomness (0-1)
    top_p=0.9,        # Nucleus sampling parameter
    # Other options
    model="gpt-4o",   # Override the default model
    stream=True,      # Enable streaming responses
    # Tools and files
    tools=[my_tool1, my_tool2],  # Tools available to the LLM
    files=[file1, file2],        # Files provided for context
)
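For instance, you might compare a near-deterministic and a more exploratory generation (an illustrative sketch; the prompt and loop are hypothetical):

from brain.agents.models import Message

for temp in (0.0, 1.0):
    response = await llm.generate_messages(
        messages=[Message(role="user", content="Name a color")],
        temperature=temp,  # 0.0 for near-deterministic, 1.0 for more varied output
        max_tokens=16,
    )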
Streaming Responses#
Malevich Brain supports streaming responses from LLMs, which is useful for showing real-time progress to users:
from brain.agents.agent import Agent
from brain.agents.callback import callback

# Create a callback that handles message streaming
@callback("message_stream.assistant")
async def stream_handler(agent, event, stream):
    async for chunk in stream:
        # Print each chunk as it arrives
        if hasattr(chunk, "chunk"):
            print(chunk.chunk, end="", flush=True)

# Create an agent with the streaming callback
agent = Agent(
    llm=llm,
    tools=[],
    callbacks=[stream_handler],
)

# Run the agent
result = await agent.run("Generate a story about a robot")
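You can also consume a stream straight from the LLM, without wiring up an agent. A minimal sketch, assuming that awaiting generate_messages(stream=True) yields the async generator described in the custom-LLM section below:

from brain.agents.models import Message

stream = await llm.generate_messages(
    messages=[Message(role="user", content="Generate a story about a robot")],
    stream=True,
)
async for chunk in stream:
    if hasattr(chunk, "chunk"):  # same chunk shape as in the callback above
        print(chunk.chunk, end="", flush=True)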
Implementing a Custom LLM#
You can implement custom LLM providers by extending the BaseLLM class:
from brain.agents.llm.base import BaseLLM
from brain.agents.models import Message, TextMessage

class MyCustomLLM(BaseLLM):
    # Specify supported features
    support_message_streaming = True
    support_file_input = False

    def __init__(self, api_key, **kwargs):
        self.api_key = api_key
        # Initialize your custom LLM client here
        ...

    def extract_tool_call(self, message):
        # Extract a tool call from the message.
        # Return (tool_call_id, tool_name, tool_input) or None.
        ...

    def make_tool_response(self, tool_call_id, tool_output):
        # Create a tool response in the format expected by your LLM.
        ...

    async def generate_messages(
        self,
        messages,
        max_tokens=None,
        temperature=None,
        top_p=None,
        top_k=None,
        stop=None,
        model=None,
        files=None,
        stream=False,
        tools=None,
        **kwargs,
    ):
        # Implement the message generation logic.
        # If stream=True, return an async generator;
        # otherwise, return a list of messages.
        ...
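Once implemented, a custom LLM plugs into an agent like any other provider. A short sketch (the prompt is a placeholder):

from brain.agents.agent import Agent

llm = MyCustomLLM(api_key="your-api-key")
agent = Agent(llm=llm, tools=[], callbacks=[])

result = await agent.run("Hello!")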
Working with Files#
Some LLMs support file inputs, which can be useful for tasks like document analysis:
from brain.agents.models import LocalFile, Message

# Create a file reference
file = LocalFile(name="path/to/document.pdf")

# Use the file in message generation
messages = await llm.generate_messages(
    messages=[Message(role="user", content="Analyze this document")],
    files=[file],
)
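Since not every provider accepts files (note the support_file_input flag in the custom-LLM section), it can be worth guarding the call. A sketch continuing the example above, assuming the flag is exposed on every LLM instance:

extra = {"files": [file]} if getattr(llm, "support_file_input", False) else {}

messages = await llm.generate_messages(
    messages=[Message(role="user", content="Analyze this document")],
    **extra,
)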