Lightpack AI System
A unified, explicit, and extensible interface for text generation, multimodal AI (vision and document analysis), structured AI tasks, and semantic search in your Lightpack apps. Supports multiple providers, robust schema validation, and a fluent builder for advanced use cases.
- Purpose: Seamlessly add AI/ML-powered text generation, vision, document analysis, embeddings, and semantic search to any Lightpack project.
- Where to Use: Blog/content generation, summarization, Q&A, code generation, structured data extraction, image analysis, document processing, semantic search, RAG applications, content recommendations, and more.
Lightpack AI exposes four core methods:
ai()->ask(); // Simple question-answer
ai()->task(); // Structured data extraction, multimodal AI, tools
ai()->embed(); // Text to vector embeddings
ai()->similar(); // Semantic similarity search
Supported Providers
| Driver | Class | Text Generation | Vision | Documents | Embeddings |
|---|---|---|---|---|---|
| openai | Providers\OpenAI | ✅ GPT-4o | ✅ GPT-4o | ✅ PDFs | ✅ text-embedding-3-small |
| gemini | Providers\Gemini | ✅ Gemini 2.0 | ✅ Gemini 2.0 | ✅ All formats | ✅ text-embedding-004 (FREE) |
| anthropic | Providers\Anthropic | ✅ Claude Sonnet | ✅ Claude Sonnet | ✅ PDFs | ❌ Not supported |
| groq | Providers\Groq | ✅ Llama 3.2 Vision | ✅ Llama 3.2 Vision | ❌ Not supported | ❌ Not supported |
| mistral | Providers\Mistral | ✅ Mistral models | ❌ Not supported | ❌ Not supported | ✅ mistral-embed |
Add your own: Implement ProviderInterface and register it in the config.
Configuration
Run the following command to create the config/ai.php configuration file:
php console create:config --support=ai
Recommended Models
For text-only tasks:
'providers' => [
'openai' => ['model' => 'gpt-4o-mini'], // Fast, cost-effective
'anthropic' => ['model' => 'claude-sonnet-4-5'],
'gemini' => ['model' => 'gemini-2.0-flash'], // FREE tier available
'groq' => ['model' => 'llama-3.1-8b-instant'], // Ultra-fast
]
For vision/multimodal tasks:
'providers' => [
'openai' => ['model' => 'gpt-4o'], // Best for PDFs
'anthropic' => ['model' => 'claude-sonnet-4-5'], // Native PDF support
'gemini' => ['model' => 'gemini-2.0-flash'], // All document formats
'groq' => ['model' => 'llama-3.2-11b-vision-preview'], // Fast vision
]
Note: Vision models work for both text and multimodal tasks. Use text-only models when you don't need vision capabilities to save costs.
Usage
| Method | Use When | Returns |
|---|---|---|
| ask() | Simple questions, plain text answers | String |
| task() | Structured data extraction, tool calling, streaming | Array with success, data, errors, raw |
| embed() | Convert text to vector embeddings | Array of floats (single) or array of arrays (batch) |
| similar() | Find semantically similar items | Array of matches with similarity scores |
Quick Decision Guide:
- Need a quick answer? → ask()
- Need to analyze an image? → task() with image()
- Need to analyze a PDF/document? → task() with document()
- Need JSON with specific fields? → task() with expect()
- Need real-time streaming output? → task() with stream()
- Need to call ONE function/API? → task() with tool()
- Need to chain MULTIPLE tools? → task() with tool() + loop()
- Need AI to solve complex problems? → task() with loop() + goal()
- Need semantic search? → embed() + similar()
ask()
Use for: Quick questions that need plain text answers.
For simple, one-off questions, use the ask() method:
$answer = ai()->ask('What is the capital of France?');
echo $answer; // "Paris"
- Returns the raw answer as a plain string.
task()
Use for: Extracting structured data with type validation and required field checks.
$result = ai()->task()
->prompt('Who created the Mona Lisa, and at what age?')
->expect(['name' => 'string', 'age' => 'int'])
->required('name', 'age')
->run();
if ($result['success']) {
echo $result['data']['name']; // "Leonardo da Vinci"
echo $result['data']['age']; // 51
} else {
print_r($result['errors']); // ["Missing required field: name"]
}
Key methods:
- prompt(string) - Set the question
- text(string) - Add text content (alias for prompt)
- image(base64, mimeType) - Add image for vision analysis
- imageUrl(url) - Add image from URL
- document(base64, mimeType) - Add document (PDF, etc.) for analysis
- expect(array) - Define JSON schema with types
- required(...fields) - Mark fields as required
- expectArray(key) - Expect array of objects
- example(array) - Provide custom example for schema (optional, auto-generated by default)
- message(role, content) - Add message to conversation history
- system(string) - Set system prompt
- model(string) - Override model
- temperature(float) - Set randomness (0.0-2.0)
- maxTokens(int) - Limit response length
- cache(bool) - Enable caching
- cacheTtl(int) - Cache duration in seconds
- tool(name, fn, description, params) - Register a tool
- loop(int) - Enable multi-turn agent mode (default: 10 turns)
- goal(string) - Set explicit goal for agent to achieve
- stream(callback) - Stream response in real-time
- run() - Execute and return ['success', 'data', 'errors', 'raw']
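To make the success/errors contract of expect() and required() more concrete, here is a minimal standalone sketch of required-field and type checks. It is not Lightpack's internal validator: the function name validateFields() and the "Invalid type" message are assumptions made for illustration, and only the "Missing required field" message is taken from the example above.

```php
<?php
// Hypothetical helper: illustrates required-field and type checks.
// This is NOT Lightpack's internal validator.
function validateFields(array $data, array $schema, array $required): array
{
    $errors = [];

    // Every required field must be present and non-null.
    foreach ($required as $field) {
        if (!array_key_exists($field, $data) || $data[$field] === null) {
            $errors[] = "Missing required field: {$field}";
        }
    }

    // Map schema type names to PHP type checks.
    $checks = [
        'string' => 'is_string',
        'int'    => 'is_int',
        'number' => fn($v) => is_int($v) || is_float($v),
        'bool'   => 'is_bool',
        'array'  => 'is_array',
    ];

    foreach ($schema as $field => $type) {
        if (!isset($data[$field]) || !isset($checks[$type])) {
            continue; // Absent fields are handled by the required check.
        }
        if (!$checks[$type]($data[$field])) {
            $errors[] = "Invalid type for field: {$field} (expected {$type})";
        }
    }

    return $errors;
}

$errors = validateFields(
    ['age' => 51],
    ['name' => 'string', 'age' => 'int'],
    ['name', 'age']
);
print_r($errors);
```

An empty error list would correspond to success being true in the result array.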
Example Recipes
1. Validate Array of Objects
$result = ai()->task()
->prompt('List 2 movies with title, rating, and summary.')
->expect(['title' => 'string', 'rating' => 'int', 'summary' => 'string'])
->required('title', 'rating', 'summary')
->expectArray('movie')
->run();
if (!$result['success']) {
// $result['errors'] contains missing fields per item
}
2. Use Conversation History
$result = ai()->task()
->message('system', 'You are a helpful assistant.')
->message('user', 'How do I reset my password?')
->run();
3. Control Temperature and Tokens
$result = ai()->task()
->prompt('Generate 3 product names')
->temperature(0.9) // More creative (0.0 = deterministic, 2.0 = very random)
->maxTokens(100) // Limit response length
->run();
4. Custom Schema Examples
$result = ai()->task()
->prompt('Extract user info from: John Smith, 35 years old')
->expect(['name' => 'string', 'age' => 'int', 'active' => 'bool'])
->example(['name' => 'John Doe', 'age' => 25, 'active' => true])
->run();
Note: Custom examples are optional. The framework auto-generates examples from your schema that work well for most cases.
tool()
Use for: Giving AI access to external functions, APIs, or data sources.
Tools allow AI to call PHP functions to fetch data, perform calculations, or interact with your application. The AI decides which tool to call based on the user's question.
Basic Example:
$result = ai()->task()
->tool('search_products', function($params) {
return db()->table('products')
->where('name', 'LIKE', "%{$params['query']}%")
->limit(5)
->all();
}, 'Search products by name', ['query' => 'string'])
->prompt('Find laptops')
->run();
if ($result['success']) {
echo $result['raw']; // AI's natural language answer
print_r($result['tools_used']); // ['search_products']
print_r($result['tool_results']); // ['search_products' => [...]]
}
Tool Definition:
->tool(
string $name, // Tool identifier
mixed $fn, // Closure, invokable object, or class string
?string $description, // What the tool does (helps AI decide)
array $params // Parameter schema: ['param' => 'type']
)
Supported parameter types:
- 'string' - Text
- 'int' - Integer
- 'number' - Float/decimal
- 'bool' - Boolean
- 'array' - Array
Parameter schema formats:
// Simple: just type
['query' => 'string', 'limit' => 'int']
// With description (helps AI understand)
['query' => ['string', 'Search term'], 'limit' => ['int', 'Max results']]
// List format (defaults to string)
['query', 'category'] // Both become 'string' type
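The three schema formats carry the same information, so they can be reduced to one shape. The sketch below shows one way to do that in plain PHP. normalizeParams() and the output layout are assumptions for illustration; the framework's own normalization may look different.

```php
<?php
// Hypothetical normalizer: converts the three schema formats shown above
// into one shape: ['name' => ['type' => ..., 'description' => ...]].
function normalizeParams(array $params): array
{
    $normalized = [];

    foreach ($params as $key => $value) {
        if (is_int($key)) {
            // List format: ['query', 'category'] defaults to string.
            $normalized[$value] = ['type' => 'string', 'description' => ''];
        } elseif (is_array($value)) {
            // With description: 'query' => ['string', 'Search term'].
            $normalized[$key] = ['type' => $value[0], 'description' => $value[1] ?? ''];
        } else {
            // Simple: 'query' => 'string'.
            $normalized[$key] = ['type' => $value, 'description' => ''];
        }
    }

    return $normalized;
}

$schema = normalizeParams([
    'category',
    'limit' => 'int',
    'query' => ['string', 'Search term'],
]);
print_r($schema);
```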
Tool Examples
1. Multiple Tools
$result = ai()->task()
->tool('get_weather', function($params) {
return http()->get("https://api.weather.com/{$params['city']}");
}, 'Get current weather', ['city' => 'string'])
->tool('get_forecast', function($params) {
return http()->get("https://api.weather.com/forecast/{$params['city']}");
}, 'Get 7-day forecast', ['city' => 'string'])
->prompt('What is the weather in Paris?')
->run();
// AI should choose 'get_weather' (not 'get_forecast')
2. Invokable Tool Classes
For complex tools, use invokable classes instead of closures. This keeps your code organized and allows the tool to define its own description and parameters.
Scaffold a tool class:
php console create:tool SearchProducts
This creates app/Tools/SearchProducts.php with the ToolInterface already implemented.
stream()
Use for: Real-time streaming of AI responses as they're generated.
Streaming allows you to display AI output progressively as it's being generated, rather than waiting for the complete response. This creates a better user experience for long-form content like essays, blog posts, or chat interfaces.
The stream() method has two modes:
- HTTP/SSE Mode (default) - Returns Response for Server-Sent Events
- Callback Mode - Accepts callback for CLI scripts and testing
HTTP/SSE Mode (Primary Use Case)
Use in controllers to stream AI responses to the browser using Server-Sent Events (SSE).
Controller Example:
public function stream()
{
return ai()->task()
->prompt('Write a blog post about PHP frameworks')
->stream();
}
- Returns a Response object configured for SSE streaming
- Sets SSE headers automatically (Content-Type: text/event-stream, etc.)
- Streams events in format: data: {"event":"chunk","text":"..."}\n\n
- Sends done event when complete
Frontend Integration:
<div id="output"></div>
<script>
const eventSource = new EventSource('/stream');
const output = document.getElementById('output');
eventSource.addEventListener('message', (e) => {
const data = JSON.parse(e.data);
if (data.event === 'chunk') {
output.textContent += data.text;
} else if (data.event === 'done') {
eventSource.close();
console.log('Stream complete');
}
});
eventSource.addEventListener('error', (e) => {
console.error('Stream error:', e);
eventSource.close();
});
</script>
SSE Event Format:
// Chunk event
data: {"event":"chunk","text":"Hello "}
// Chunk event
data: {"event":"chunk","text":"world!"}
// Done event
data: {"event":"done"}
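For readers who want to reproduce this wire format, for example in a test double, the following sketch builds the same lines in plain PHP. sseEvent() is a hypothetical helper, not part of Lightpack. It only mirrors the documented format of one data: line per event followed by a blank line.

```php
<?php
// Hypothetical helper mirroring the documented event format.
function sseEvent(string $event, ?string $text = null): string
{
    $payload = ['event' => $event];
    if ($text !== null) {
        $payload['text'] = $text;
    }

    // Each SSE message is a "data:" line terminated by a blank line.
    return 'data: ' . json_encode($payload) . "\n\n";
}

echo sseEvent('chunk', 'Hello ');
echo sseEvent('chunk', 'world!');
echo sseEvent('done');
```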
Callback Mode (CLI/Testing)
Use in CLI scripts, background jobs, or tests where you need direct access to chunks.
CLI Example:
// console command or script
ai()->task()
->prompt('Write a blog post about PHP frameworks')
->stream(function($chunk) {
echo $chunk;
flush();
});
When to Use Which Mode
| Scenario | Mode | Example |
|---|---|---|
| Web controller | HTTP/SSE | return ai()->task()->stream(); |
| Chat interface | HTTP/SSE | Real-time browser updates |
| CLI command | Callback | stream(fn($chunk) => print($chunk)) |
| Background job | Callback | Process chunks without HTTP |
| Unit tests | Callback | Capture chunks for assertions |
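The "capture chunks for assertions" row can be pictured with a small standalone sketch. fakeStream() is a hypothetical stand-in for a streaming call, so the snippet runs without any provider. In a real test you would pass the same kind of callback to stream().

```php
<?php
// Hypothetical stand-in for a streaming call: invokes the callback once per chunk.
function fakeStream(array $chunks, callable $onChunk): void
{
    foreach ($chunks as $chunk) {
        $onChunk($chunk);
    }
}

$buffer = '';
$count = 0;

fakeStream(['Hello ', 'world!'], function (string $chunk) use (&$buffer, &$count) {
    $buffer .= $chunk; // Accumulate text for later assertions.
    $count++;          // Track how many chunks arrived.
});

echo $buffer, "\n";
```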
Key Points
- Incompatible with: expect(), expectArray(), tool(), loop()
- Compatible with: prompt(), system(), temperature(), maxTokens(), model()
- All providers support streaming (OpenAI, Anthropic, Groq, Mistral, Gemini)
- HTTP mode uses Server-Sent Events (SSE) standard
- Callback mode is synchronous - blocks until complete
Validation Errors
Streaming throws exceptions if used with incompatible features:
// ❌ ERROR: Cannot use streaming with schema extraction
ai()->task()
->prompt('Extract data')
->expect(['name' => 'string'])
->stream();
// Throws: "Streaming is not supported with schema extraction"
// ❌ ERROR: Cannot use streaming with tools
ai()->task()
->tool('search', $fn)
->prompt('Search products')
->stream();
// Throws: "Streaming is not supported with tools"
// ❌ ERROR: Cannot use streaming with agent mode
ai()->task()
->loop(5)
->prompt('Research topic')
->stream();
// Throws: "Streaming is not supported in agent mode"
When to Use Streaming
| Use Case | Streaming | Regular (run()) |
|---|---|---|
| Long blog posts | ✅ Better UX | ❌ User waits |
| Chat interfaces | ✅ Real-time feel | ❌ Delayed response |
| Code generation | ✅ Progressive display | ❌ All at once |
| Data extraction | ❌ Need structured output | ✅ Use expect() |
| Tool calling | ❌ Need function results | ✅ Use tool() |
| Multi-step tasks | ❌ Need agent mode | ✅ Use loop() |
embed()
Use for: Converting text into vector embeddings for semantic search, similarity matching, or RAG applications.
// Single text
$embedding = ai()->embed('wireless headphones');
// Returns: [0.123, -0.456, 0.789, ...] (768-1536 floats)
// Batch processing (efficient - single API call)
$texts = [
'Product A description',
'Product B description',
'Product C description'
];
$embeddings = ai()->embed($texts);
// Use batch results - CRITICAL: maintains same order as input!
foreach ($texts as $i => $text) {
echo "Text: {$text}\n";
echo "Embedding: " . json_encode($embeddings[$i]) . "\n";
}
Method signature:
ai()->embed(
string|array $input, // Required: Single text or array of texts
array $options = [] // Optional: Provider-specific options
): array
Options parameter:
$options = [
'model' => 'text-embedding-3-small', // Override default model
// Provider-specific options vary
];
Returns:
- Single input: array of floats (e.g., 768 or 1536 dimensions)
- Batch input: array of arrays (one embedding per input text, same order as input)
Key points:
- Returns array of floats (vector representation of text)
- Batch processing uses single API call (cost-efficient)
- Batch results maintain same order as input array
- Store embeddings in database for reuse
- Dimensions vary by provider (768 for Gemini, 1536 for OpenAI)
- Not all providers support embeddings (see provider table above)
- Options parameter allows model override and provider-specific settings
- Embeddings are NOT cross-compatible. Always use the same provider for embedding and searching.
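Because vectors from different providers are not comparable, a cheap sanity check before comparing two embeddings is to confirm they have the same number of dimensions. The sketch below is a hypothetical guard, not a Lightpack API. Note that matching dimensions do not prove the vectors came from the same model, so this only catches the most obvious mix-ups.

```php
<?php
// Hypothetical guard: refuses to compare vectors of different lengths.
function assertSameDimensions(array $a, array $b): void
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException(
            sprintf('Embedding dimensions differ: %d vs %d', count($a), count($b))
        );
    }
}

try {
    assertSameDimensions([0.1, 0.2], [0.1, 0.2, 0.3]);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```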
similar()
Use for: Finding semantically similar items.
$queryEmbedding = ai()->embed('laptop for programming');
$results = ai()->similar($queryEmbedding, $products);
foreach ($results as $result) {
echo $result['id']; // Product ID
echo $result['similarity']; // 0.0-1.0 score
print_r($result['item']); // Original item data (an array)
}
Method signature:
ai()->similar(
array $queryEmbedding, // Required: Query vector
mixed $target, // Required: Array of items (in-memory) or collection name (vector DB)
int $limit = 5, // Optional: Max results (default: 5)
float $threshold = 0.0 // Optional: Min similarity score (default: 0.0)
): array
Returns:
[
[
'id' => mixed, // Item identifier
'similarity' => float, // Score 0.0-1.0
'item' => array // Original item data
],
// ... more results
]
Key points:
- Performs exact search (100% recall, not approximate)
- Works in-memory by default (fast for < 5K items)
- Scores range from 0.0 (different) to 1.0 (identical)
- Results sorted by similarity (highest first)
- Threshold defaults to 0.0 (returns all results) - set to 0.6-0.8 for quality filtering
- Extensible via setVectorSearch() for custom implementations
Threshold recommendations:
- 0.0 (default) - Return all results, let user decide
- 0.6-0.7 - Moderate similarity (related items)
- 0.8-0.9 - High similarity (very similar items)
- 0.95+ - Near-identical items
Example Recipes 1: Storing Embeddings
1. Store Product Embeddings
// In your model
class Product extends Model
{
protected $casts = [
'embedding' => 'array'
];
}
// store the embeddings for the product
$product->embedding = ai()->embed($product->description);
$product->save();
2. Semantic Product Search
// fetch products with their embeddings
$items = Product::query()
->select('id', 'embedding')
->whereNotNull('embedding')
->all()
->map(fn($p) => [
'id' => $p->id,
'embedding' => $p->embedding
]);
// semantic search for similar products
$query = 'best laptop for programming';
$queryEmbedding = ai()->embed($query);
$results = ai()->similar($queryEmbedding, $items, limit: 10);
// process results
foreach ($results as $result) {
// Product ID: $result['id'];
}
Example Recipes 2: Filtering and RAG
1. Filter by Similarity Threshold
// Only return matches with 70%+ similarity
$results = ai()->similar($queryEmbedding, $items, limit: 10, threshold: 0.7);
2. RAG (Retrieval Augmented Generation)
// Find relevant docs
$userQuestion = 'How do I reset my password?';
$queryEmbedding = ai()->embed($userQuestion);
$relevant = ai()->similar($queryEmbedding, $docs, limit: 3);
// Build context (assumes each doc item carries its text in a 'content' field)
$context = implode("\n\n", array_map(fn($r) => $r['item']['content'], $relevant));
// Ask AI with context
$answer = ai()->task()
->system("Answer based on this documentation:\n\n{$context}")
->prompt($userQuestion)
->run();
3. Content Recommendations
$articles = Article::query()
->select('id', 'embedding')
->whereNotNull('embedding')
->all()
->map(fn($a) => [
'id' => $a->id,
'embedding' => $a->embedding
]);
// "Users who read this also read..."
$similar = ai()->similar($article->embedding, $articles, limit: 5);
Multimodal AI (Vision & Documents)
Use for: Analyzing images, PDFs, and documents with AI vision capabilities.
Lightpack AI supports multimodal inputs, allowing you to combine text prompts with images and documents. The framework automatically handles provider-specific formats.
Image Analysis
Analyze an image from file path (recommended):
$result = ai()->task()
->text('What items are in this receipt and what is the total?')
->attachImage('receipt.jpg') // Auto-detects MIME type and encodes
->run();
echo $result['raw']; // "The receipt contains: Coffee $4.50, Sandwich $8.99. Total: $13.49"
Analyze an image from base64:
// Manual encoding (use attachImage() instead for convenience)
$imageData = base64_encode(file_get_contents('receipt.jpg'));
$result = ai()->task()
->text('What items are in this receipt and what is the total?')
->image($imageData, 'image/jpeg')
->run();
Analyze an image from URL:
$result = ai()->task()
->text('Describe this image in detail')
->imageUrl('https://example.com/photo.jpg')
->run();
Extract structured data from images:
$result = ai()->task()
->text('Extract the business card information')
->attachImage('business-card.png')
->expect(['name' => 'string', 'email' => 'string', 'phone' => 'string', 'company' => 'string'])
->required('name', 'email')
->run();
if ($result['success']) {
echo $result['data']['name']; // "John Smith"
echo $result['data']['email']; // "john@example.com"
echo $result['data']['company']; // "Acme Corp"
}
Document Analysis
Analyze a PDF document from file path (recommended):
$result = ai()->task()
->text('Summarize this invoice and extract key details')
->attachDocument('invoice.pdf') // Auto-detects MIME type and encodes
->run();
echo $result['raw'];
Analyze a PDF from base64:
// Manual encoding (use attachDocument() instead for convenience)
$pdfData = base64_encode(file_get_contents('invoice.pdf'));
$result = ai()->task()
->text('Summarize this invoice and extract key details')
->document($pdfData, 'application/pdf')
->run();
Extract structured data from documents:
$result = ai()->task()
->text('Extract invoice details')
->attachDocument('invoice.pdf')
->expect([
'invoice_number' => 'string',
'date' => 'string',
'vendor' => 'string',
'total' => 'number',
'items' => 'array'
])
->required('invoice_number', 'total')
->run();
if ($result['success']) {
echo $result['data']['invoice_number']; // "INV-2024-001"
echo $result['data']['total']; // 1250.00
}
Combine multiple images:
$result = ai()->task()
->text('Compare these two product images and list the differences')
->attachImage('product-v1.jpg')
->attachImage('product-v2.jpg')
->run();
Supported MIME types:
- Images: image/jpeg, image/png, image/gif, image/webp
- Documents: application/pdf (OpenAI, Anthropic, Gemini)
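When you encode files manually with image() or document(), you have to supply the MIME type yourself. The sketch below is a simple extension-based lookup restricted to the types listed above. guessMimeType() is a hypothetical helper and says nothing about how attachImage() or attachDocument() detect types internally.

```php
<?php
// Hypothetical extension-based lookup limited to the MIME types listed above.
function guessMimeType(string $path): ?string
{
    $map = [
        'jpg'  => 'image/jpeg',
        'jpeg' => 'image/jpeg',
        'png'  => 'image/png',
        'gif'  => 'image/gif',
        'webp' => 'image/webp',
        'pdf'  => 'application/pdf',
    ];

    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    // Null signals an unsupported file type to the caller.
    return $map[$extension] ?? null;
}

var_dump(guessMimeType('receipt.JPG'));
var_dump(guessMimeType('notes.txt'));
```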
Agent Mode: Multi-Turn Problem Solving
Use for: Complex tasks that require multiple steps, tool calls, or reasoning cycles.
What is Agent Mode?
Agent mode allows AI to work through problems step-by-step across multiple "turns" until it achieves the goal. Each turn, the agent can:
- Call a tool to get data
- Analyze the results
- Decide what to do next
- Stop when the goal is achieved
Think of it like this:
- Single-turn (default): AI does ONE thing and stops
- Agent mode: AI does MULTIPLE things until the problem is solved
When to Use Agent Mode
| Scenario | Single-Turn | Agent Mode |
|---|---|---|
| "What's the weather?" | ✅ One API call | ❌ Overkill |
| "Find cheapest flight" | ❌ Returns 50 options | ✅ Searches → filters → compares → picks best |
| "Debug this error" | ❌ Shows error | ✅ Reads logs → traces code → finds root cause |
| "Analyze sales trends" | ❌ Dumps data | ✅ Queries → analyzes → identifies patterns |
Basic Agent Example
Without Agent Mode (Single-Turn):
$result = ai()->task()
->tool('search_products', $searchFn)
->prompt('Find laptops under $1000')
->run();
// Returns: 50 laptops
// Problem: User has to manually filter through them
With Agent Mode (Multi-Turn):
$result = ai()->task()
->tool('search_products', $searchFn)
->tool('filter_by_price', $filterFn)
->tool('check_reviews', $reviewFn)
->tool('compare_specs', $compareFn)
->loop(5) // Allow up to 5 thinking cycles
->goal('Find the BEST laptop under $1000')
->prompt('Find the best laptop under $1000')
->run();
// Agent automatically:
// Turn 1: search_products() → 50 laptops
// Turn 2: filter_by_price(max: 1000) → 12 laptops
// Turn 3: check_reviews(min_rating: 4) → 5 laptops
// Turn 4: compare_specs() → picks top 2
// Turn 5: Returns: "I recommend the Dell XPS 13 because..."
Agent Mode Methods
loop(int $maxTurns = 10)
Enable multi-turn mode with a maximum number of attempts.
->loop(5) // Allow up to 5 turns
->loop(10) // Allow up to 10 turns (default)
->loop(20) // For complex research tasks
How to choose the number:
- Simple tasks (2-3 turns): Quick operations like "get data → format"
- Medium tasks (5-10 turns): Multi-step processes like "search → filter → compare"
- Complex tasks (10-20 turns): Deep analysis like "research → analyze → synthesize"
- Research tasks (20-50 turns): Comprehensive investigations
Safety: Agent automatically stops when:
- ✅ Goal is achieved
- ✅ No more tools needed (got final answer)
- ✅ Max turns reached (prevents infinite loops)
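The stop conditions above can be sketched as a plain loop. This is a simplified illustration, not Lightpack's agent implementation: runAgentLoop(), the $step callable that stands in for one model call, and the returned completed flag are all assumptions.

```php
<?php
// Hypothetical agent loop. $step stands in for one model call and returns
// either ['tool' => name, 'args' => [...]] or ['final' => 'answer text'].
function runAgentLoop(callable $step, array $tools, int $maxTurns = 10): array
{
    $memory = [];

    for ($turn = 1; $turn <= $maxTurns; $turn++) {
        $decision = $step($memory);

        if (isset($decision['final'])) {
            // No more tools needed: the model produced its final answer.
            return ['raw' => $decision['final'], 'agent_turns' => $turn, 'completed' => true];
        }

        // Run the requested tool and remember its output for the next turn.
        $output = $tools[$decision['tool']]($decision['args']);
        $memory[] = ['turn' => $turn, 'tool' => $decision['tool'], 'output' => $output];
    }

    // Max turns reached without a final answer.
    return ['raw' => null, 'agent_turns' => $maxTurns, 'completed' => false];
}

$tools = ['add' => fn(array $args) => $args['a'] + $args['b']];

// Scripted "model": call the tool once, then answer using its output.
$step = fn(array $memory) => $memory === []
    ? ['tool' => 'add', 'args' => ['a' => 5, 'b' => 3]]
    : ['final' => 'The answer is ' . $memory[0]['output']];

$result = runAgentLoop($step, $tools, 5);
print_r($result);
```

The turn cap is what bounds cost: a step function that never returns a final answer still terminates after $maxTurns iterations.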
goal(string $goal)
Set an explicit objective for the agent to work toward.
->goal('Find the root cause of the bug')
->goal('Identify what is driving sales growth')
->goal('Create a complete market analysis')
With goal: Agent knows when to stop (when goal achieved)
Without goal: Agent stops when no more tools are needed
Real-World Examples
Example 1: E-Commerce Product Finder
Problem: User wants the best option, not just a list.
$result = ai()->task()
->tool('search_products', function($params) {
return db()->table('products')
->where('category', '=', $params['category'])
->all();
})
->tool('filter_by_price', function($params) {
return db()->table('products')
->where('price', '<=', $params['max_price'])
->all();
})
->tool('get_reviews', function($params) {
return db()->table('reviews')
->where('product_id', '=', $params['product_id'])
->avg('rating');
})
->loop(7)
->goal('Find the best value laptop for programming')
->prompt('I need a laptop for coding, budget $1200')
->run();
if ($result['goal_achieved']) {
echo $result['raw']; // "I recommend the ThinkPad X1 Carbon..."
// See what the agent did
foreach ($result['agent_memory'] as $turn) {
echo "Turn {$turn['turn']}: {$turn['content']}\n";
}
}
What the agent does:
- Searches laptops category
- Filters by price ≤ $1200
- Checks reviews for top options
- Compares specs (RAM, CPU, storage)
- Picks best value
- Explains recommendation
Example 2: Customer Support Automation
Problem: Resolve issues without human intervention.
$result = ai()->task()
->tool('get_order_status', function($params) {
return db()->table('orders')
->where('id', '=', $params['order_id'])
->first();
})
->tool('track_shipment', function($params) {
return http()->get("https://shipping-api.com/track/{$params['tracking_number']}");
})
->tool('estimate_delivery', function($params) {
return http()->get("https://shipping-api.com/estimate/{$params['tracking_number']}");
})
->loop(5)
->goal('Tell customer exactly when their order will arrive')
->prompt("Where is my order #12345?")
->run();
// Agent automatically:
// Turn 1: get_order_status(12345) → tracking: ABC123
// Turn 2: track_shipment(ABC123) → location: "Memphis, TN"
// Turn 3: estimate_delivery(ABC123) → "Jan 31, 2pm-5pm"
// Returns: "Your order is in Memphis and will arrive tomorrow between 2-5pm"
Example 3: Data Analysis Assistant
Problem: Need insights, not just raw data.
$result = ai()->task()
->tool('query_sales_data', function($params) {
return db()->table('sales')
->where('date', '>=', $params['start_date'])
->where('date', '<=', $params['end_date'])
->all();
})
->tool('calculate_trends', function($params) {
// Calculate month-over-month growth
return analytics()->trends($params['data']);
})
->tool('find_outliers', function($params) {
// Identify unusual spikes or drops
return analytics()->outliers($params['data']);
})
->loop(10)
->goal('Identify what is driving sales growth')
->prompt('Why did sales increase 23% last quarter?')
->run();
// Agent automatically:
// Turn 1: query_sales_data(last_quarter) → 10,000 rows
// Turn 2: calculate_trends() → "23% growth in Q4"
// Turn 3: find_outliers() → "Product X spiked 300%"
// Turn 4: query_sales_data(product_X) → detailed data
// Turn 5: Returns: "Product X drove growth due to viral TikTok video in November"
Example 4: Research Assistant
Problem: Gather and synthesize information from multiple sources.
$result = ai()->task()
->tool('search_web', function($params) {
return http()->get("https://search-api.com?q=" . urlencode($params['query']));
})
->tool('scrape_article', function($params) {
return http()->get($params['url'])->body();
})
->tool('summarize_text', function($params) {
return ai()->task()
->prompt("Summarize: {$params['text']}")
->run()['raw'];
})
->loop(15)
->goal('Create comprehensive report on AI trends in 2026')
->prompt('Research AI trends in 2026 and create a report')
->run();
// Agent automatically:
// Turn 1-3: Searches multiple sources
// Turn 4-8: Scrapes and reads articles
// Turn 9-12: Summarizes key findings
// Turn 13-15: Synthesizes into coherent report
Agent Memory
Every turn is stored in memory, allowing the agent to build context:
$result = ai()->task()
->tool('get_data', $fn)
->tool('analyze', $fn)
->loop(5)
->prompt('Analyze user behavior')
->run();
// Access memory
foreach ($result['agent_memory'] as $turn) {
echo "Turn {$turn['turn']}: ";
echo "Role: {$turn['role']}\n";
echo "Content: {$turn['content']}\n";
echo "Tools used: " . implode(', ', $turn['tools_used'] ?? []) . "\n\n";
}
Memory structure:
[
['role' => 'user', 'content' => 'Analyze user behavior', 'turn' => 0],
['role' => 'assistant', 'content' => '...', 'tools_used' => ['get_data'], 'turn' => 1],
['role' => 'assistant', 'content' => '...', 'tools_used' => ['analyze'], 'turn' => 2],
// ...
]
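Since the memory is a plain array, it is easy to post-process. As an illustration, the sketch below lists each distinct tool in the order it was first used, given an array shaped like the structure above. toolsInOrder() is a hypothetical helper.

```php
<?php
// Hypothetical helper: lists each distinct tool in first-use order
// from a memory array shaped like the structure above.
function toolsInOrder(array $memory): array
{
    $tools = [];

    foreach ($memory as $entry) {
        // User turns have no 'tools_used' key, so default to an empty list.
        foreach ($entry['tools_used'] ?? [] as $tool) {
            if (!in_array($tool, $tools, true)) {
                $tools[] = $tool;
            }
        }
    }

    return $tools;
}

$memory = [
    ['role' => 'user', 'content' => 'Analyze user behavior', 'turn' => 0],
    ['role' => 'assistant', 'content' => '...', 'tools_used' => ['get_data'], 'turn' => 1],
    ['role' => 'assistant', 'content' => '...', 'tools_used' => ['analyze'], 'turn' => 2],
    ['role' => 'assistant', 'content' => '...', 'tools_used' => ['get_data'], 'turn' => 3],
];

print_r(toolsInOrder($memory));
```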
Agent Response Format
Agent mode returns additional fields:
[
'success' => true,
'data' => null, // Or structured data if expect() was used
'raw' => 'The final answer...',
'errors' => [],
// Agent-specific fields
'goal_achieved' => true,
'agent_turns' => 3, // How many turns it took
'agent_memory' => [...], // Full conversation history
'tools_used' => ['search', 'filter', 'compare'],
'tool_results' => ['search' => [...], 'filter' => [...], ...]
]
Best Practices
1. Start with fewer turns, increase if needed:
// Start conservative
->loop(3)
// If agent hits max turns without completing, increase
->loop(7)
2. Use explicit goals for complex tasks:
// Vague (agent might not know when to stop)
->prompt('Research competitors')
// Clear (agent knows the objective)
->goal('Create competitive analysis with pricing, features, and market position')
->prompt('Research our top 3 competitors')
3. Provide focused tools:
// Too broad (agent might get confused)
->tool('do_everything', $fn)
// Focused (agent can reason better)
->tool('search_products', $searchFn)
->tool('filter_results', $filterFn)
->tool('get_details', $detailsFn)
4. Check goal achievement:
if ($result['goal_achieved']) {
// Task completed successfully
log()->info("Agent completed in {$result['agent_turns']} turns");
} else {
// Hit max turns or failed
log()->warning("Agent did not complete goal", $result['errors']);
}
5. Monitor turn usage:
// Log for optimization
log()->info("Agent used {$result['agent_turns']} of {$maxTurns} turns");
// If consistently hitting max, increase limit or simplify task
if ($result['agent_turns'] >= $maxTurns) {
// Consider: more turns, simpler goal, or better tools
}
Common Patterns
Pattern 1: Search → Filter → Select
ai()->task()
->tool('search', $searchFn)
->tool('filter', $filterFn)
->tool('select_best', $selectFn)
->loop(5)
->goal('Find best option')
->run();
Pattern 2: Gather → Analyze → Report
ai()->task()
->tool('fetch_data', $fetchFn)
->tool('analyze_data', $analyzeFn)
->tool('generate_report', $reportFn)
->loop(10)
->goal('Create analysis report')
->run();
Pattern 3: Diagnose → Fix → Verify
ai()->task()
->tool('check_logs', $logsFn)
->tool('trace_error', $traceFn)
->tool('verify_fix', $verifyFn)
->loop(7)
->goal('Find and fix the bug')
->run();
Troubleshooting
Agent hits max turns without completing:
- Increase loop() number
- Simplify the goal
- Add more specific tools
- Check if tools are returning useful data
Agent stops too early:
- Set explicit goal()
- Make goal more specific
- Ensure tools return actionable data
Agent calls wrong tools:
- Improve tool descriptions
- Make parameter schemas clearer
- Reduce number of similar tools
Agent is too slow:
- Reduce loop() number
- Optimize tool execution time
- Cache expensive operations
- Use faster AI model
When you register a tool by class name, the framework automatically extracts description() and params() from the class, so you don't need to repeat them when registering the tool.
use Lightpack\AI\Tools\ToolInterface;
class SearchProducts implements ToolInterface
{
public function __invoke(array $params): mixed
{
return db()->table('products')
->where('category', '=', $params['category'])
->where('price', '<=', $params['max_price'])
->all();
}
public static function description(): string
{
return 'Search products by category and price';
}
public static function params(): array
{
return [
'category' => ['string', 'Product category'],
'max_price' => ['number', 'Maximum price']
];
}
}
// Usage - description and params are auto-extracted
$result = ai()->task()
->tool('search', SearchProducts::class)
->prompt('Find laptops under $1000')
->run();
4. Tool Result Format
$result = ai()->task()
->tool('calculate', fn($p) => $p['a'] + $p['b'])
->prompt('What is 5 + 3?')
->run();
[
'success' => true,
'data' => null, // Only set when using expect()
'raw' => 'The answer is 8', // AI's natural language response
'errors' => [],
'tools_used' => ['calculate'], // Which tools were called
'tool_results' => [
'calculate' => 8 // Raw tool output
]
]
Key Points:
- Without loop(), AI calls ONE tool per request (single-shot, not multi-step)
- AI decides which tool to call (or none) based on the question
- Tools receive validated parameters (type-checked and coerced)
- Tool results are passed back to AI to generate natural language answer
- Tools can be closures, invokable objects, or class strings
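To illustrate what "type-checked and coerced" can mean in practice, here is a rough standalone sketch that casts incoming arguments to the declared schema types. coerceParams() and its exact casting rules are assumptions. Lightpack's actual behavior may differ, for instance in how it treats invalid values.

```php
<?php
// Hypothetical coercion step; Lightpack's actual rules may differ.
function coerceParams(array $params, array $schema): array
{
    $coerced = [];

    foreach ($schema as $name => $type) {
        if (!array_key_exists($name, $params)) {
            throw new InvalidArgumentException("Missing parameter: {$name}");
        }

        $value = $params[$name];

        // Cast each value to the type declared in the schema.
        $coerced[$name] = match ($type) {
            'int'    => (int) $value,
            'number' => (float) $value,
            'bool'   => filter_var($value, FILTER_VALIDATE_BOOLEAN),
            'array'  => (array) $value,
            default  => (string) $value,
        };
    }

    return $coerced;
}

$args = coerceParams(
    ['a' => '5', 'b' => '3.5', 'on' => 'true'],
    ['a' => 'int', 'b' => 'number', 'on' => 'bool']
);
var_dump($args);
```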
Vector Search: Architecture & Extensibility
When you call ai()->similar(), Lightpack uses a VectorSearchInterface implementation to find matches. By default, this is InMemoryVectorSearch, but you can swap it for Qdrant, Meilisearch, or any custom implementation. Lightpack's vector search follows a simple principle: start simple, scale when needed. The default in-memory implementation is good enough for most real-world applications, and you can upgrade to a vector database when you outgrow it.
// Default behavior - uses InMemoryVectorSearch automatically
$results = ai()->similar($queryEmbedding, $items);
// Custom implementation - swap to vector database
ai()->setVectorSearch(new QdrantVectorSearch());
$results = ai()->similar($queryEmbedding, 'products_collection');
Default: InMemoryVectorSearch
What it is: A brute-force cosine similarity search that compares your query against every item in memory.
| Aspect | Details |
|---|---|
| Algorithm | Brute-force cosine similarity (O(n)) |
| Accuracy | 100% recall (exact, not approximate) |
| Performance | ~20-250ms for < 5K items |
| Memory | ~3 KB per item (embeddings only) |
| Scale | < 5K documents, < 50 concurrent searches/sec |
Note: These figures are rough estimates, not hard benchmarks.
Example:
// Load only embeddings (not full models!)
$items = Product::query()
->select('id', 'embedding')
->whereNotNull('embedding')
->all()
->map(fn($p) => [
'id' => $p->id,
'embedding' => $p->embedding
]);
// Search for similar items
$results = ai()->similar($queryEmbedding, $items, limit: 10);
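To make the brute-force approach concrete, here is a minimal sketch of an exact cosine-similarity search in plain PHP. It is not Lightpack's actual InMemoryVectorSearch code; `cosineSimilarity()` and `bruteForceSearch()` are illustrative names, and items are assumed to be arrays with `id` and `embedding` keys.

```php
<?php
/** Cosine similarity between two equal-length numeric vectors. */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    // A zero vector has no direction; treat it as "not similar".
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }

    // Raw cosine similarity lies in [-1, 1]. Whether Lightpack clamps
    // or rescales the score is not shown here.
    return $dot / (sqrt($normA) * sqrt($normB));
}

/** Scores every item against the query (O(n)), then keeps the top $limit. */
function bruteForceSearch(array $query, array $items, int $limit = 5): array
{
    $results = [];

    foreach ($items as $item) {
        $results[] = [
            'id' => $item['id'],
            'similarity' => cosineSimilarity($query, $item['embedding']),
            'item' => $item,
        ];
    }

    // Highest similarity first.
    usort($results, fn($x, $y) => $y['similarity'] <=> $x['similarity']);

    return array_slice($results, 0, $limit);
}
```

Every item is scored on every query, which is why recall is exact and why latency grows linearly with the number of items.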
Extending with Vector Databases
For larger scale, implement VectorSearchInterface against a vector database:
use Lightpack\AI\VectorSearch\VectorSearchInterface;
class QdrantVectorSearch implements VectorSearchInterface
{
public function __construct(private $client) {}
public function search(array $queryEmbedding, mixed $target, int $limit = 5, array $options = []): array
{
// $target is collection name for vector DBs
$response = $this->client->search($target, [
'vector' => $queryEmbedding,
'limit' => $limit,
'score_threshold' => $options['threshold'] ?? 0.0
]);
return $this->formatResults($response);
}
private function formatResults($response): array
{
// Must return same format as InMemoryVectorSearch
return array_map(fn($hit) => [
'id' => $hit['id'],
'similarity' => $hit['score'],
'item' => $hit['payload']
], $response['result']);
}
}
// Use custom implementation
$vectorSearch = app(QdrantVectorSearch::class);
ai()->setVectorSearch($vectorSearch);
// Now similar() uses Qdrant
$results = ai()->similar($queryEmbedding, 'products_collection', limit: 10);
Interface contract:
interface VectorSearchInterface
{
/**
* @param array $queryEmbedding The query vector
* @param mixed $target For in-memory: array of items. For vector DBs: collection name
* @param int $limit Max results to return
* @param array $options Implementation-specific options (threshold, filters, etc.)
* @return array Array of results with 'id', 'similarity', and 'item' keys
*/
public function search(array $queryEmbedding, mixed $target, int $limit = 5, array $options = []): array;
}
Return format (must match):
[
[
'id' => mixed, // Required: Item identifier
'similarity' => float, // Required: Score 0.0-1.0
'item' => array // Required: Item data
],
// ...
]
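When writing a custom implementation, it can help to check its output against this format in a test. The helper below is a hypothetical sketch (`resultFormatProblems()` is not part of Lightpack) that reports rows which do not match the documented shape.

```php
<?php
/**
 * Hypothetical helper: lists problems in a search() return value.
 * An empty array means every row matches the documented shape.
 */
function resultFormatProblems(array $results): array
{
    $problems = [];

    foreach ($results as $index => $row) {
        if (!is_array($row)) {
            $problems[] = "Row {$index}: not an array";
            continue;
        }

        // All three keys are required.
        foreach (['id', 'similarity', 'item'] as $key) {
            if (!array_key_exists($key, $row)) {
                $problems[] = "Row {$index}: missing '{$key}'";
            }
        }

        // Score must be a number in the documented 0.0-1.0 range.
        $score = $row['similarity'] ?? null;
        $isNumber = is_int($score) || is_float($score);
        if ($score !== null && (!$isNumber || $score < 0.0 || $score > 1.0)) {
            $problems[] = "Row {$index}: similarity must be between 0.0 and 1.0";
        }

        if (array_key_exists('item', $row) && !is_array($row['item'])) {
            $problems[] = "Row {$index}: item must be an array";
        }
    }

    return $problems;
}
```

Running this against the output of a new backend in a unit test catches format drift before it reaches code that consumes `similar()` results.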
Caching
Lightpack AI supports provider-level caching (delegated to the AI provider) for response optimization.
Enable caching:
ai()->task()
->prompt('Extract email from: john@example.com')
->cache(true) // Enable provider-level caching
->cacheTtl(3600) // Cache duration in seconds (default: 3600)
->run();
How it works:
- Cache parameters (cache, cache_ttl) are passed to the AI provider
- Provider handles caching based on request parameters (model, messages, temperature, etc.)
- Cache key is generated from: model, messages, temperature, max_tokens, system
- Cached responses are returned instantly without API calls
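One way to picture the cache key is as a hash over the five listed request fields. The sketch below is an assumption for illustration only: the source does not specify how the key is actually derived, and `cacheKey()` is a hypothetical function.

```php
<?php
/**
 * Hypothetical sketch: derives a cache key from the request fields that
 * the documentation lists. The real derivation is not specified.
 */
function cacheKey(array $params): string
{
    $fields = ['model', 'messages', 'temperature', 'max_tokens', 'system'];

    // Pick the fields in a fixed order so that the key does not depend
    // on the order of $params, and unrelated options are ignored.
    $relevant = [];
    foreach ($fields as $field) {
        $relevant[$field] = $params[$field] ?? null;
    }

    return hash('sha256', json_encode($relevant));
}
```

Under this model, two requests share a cached response only if all five fields match exactly, so changing the prompt or the temperature produces a different key.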
When to cache:
- ✅ Deterministic tasks (temperature: 0)
- ✅ Expensive structured data extraction
- ✅ Repeated identical queries
- ✅ Classification/categorization tasks
When NOT to cache:
- ❌ Creative writing (temperature > 0.7)
- ❌ Real-time data queries
- ❌ Personalized responses
- ❌ Tool-based tasks (tool results may change)
Note: Not all providers support caching. Check provider documentation for details.
Error Handling
Exception-based errors (thrown):
- API connection failures
- Invalid API keys
- Network timeouts
- Provider-specific errors
Validation errors (in result):
- Missing required fields
- Schema type mismatches
- Tool parameter validation failures
- Tool execution errors
Always check success flag:
$result = ai()->task()
->expect(['name' => 'string'])
->required('name')
->prompt('Extract name from: John Doe')
->run();
if ($result['success']) {
echo $result['data']['name'];
} else {
// Handle validation errors
foreach ($result['errors'] as $error) {
logger()->error('AI validation error: ' . $error);
}
}
Tool execution errors:
$result = ai()->task()
->tool('search', function($params) {
throw new \Exception('Database connection failed');
})
->prompt('Search products')
->run();
// $result['success'] = false
// $result['errors'] = ['Tool execution failed: Database connection failed']
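Both error channels can be handled in one place. The wrapper below is a minimal sketch, assuming the task is passed in as a callable that returns the result array; `runSafely()` and its return shape are hypothetical, not part of Lightpack.

```php
<?php
/**
 * Hypothetical wrapper: normalizes thrown exceptions and failed results
 * into one shape: ['ok' => bool, 'data' => mixed, 'errors' => string[]].
 *
 * Intended usage (not executed here):
 *   runSafely(fn() => ai()->task()->prompt('...')->run());
 */
function runSafely(callable $run): array
{
    try {
        $result = $run();
    } catch (\Throwable $e) {
        // Connection failures, invalid keys, timeouts, provider errors.
        return [
            'ok' => false,
            'data' => null,
            'errors' => ['Request failed: ' . $e->getMessage()],
        ];
    }

    // Validation and tool errors arrive in the result, not as exceptions.
    if (empty($result['success'])) {
        return [
            'ok' => false,
            'data' => null,
            'errors' => $result['errors'] ?? ['Unknown error'],
        ];
    }

    return ['ok' => true, 'data' => $result['data'] ?? null, 'errors' => []];
}
```

Callers then branch on a single `ok` flag, while the error messages still show which channel the failure came from.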
Security & Best Practices
API Keys:
- ✅ Store in environment variables or secure config
- ❌ Never commit to version control
- ✅ Use different keys for dev/staging/production
Cost Control:
- Set maxTokens() to limit response length
- Use temperature(0.0) for deterministic tasks (repeatable output caches well)
- Enable caching for repeated queries
- Monitor token usage via provider dashboards
Production Readiness:
- Always check $result['success'] before using data
- Log AI errors and validation failures
- Set reasonable timeouts (default: 10s)
- Use required() fields for critical data extraction
- Validate tool results before using in application
Tool Security:
- Validate and sanitize tool parameters
- Don't trust AI-generated user IDs - get them from auth/session
- Limit tool access to necessary data only
- Never expose sensitive operations as tools
- Log all tool executions for audit trails
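The points above can be combined in a single tool closure. This is a sketch under stated assumptions: `makeOrderLookupTool()`, the `$findOrders` and `$audit` callables, and the status values are all hypothetical. The user ID is bound from the authenticated session when the tool is built, so any `user_id` supplied by the AI is ignored.

```php
<?php
/**
 * Hypothetical tool factory illustrating the checklist above.
 *
 * @param int      $sessionUserId ID from auth/session, never from the AI
 * @param callable $findOrders    fn(int $userId, string $status): array
 * @param callable $audit         fn(string $event, array $context): void
 */
function makeOrderLookupTool(int $sessionUserId, callable $findOrders, callable $audit): \Closure
{
    return function (array $params) use ($sessionUserId, $findOrders, $audit): array {
        // Validate against an allow-list instead of trusting the value.
        $status = $params['status'] ?? 'open';
        if (!in_array($status, ['open', 'shipped', 'cancelled'], true)) {
            throw new \InvalidArgumentException('Unsupported status filter');
        }

        // Record every execution for the audit trail.
        $audit('order_lookup', ['user_id' => $sessionUserId, 'status' => $status]);

        // Scope the query to the session user; $params['user_id'] is unused.
        return $findOrders($sessionUserId, $status);
    };
}
```

The returned closure can be registered like any other tool, for example `->tool('orders', makeOrderLookupTool($userId, $findOrders, $audit))`.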
Data Privacy:
- Be aware: the AI provider sees all prompts and responses
- Don't send PII unless necessary and compliant
- Consider data retention policies of AI providers
- Use anonymization where possible
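As a starting point for anonymization, prompts can be scrubbed before they leave the application. The function below is a deliberately simple sketch: `redactPii()` is a hypothetical helper, and its patterns catch only common email and phone formats, so it does not replace a proper compliance review.

```php
<?php
/**
 * Hypothetical helper: masks email addresses and phone-like numbers
 * before text is sent to an AI provider.
 */
function redactPii(string $text): string
{
    // Email addresses (simple pattern, not a full RFC 5322 matcher).
    $text = preg_replace(
        '/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/',
        '[EMAIL]',
        $text
    ) ?? $text;

    // Phone-like runs: a digit, 5+ digits or separators, then a digit.
    $text = preg_replace('/\+?\d[\d\s().-]{5,}\d/', '[PHONE]', $text) ?? $text;

    return $text;
}

$prompt = redactPii('Contact john@example.com or +1 555-123-4567 today');
// $prompt is now 'Contact [EMAIL] or [PHONE] today'
```

Apply the redaction at the boundary where prompts are built, so that the raw values are never sent to the provider.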