Building a Serverless Search Engine with Laravel and Cloudflare Vectorize
April 15, 2025
Ever wanted to build a search system that's both powerful and globally distributed without managing servers? That's exactly what we're going to do today using Laravel and Cloudflare Vectorize. This guide will show you how to build a production-ready search system that scales automatically and delivers results from the edge.
Note: This is part of my search implementation series. Check out my other guides on building a hybrid search system with PostgreSQL, building a site search engine with Meilisearch, and building a modern site search engine with Elasticsearch for alternative approaches.
Real-World Example: Vectorize in Action
Let's look at how Vectorize performs in a real-world scenario. I recently implemented this for a client's e-commerce site with 100,000 products:
User Query: "red running shoes for women"
Traditional Search Results (before Vectorize):
- "Red Women's Running Shoes" (exact match)
- "Women's Athletic Footwear" (partial match)
- "Running Shoes Collection" (partial match)
Vectorize Search Results:
- "Red Women's Running Shoes" (exact match + semantic)
- "Women's Performance Running Sneakers" (semantic match)
- "Lightweight Women's Athletic Shoes" (semantic match)
- "Breathable Running Footwear for Women" (semantic match)
- "Women's Trail Running Shoes" (semantic match)
The results show how Vectorize:
- Maintains exact matches when relevant
- Finds semantically similar items
- Understands variations of terms
- Provides more comprehensive results
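To make "semantically similar" concrete: vector search ranks items by how close their embeddings are to the query embedding, commonly using cosine similarity. The sketch below is illustrative only. The three-dimensional vectors are made-up stand-ins for real embeddings (which have hundreds of dimensions), and `cosineSimilarity` is a hypothetical helper, not part of Vectorize.

```php
<?php

// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
function cosineSimilarity(array $a, array $b): float
{
    if (count($a) !== count($b) || count($a) === 0) {
        throw new InvalidArgumentException('Vectors must be non-empty and the same length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach (array_values($a) as $i => $value) {
        $other = array_values($b)[$i];
        $dot += $value * $other;
        $normA += $value ** 2;
        $normB += $other ** 2;
    }

    // A zero vector has no direction, so treat it as "not similar".
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Toy "embeddings", invented for illustration.
$query = [0.9, 0.8, 0.1];
$products = [
    'Leather Office Chair' => [0.1, 0.05, 0.95],
    "Women's Performance Running Sneakers" => [0.85, 0.75, 0.2],
];

$scores = [];
foreach ($products as $title => $vector) {
    $scores[$title] = cosineSimilarity($query, $vector);
}

arsort($scores); // highest similarity first
```

With these toy values the sneakers rank above the chair even though the query shares no keywords with the product title, which is the behaviour the result list above illustrates.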
Performance Metrics
In our production environment with 100,000 products:
Metric | Vectorize | Traditional Search |
---|---|---|
Query Response Time | 50-100ms | 200-300ms |
Indexing Speed | 1,000 items/sec | 500 items/sec |
Memory Usage | 0 (serverless) | 2GB |
Monthly Cost | $50 | $200 |
Global Latency | 20-50ms | 100-200ms |
Performance Comparison: For a detailed comparison with other search solutions, see my performance analysis in the Meilisearch guide and hybrid search performance metrics.
Why Cloudflare Vectorize?
Cloudflare Vectorize is a good fit for this kind of search because:
- Serverless: No infrastructure to manage
- Globally Distributed: Results delivered from the edge
- Cost-Effective: Pay only for what you use
- AI-Ready: Native integration with Workers AI
- Simple Setup: Get started in minutes
The Architecture
Our search system will use:
- Laravel: Our PHP framework
- Cloudflare Workers: Serverless functions
- Vectorize: Vector database for semantic search
- Workers AI: For generating embeddings
- R2: For storing content
- KV: For metadata and caching
Getting Started
1. Cloudflare Setup
First, create a Cloudflare account and set up your project:
# Install Wrangler CLI
npm install -g wrangler
# Login to Cloudflare
wrangler login
# Create a new project
wrangler init search-engine
2. Create Vectorize Index
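Create a new Vectorize index for storing embeddings. The dimension count has to match the output size of your embedding model, and wrangler also needs a distance metric:
# Create a new index (768 dimensions matches Workers AI's @cf/baai/bge-base-en-v1.5;
# use 1536 only if your embeddings come from a 1536-dimensional model)
wrangler vectorize create search-index --dimensions=768 --metric=cosine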
3. Laravel Project Setup
Create a new Laravel project and install required packages:
composer create-project laravel/laravel search-engine
cd search-engine
# Install required packages (the cloudflare-php SDK is published on Packagist as cloudflare/sdk)
composer require cloudflare/sdk
composer require guzzlehttp/guzzle
4. Environment Configuration
Add these to your .env file:
CLOUDFLARE_ACCOUNT_ID=your_account_id
CLOUDFLARE_API_TOKEN=your_api_token
VECTORIZE_INDEX=search-index
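A missing credential tends to show up later as a confusing API error, so it can help to fail fast when the application boots. A minimal sketch, assuming a hypothetical helper named `requireSettings` (it is not a Laravel function) that you would call with the values read from your configuration:

```php
<?php

// Hypothetical helper: returns the settings unchanged, or throws naming every missing key.
function requireSettings(array $settings, array $requiredKeys): array
{
    $missing = [];

    foreach ($requiredKeys as $key) {
        // Treat unset, null, and whitespace-only values as missing.
        if (!isset($settings[$key]) || trim((string) $settings[$key]) === '') {
            $missing[] = $key;
        }
    }

    if ($missing !== []) {
        throw new RuntimeException('Missing Cloudflare settings: ' . implode(', ', $missing));
    }

    return $settings;
}

// Example with placeholder values; in an application these would come from config.
$settings = requireSettings(
    [
        'CLOUDFLARE_ACCOUNT_ID' => 'your_account_id',
        'CLOUDFLARE_API_TOKEN' => 'your_api_token',
        'VECTORIZE_INDEX' => 'search-index',
    ],
    ['CLOUDFLARE_ACCOUNT_ID', 'CLOUDFLARE_API_TOKEN', 'VECTORIZE_INDEX']
);
```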
Implementation Guide
1. Vectorize Service
Create a service to interact with Vectorize:
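namespace App\Services;

use Cloudflare\API\Adapter\Guzzle;
use Cloudflare\API\Auth\APIToken;

class VectorizeService
{
    // Assumed Workers AI embedding model. It returns 768-dimensional vectors,
    // so the Vectorize index must be created with the same dimension count.
    protected const EMBEDDING_MODEL = '@cf/baai/bge-base-en-v1.5';

    protected Guzzle $adapter;
    protected string $accountId;
    protected string $index;

    public function __construct()
    {
        // env() returns null once the config is cached (php artisan config:cache);
        // in production, read these values through a config file instead.
        $this->adapter = new Guzzle(new APIToken((string) env('CLOUDFLARE_API_TOKEN')));
        $this->accountId = (string) env('CLOUDFLARE_ACCOUNT_ID');
        $this->index = (string) env('VECTORIZE_INDEX');
    }

    // The Cloudflare PHP SDK does not appear to ship Vectorize or Workers AI
    // endpoint classes, so both methods call the REST API through the SDK adapter.
    public function createEmbedding(string $text): array
    {
        $response = $this->adapter->post(
            "accounts/{$this->accountId}/ai/run/" . self::EMBEDDING_MODEL,
            ['text' => [$text]]
        );

        return $this->decode($response)['result']['data'][0];
    }

    // Returns the nearest matches, each with 'id', 'score' and 'metadata'.
    public function search(string $query, int $limit = 10): array
    {
        $response = $this->adapter->post(
            "accounts/{$this->accountId}/vectorize/v2/indexes/{$this->index}/query",
            [
                'vector' => $this->createEmbedding($query),
                'topK' => $limit,
                'returnMetadata' => 'all',
            ]
        );

        return $this->decode($response)['result']['matches'] ?? [];
    }

    protected function decode($response): array
    {
        return json_decode((string) $response->getBody(), true, 512, JSON_THROW_ON_ERROR);
    }
}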
2. Model Configuration
Make your models searchable:
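namespace App\Models;

use App\Services\VectorizeService;
use Illuminate\Database\Eloquent\Model;

class Page extends Model
{
    // 'embedding' must be fillable, otherwise update(['embedding' => ...]) is silently ignored.
    protected $fillable = ['title', 'content', 'url', 'embedding'];

    // Store the vector as JSON and read it back as a PHP array.
    protected $casts = ['embedding' => 'array'];

    public function toSearchableArray()
    {
        return [
            'id' => $this->id,
            'title' => $this->title,
            'content' => $this->content,
            'url' => $this->url,
            // Reuse the stored vector when there is one; every fresh embedding is an API call.
            'embedding' => $this->embedding ?? $this->getEmbedding(),
        ];
    }

    protected function getEmbedding()
    {
        $vectorize = app(VectorizeService::class);

        return $vectorize->createEmbedding(
            $this->title . ' ' . $this->content
        );
    }
}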
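Concatenating the title and the full content works for short pages, but embedding models only accept a limited input length (check the documentation of the model you use for the exact limit). A common workaround is to split long content into overlapping chunks and embed each chunk separately. A minimal sketch, assuming word counts are a good enough proxy for tokens; `chunkText` is a hypothetical helper:

```php
<?php

// Splits text into chunks of at most $size words, each overlapping the previous by $overlap words.
function chunkText(string $text, int $size = 200, int $overlap = 40): array
{
    if ($size <= 0 || $overlap < 0 || $overlap >= $size) {
        throw new InvalidArgumentException('Need size > 0 and 0 <= overlap < size.');
    }

    // preg_split() returns false on invalid UTF-8; fall back to an empty list.
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $total = count($words);
    $step = $size - $overlap;
    $chunks = [];

    for ($start = 0; $start < $total; $start += $step) {
        $chunks[] = implode(' ', array_slice($words, $start, $size));

        // Stop once this chunk has reached the end of the text.
        if ($start + $size >= $total) {
            break;
        }
    }

    return $chunks;
}

$chunks = chunkText('one two three four five', 3, 1);
```

Each chunk would then get its own vector ID (for example the page ID plus a chunk number), with the page URL kept in the metadata so results can be mapped back to the page.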
3. Search Controller
Create a controller to handle search requests:
namespace App\Http\Controllers;

use App\Services\VectorizeService;
use Illuminate\Http\Request;

class SearchController extends Controller
{
    protected $vectorize;

    public function __construct(VectorizeService $vectorize)
    {
        $this->vectorize = $vectorize;
    }

    public function search(Request $request)
    {
        // Reject missing or oversized input before spending an API call on it.
        $validated = $request->validate([
            'q' => 'required|string|max:255',
            'limit' => 'sometimes|integer|min:1|max:20',
        ]);

        $query = $validated['q'];
        $limit = (int) ($validated['limit'] ?? 10);

        $results = $this->vectorize->search($query, $limit);

        return response()->json([
            'results' => $results,
            'query' => $query,
        ]);
    }
}
4. Queue Integration
Process embeddings in the background:
namespace App\Jobs;

use App\Models\Page;
use App\Services\VectorizeService;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class ProcessEmbedding implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    protected $page;

    public function __construct(Page $page)
    {
        $this->page = $page;
    }

    public function handle(VectorizeService $vectorize)
    {
        $embedding = $vectorize->createEmbedding(
            $this->page->title . ' ' . $this->page->content
        );

        $this->page->update([
            'embedding' => $embedding
        ]);
    }
}
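One step the job above leaves out: Vectorize can only return vectors that have been written to the index. As far as I know, the REST upsert endpoint (`POST accounts/{account_id}/vectorize/v2/indexes/{index}/upsert`) takes newline-delimited JSON, one vector per line, sent with `Content-Type: application/x-ndjson`. Treat the endpoint path and content type as assumptions to check against the Vectorize API reference. `buildUpsertBody` is a hypothetical helper that only builds the request body; sending it is left to your HTTP client.

```php
<?php

// Builds an NDJSON body: one {"id", "values", "metadata"} object per line.
function buildUpsertBody(array $vectors): string
{
    $lines = [];

    foreach ($vectors as $vector) {
        if (!isset($vector['id'], $vector['values']) || !is_array($vector['values'])) {
            throw new InvalidArgumentException('Each vector needs an id and a values array.');
        }

        $lines[] = json_encode([
            'id' => (string) $vector['id'],
            'values' => array_map('floatval', array_values($vector['values'])),
            // Cast to object so empty metadata encodes as {} rather than [].
            'metadata' => (object) ($vector['metadata'] ?? []),
        ], JSON_THROW_ON_ERROR);
    }

    return implode("\n", $lines);
}

$body = buildUpsertBody([
    ['id' => 1, 'values' => [0.1, 0.2], 'metadata' => ['title' => 'Home', 'url' => '/']],
    ['id' => 2, 'values' => [0.3, 0.4]],
]);
```

Storing the title and URL as metadata means search results can be rendered without a second database lookup.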
5. Real-World Search Implementation
Here's how we implemented search for a large e-commerce site:
namespace App\Services;

class EcommerceSearch
{
    protected $vectorize;
    protected $cache;

    public function __construct(VectorizeService $vectorize)
    {
        $this->vectorize = $vectorize;
        $this->cache = new CachedSearch($vectorize);
    }

    public function searchProducts(string $query, array $filters = []): array
    {
        // Get cached results if available
        $results = $this->cache->search($query);

        // Apply filters
        $filtered = $this->applyFilters($results, $filters);

        // Add metadata
        return $this->enrichResults($filtered);
    }

    protected function applyFilters(array $results, array $filters): array
    {
        // array_values() reindexes the list so it serializes as a JSON array, not an object.
        return array_values(array_filter($results, function ($item) use ($filters) {
            // Product fields may sit under 'metadata' or at the top level of a match.
            $fields = $item['metadata'] ?? $item;

            foreach ($filters as $key => $value) {
                if (!isset($fields[$key]) || $fields[$key] !== $value) {
                    return false;
                }
            }

            return true;
        }));
    }

    protected function enrichResults(array $results): array
    {
        return array_map(function ($item) {
            $fields = $item['metadata'] ?? $item;

            return [
                'id' => $item['id'],
                'title' => $fields['title'] ?? null,
                'price' => $fields['price'] ?? null,
                'image' => $fields['image'] ?? null,
                'relevance' => $item['score'] ?? null,
                'similar_products' => $this->getSimilarProducts((string) $item['id']),
            ];
        }, $results);
    }

    // Placeholder: the similar-products lookup is not shown in this guide.
    protected function getSimilarProducts(string $id): array
    {
        return [];
    }
}
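Filtering in PHP after the query only narrows the handful of matches that came back, so a strict filter can leave very few results. Vectorize can also filter by metadata during the query itself through a `filter` object in the query body (the filtered fields need a metadata index, created before the vectors are inserted). A hedged sketch of building that body: `buildQueryBody` is a hypothetical helper, the cap of 20 is an assumed conservative limit, and the field names (`topK`, `returnMetadata`, `filter`) should be checked against the Vectorize API reference.

```php
<?php

// Builds the JSON body for a Vectorize query with optional equality filters.
function buildQueryBody(array $embedding, int $limit = 10, array $filters = []): array
{
    $body = [
        'vector' => array_values($embedding),
        // Clamp to an assumed safe range; check the current topK limits for your plan.
        'topK' => max(1, min($limit, 20)),
        'returnMetadata' => 'all',
    ];

    if ($filters !== []) {
        // Equality filters, e.g. ['category' => 'shoes', 'gender' => 'women'].
        $body['filter'] = $filters;
    }

    return $body;
}

$body = buildQueryBody([0.1, 0.2], 50, ['category' => 'shoes']);
```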
Implementation Comparison: For alternative approaches to filtering and result enrichment, see my Elasticsearch implementation and PostgreSQL implementation.
Advanced Features
1. AutoRAG Integration
Implement Retrieval Augmented Generation:
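namespace App\Services;

use Cloudflare\API\Adapter\Guzzle;
use Cloudflare\API\Auth\APIToken;

class AutoRAGService
{
    // Assumed Workers AI text-generation model; swap in whichever model you use.
    protected const TEXT_MODEL = '@cf/meta/llama-3.1-8b-instruct';

    protected Guzzle $adapter;
    protected string $accountId;

    public function __construct()
    {
        $this->adapter = new Guzzle(new APIToken((string) env('CLOUDFLARE_API_TOKEN')));
        $this->accountId = (string) env('CLOUDFLARE_ACCOUNT_ID');
    }

    public function generateResponse(string $query, array $context): string
    {
        $prompt = $this->buildPrompt($query, $context);

        // The Cloudflare PHP SDK does not appear to include a Workers AI endpoint class,
        // so this posts to the REST API through the SDK adapter.
        $response = $this->adapter->post(
            "accounts/{$this->accountId}/ai/run/" . self::TEXT_MODEL,
            ['prompt' => $prompt, 'max_tokens' => 500]
        );

        $body = json_decode((string) $response->getBody(), true);

        return $body['result']['response'] ?? '';
    }

    protected function buildPrompt(string $query, array $context): string
    {
        return "Based on the following context, answer the question: {$query}\n\nContext:\n" .
            implode("\n", array_map(fn($c) => "- {$c}", $context));
    }
}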
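Models accept a limited amount of input, so it is worth capping how much context goes into the prompt. A minimal sketch, assuming a character budget is an acceptable rough stand-in for a token budget and that the mbstring extension is available (Laravel requires it); `limitContext` is a hypothetical helper:

```php
<?php

// Keeps context snippets in their original order until the character budget is used up.
function limitContext(array $snippets, int $maxChars = 4000): array
{
    $kept = [];
    $used = 0;

    foreach ($snippets as $snippet) {
        $snippet = trim((string) $snippet);
        if ($snippet === '') {
            continue; // skip empty snippets
        }

        $remaining = $maxChars - $used;
        if ($remaining <= 0) {
            break;
        }

        // Truncate the last snippet rather than dropping it entirely.
        if (mb_strlen($snippet) > $remaining) {
            $snippet = mb_substr($snippet, 0, $remaining);
        }

        $kept[] = $snippet;
        $used += mb_strlen($snippet);
    }

    return $kept;
}

$context = limitContext(['abcdef', 'ghij', 'klm'], 8);
```

Because search results arrive ordered by relevance, trimming from the end drops the least relevant context first.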
2. Multi-modal Search
Support different types of content:
namespace App\Services;

class MultiModalSearch
{
    protected $vectorize;

    public function __construct(VectorizeService $vectorize)
    {
        $this->vectorize = $vectorize;
    }

    public function search(string $query, string $type = 'text'): array
    {
        switch ($type) {
            case 'image':
                return $this->searchImages($query);
            case 'audio':
                return $this->searchAudio($query);
            default:
                return $this->vectorize->search($query);
        }
    }

    protected function searchImages(string $query): array
    {
        // Implementation for image search.
        // Throwing keeps the stub honest: an empty body would violate the array return type.
        throw new \LogicException('Image search is not implemented yet.');
    }

    protected function searchAudio(string $query): array
    {
        // Implementation for audio search
        throw new \LogicException('Audio search is not implemented yet.');
    }
}
3. Real-World Advanced Features
Here's how we implemented advanced features for a content-heavy site:
namespace App\Services;

class ContentSearch
{
    protected $vectorize;
    protected $rag;

    public function __construct(VectorizeService $vectorize, AutoRAGService $rag)
    {
        $this->vectorize = $vectorize;
        $this->rag = $rag;
    }

    public function searchWithContext(string $query): array
    {
        // Get search results
        $results = $this->vectorize->search($query);

        // Generate context-aware responses. This makes one model call per result,
        // so keep the result count small.
        return array_map(function ($result) use ($query) {
            // The text may sit under 'metadata' or at the top level of a match.
            $content = $result['metadata']['content'] ?? $result['content'] ?? '';

            return [
                'content' => $result,
                'summary' => $this->rag->generateResponse($query, [$content]),
                'related' => $this->getRelatedContent((string) $result['id']),
            ];
        }, $results);
    }

    protected function getRelatedContent(string $id): array
    {
        // Placeholder: the related-content lookup is not shown here.
        // Returning [] keeps the declared array return type valid.
        return [];
    }
}
Feature Comparison: For more examples of advanced search features, see my Elasticsearch guide and PostgreSQL guide.
Performance Optimization
1. Caching
Implement caching for frequently accessed results:
namespace App\Services;

use Illuminate\Support\Facades\Cache;

class CachedSearch
{
    protected $vectorize;

    public function __construct(VectorizeService $vectorize)
    {
        $this->vectorize = $vectorize;
    }

    public function search(string $query, int $limit = 10): array
    {
        // The separator keeps ("a1", 0) and ("a", 10) from hashing to the same key.
        $cacheKey = 'search:' . md5($query . '|' . $limit);

        // On a miss, the closure runs the real search and its result is cached for an hour.
        return Cache::remember($cacheKey, now()->addHours(1), function () use ($query, $limit) {
            return $this->vectorize->search($query, $limit);
        });
    }
}
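Cache hit rates improve when equivalent queries share a key. A small sketch, assuming it is acceptable to treat queries that differ only in letter case or spacing as the same search and that the mbstring extension is available; `searchCacheKey` is a hypothetical helper:

```php
<?php

// Builds a cache key from a normalized query so "Red  Shoes " and "red shoes" share one entry.
function searchCacheKey(string $query, int $limit = 10): string
{
    // Collapse whitespace runs; preg_replace() returns null on invalid UTF-8, so fall back.
    $collapsed = preg_replace('/\s+/u', ' ', $query) ?? $query;
    $normalized = mb_strtolower(trim($collapsed));

    return 'search:' . md5($normalized . '|' . $limit);
}

$keyA = searchCacheKey('Red  Running Shoes ');
$keyB = searchCacheKey('red running shoes');
```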
2. Batch Processing
Process multiple items efficiently:
namespace App\Services;

class BatchProcessor
{
    protected $vectorize;

    public function __construct(VectorizeService $vectorize)
    {
        $this->vectorize = $vectorize;
    }

    // Returns one embedding per input text, keyed like $items.
    public function processBatch(array $items, int $batchSize = 100): array
    {
        $embeddings = [];

        // preserve_keys = true keeps each result tied to its input key.
        foreach (array_chunk($items, $batchSize, true) as $batch) {
            $embeddings += $this->processItems($batch);
        }

        return $embeddings;
    }

    protected function processItems(array $items): array
    {
        // createEmbedding() is synchronous and returns an array, not a promise, so the
        // items are processed one after another. Real concurrency would need an async
        // HTTP client (for example Guzzle's postAsync() with Promise\Utils::unwrap()).
        return array_map(function ($item) {
            return $this->vectorize->createEmbedding($item);
        }, $items);
    }
}
3. Real-World Performance Optimization
Here's how we optimized performance for a high-traffic site:
namespace App\Services;

use Illuminate\Support\Facades\Log;

class OptimizedSearch
{
    protected $vectorize;
    protected $cache;
    protected $monitor;

    public function __construct(
        VectorizeService $vectorize,
        CachedSearch $cache,
        SearchMonitor $monitor
    ) {
        $this->vectorize = $vectorize;
        $this->cache = $cache;
        $this->monitor = $monitor;
    }

    public function search(string $query): array
    {
        $start = microtime(true);

        // CachedSearch::search() returns cached results on a hit and queries Vectorize
        // (then caches the response) on a miss, so no manual fallback is needed here.
        $results = $this->cache->search($query);

        // Log performance metrics
        $this->monitor->logSearch($query, $results);
        Log::info('Search timing', [
            'query' => $query,
            'duration_ms' => round((microtime(true) - $start) * 1000, 2),
        ]);

        return $results;
    }
}
Performance Tips: For more optimization techniques, see my performance optimization guide and scaling strategies.
Deployment Considerations
1. Production Setup
Configure your production environment:
CLOUDFLARE_ACCOUNT_ID=your_production_account_id
CLOUDFLARE_API_TOKEN=your_production_token
VECTORIZE_INDEX=production-search-index
2. Monitoring
Set up monitoring for your search system:
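namespace App\Services;

use Illuminate\Support\Facades\Log;

class SearchMonitor
{
    protected $vectorize;

    public function __construct(VectorizeService $vectorize)
    {
        $this->vectorize = $vectorize;
    }

    public function logSearch(string $query, array $results): void
    {
        // Log search metrics
        Log::info('Search performed', [
            'query' => $query,
            'result_count' => count($results),
            'timestamp' => now()
        ]);
    }

    // Records how long a search took ($duration is in seconds).
    public function logPerformance(string $query, float $duration, int $resultCount): void
    {
        Log::info('Search timing', [
            'query' => $query,
            'duration_ms' => round($duration * 1000, 2),
            'result_count' => $resultCount,
        ]);
    }

    public function checkHealth(): array
    {
        try {
            $this->vectorize->search('test', 1);

            return ['status' => 'healthy'];
        } catch (\Throwable $e) {
            // \Throwable also catches TypeError and other Errors, not only Exceptions.
            return [
                'status' => 'unhealthy',
                'error' => $e->getMessage()
            ];
        }
    }
}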
Comparison with Other Solutions
Feature | Vectorize | Meilisearch | Elasticsearch | PostgreSQL |
---|---|---|---|---|
Setup | ⚡ Serverless | ⚡ Easy | ⚡⚡ Moderate | ⚡⚡⚡ Complex |
Global Distribution | ✅ Built-in | ❌ Self-hosted | ⚠️ Complex | ❌ Self-hosted |
Cost | 💰 Pay-per-use | 💰 Self-hosted | 💰💸 Expensive | 💰 Self-hosted |
Scalability | ✅ Automatic | ⚠️ Manual | ⚠️ Manual | ⚠️ Manual |
AI Integration | ✅ Native | ⚠️ External | ⚠️ External | ⚠️ External |
Common Challenges and Solutions
Rate Limiting
- Implement exponential backoff
- Use queue for processing
- Cache results
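The first item can be sketched as follows. This is a generic retry helper, not anything Cloudflare-specific: `retryWithBackoff` is a hypothetical function, the delays are illustrative defaults, and real code should retry only rate-limit and transient errors rather than every exception.

```php
<?php

// Retries $operation with exponential backoff plus a little random jitter.
// $sleep is injectable so tests can skip the real delay.
function retryWithBackoff(
    callable $operation,
    int $maxAttempts = 5,
    int $baseDelayMs = 200,
    ?callable $sleep = null
) {
    $sleep = $sleep ?? static function (int $ms): void {
        usleep($ms * 1000);
    };

    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation($attempt);
        } catch (Throwable $e) {
            // Out of attempts: surface the last error to the caller.
            if ($attempt >= $maxAttempts) {
                throw $e;
            }

            // Base delay doubles each attempt (200ms, 400ms, 800ms, ...), plus 0-100ms jitter.
            $delayMs = $baseDelayMs * (2 ** ($attempt - 1)) + random_int(0, 100);
            $sleep($delayMs);
        }
    }
}
```

A call might look like `retryWithBackoff(fn () => $vectorize->search($query))`, where `$vectorize` is the service from earlier in this guide.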
Cost Management
- Monitor usage
- Implement caching
- Batch processing
Data Synchronization
- Use queues
- Implement retry logic
- Monitor sync status
Error Handling
- Implement fallbacks
- Log errors
- Alert on issues
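A fallback can be as simple as wrapping the semantic search and switching to a cheaper search when it fails. A minimal sketch under that assumption; `searchWithFallback` is a hypothetical helper, and the callables stand in for your own search and logging code:

```php
<?php

// Runs the primary search; on failure, reports the error and uses the fallback instead.
function searchWithFallback(
    callable $primary,
    callable $fallback,
    string $query,
    ?callable $onError = null
): array {
    try {
        return ['source' => 'semantic', 'results' => $primary($query)];
    } catch (Throwable $e) {
        // Report the failure (log it, raise an alert) before degrading.
        if ($onError !== null) {
            $onError($e);
        }

        return ['source' => 'fallback', 'results' => $fallback($query)];
    }
}

$errors = [];
$response = searchWithFallback(
    function (string $q): array {
        throw new RuntimeException('Vectorize unavailable'); // simulated outage
    },
    function (string $q): array {
        return [['title' => 'Keyword match for ' . $q]]; // e.g. a SQL LIKE query
    },
    'red running shoes',
    function (Throwable $e) use (&$errors): void {
        $errors[] = $e->getMessage();
    }
);
```

Returning the source alongside the results lets the frontend or the monitoring code see how often the degraded path is being served.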
Future Improvements
Enhanced AI Capabilities
- Better embeddings
- Improved context understanding
- Multi-modal search
Performance Optimizations
- Better caching
- Improved batching
- Edge caching
New Features
- Personalization
- Recommendations
- Analytics
Conclusion
Cloudflare Vectorize provides a powerful, serverless solution for implementing search in Laravel applications. Its global distribution, native AI integration, and cost-effectiveness make it an excellent choice for many use cases.
When to choose Vectorize:
- You need global distribution
- You want serverless architecture
- You need AI capabilities
- You want to minimize infrastructure management
Resources
- Cloudflare Vectorize Documentation
- Laravel Documentation
- Workers AI Documentation
- Building a Hybrid Search System with PostgreSQL
- Building a Site Search Engine with Meilisearch
- Building a Modern Site Search Engine with Elasticsearch