Overview
Swiss Army Llama is a comprehensive FastAPI service that brings together state-of-the-art language models and embedding technologies for semantic text search. It's designed for production environments with a focus on reliability, scalability, and ease of use.
Key Features
- Multiple Embedding Models - Pluggable support for a range of embedding backends, including sentence-transformers, OpenAI, and custom models
- Semantic Search - Advanced similarity search across document collections
- Vector Database Integration - Works with popular vector stores like Pinecone, Weaviate, and Qdrant
- Production Ready - Built-in rate limiting, caching, and monitoring
- FastAPI Framework - Modern async Python with automatic OpenAPI documentation
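To illustrate the caching behavior mentioned above, here is a minimal sketch of how a service like this can memoize embedding calls per model. The `MODELS` registry and the `"toy-hash"` embedder are hypothetical stand-ins, not the project's actual API; a real deployment would register genuine embedding models.

```python
from functools import lru_cache

# Hypothetical model registry: maps a model name to an embedding function.
# "toy-hash" is a placeholder embedder for illustration only.
MODELS = {
    "toy-hash": lambda text: [float(ord(c) % 7) for c in text[:8]],
}

@lru_cache(maxsize=1024)
def embed(model_name: str, text: str) -> tuple:
    # Cache results so repeated requests for the same (model, text)
    # pair skip recomputation -- the pattern a production service uses
    # to avoid re-embedding identical inputs.
    return tuple(MODELS[model_name](text))
```

Keying the cache on the `(model_name, text)` pair matters: the same text embedded by two different models yields different vectors, so neither result may be reused for the other.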
Use Cases
- Document Search - Search through large document collections with natural language queries
- Recommendation Systems - Build content-based recommendations using semantic similarity
- Question Answering - Create RAG (Retrieval Augmented Generation) pipelines
- Duplicate Detection - Find semantically similar content across your data
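The duplicate-detection use case boils down to comparing embedding vectors pairwise. The sketch below shows the idea with plain cosine similarity over toy vectors; the threshold value and the brute-force pairwise loop are illustrative assumptions (a production system would typically use an approximate nearest-neighbor index instead).

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def find_duplicates(embeddings, threshold=0.95):
    # Return index pairs whose embeddings exceed the similarity threshold.
    # Brute-force O(n^2) comparison -- fine for small collections.
    pairs = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine_similarity(embeddings[i], embeddings[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```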
Getting Started
```bash
# Clone the repository
git clone https://github.com/Dicklesworthstone/swiss_army_llama.git
cd swiss_army_llama

# Install dependencies
pip install -r requirements.txt

# Run the server
uvicorn main:app --reload
```

Why Use Swiss Army Llama?
Unlike simple embedding services, Swiss Army Llama provides a complete toolkit for building semantic search applications. It handles the complexity of managing multiple models, caching results, and scaling to production workloads.
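At its core, the semantic search described above is embedding a query and ranking documents by vector similarity. The following is a minimal sketch of that ranking step under the assumption that embeddings have already been computed; the function names here are illustrative, not part of the project's API.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def top_k(query_vec, doc_vecs, k=3):
    # Rank documents by similarity to the query and return the
    # indices of the best k matches, most similar first.
    scored = sorted(enumerate(doc_vecs),
                    key=lambda iv: cosine(query_vec, iv[1]),
                    reverse=True)
    return [i for i, _ in scored[:k]]
```

A full service adds the pieces this sketch omits: computing the embeddings in the first place, persisting them in a vector store, and serving the ranking behind a rate-limited API.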