Architecture Components

Understanding the components that make up Horizons OmniChat is crucial for successful deployment and operation. Let’s explore how each piece works together to create a powerful, flexible chatbot platform.

Core Components: The Building Blocks of Horizons

Horizons consists of three primary components, each carefully designed to handle specific aspects of the platform’s functionality.

Open WebUI: Gateway to AI Interaction

Open WebUI is more than an interface: it is the command center of your Horizons deployment. Built with modern technologies, it provides a seamless experience for both users and administrators.

At its foundation, Open WebUI combines:

A responsive Svelte frontend that delivers lightning-fast interactions
A robust FastAPI backend handling complex operations
PostgreSQL persistence ensuring no valuable data is lost
Real-time WebSocket communications for instant responses

This powerful combination enables:

Intuitive chat interactions that feel natural and responsive
Comprehensive session management for user continuity
Flexible model selection and configuration options
Detailed chat history and conversation management

Ollama: Local Intelligence Engine

Ollama runs AI models directly within your infrastructure, so local inference does not depend on an external service.

Key capabilities include:

- Sophisticated model management for multiple AI models
- Optimized inference for maximum performance
- Intelligent resource utilization including GPU acceleration
- Support for a growing library of models including:
  - Llama 2 for general-purpose applications
  - Mistral for enhanced reasoning capabilities
  - TinyLlama for resource-constrained environments
  - Deepseek for specialized applications
  - Qwen for multilingual support
  - ALIA/Salamandra for Spanish-language excellence
  - Custom models for specific use cases
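To see which of these models are actually installed on a host, you can ask Ollama directly. The sketch below assumes Ollama is listening on its default local address and uses the /api/tags endpoint; the helper names are illustrative and are not part of Horizons.

```python
import json
from urllib.request import urlopen

# Assumption: Ollama is reachable at its default local address.
OLLAMA_URL = "http://localhost:11434"


def parse_model_names(tags_payload: dict) -> list[str]:
    """Extract sorted model names from an /api/tags response body."""
    return sorted(m["name"] for m in tags_payload.get("models", []))


def list_installed_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list[str]:
    """Query Ollama for the models currently installed on this host."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(json.load(resp))
```

Calling list_installed_models() against a running Ollama instance returns the installed model tags; parse_model_names can be used on its own when you already have the response body.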

Bedrock Gateway: Bridge to Cloud AI

The Bedrock Gateway exemplifies our approach to hybrid capabilities, providing seamless access to AWS’s powerful AI models while maintaining security and control. This component acts as an intelligent intermediary, handling:

- Sophisticated API compatibility and transformation
- Intelligent request routing and load distribution
- Robust authentication and rate management
- Access to premium models including:
  - Claude (Anthropic) for advanced reasoning
  - Titan (Amazon) for general applications
  - Nova (Amazon) for specialized tasks
  - Jurassic (AI21) for creative content
  - Command (Cohere) for business applications
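To make "API compatibility and transformation" more concrete, here is a minimal sketch of one possible translation step. It assumes the gateway receives an OpenAI-style chat payload and forwards it in the shape used by the Bedrock Converse API; the function name and the model ID are hypothetical, and the real gateway may work differently.

```python
def to_converse_request(payload: dict) -> dict:
    """Translate an OpenAI-style chat payload into a Converse-style request."""
    system, messages = [], []
    for msg in payload["messages"]:
        block = {"text": msg["content"]}
        if msg["role"] == "system":
            # Converse carries system prompts in a separate top-level field.
            system.append(block)
        else:
            messages.append({"role": msg["role"], "content": [block]})

    request = {"modelId": payload["model"], "messages": messages}
    if system:
        request["system"] = system

    # Map only the sampling options that were supplied by the caller.
    inference = {}
    if "max_tokens" in payload:
        inference["maxTokens"] = payload["max_tokens"]
    if "temperature" in payload:
        inference["temperature"] = payload["temperature"]
    if inference:
        request["inferenceConfig"] = inference
    return request
```

The sketch handles plain-text messages only; tool calls, images, and streaming would need additional mapping.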

Component Interactions

Understanding how these components work together is crucial for optimal deployment. Let’s explore the interaction patterns across different deployment modes:

Local Mode: Privacy-First Architecture

graph LR
    User --> WebUI
    WebUI --> Ollama
    Ollama --> LocalModels

In Local mode, all components interact within your infrastructure, so prompts and responses stay inside your environment while you keep full functionality. This architecture is perfect for:

Development and testing environments
Privacy-sensitive deployments
Offline operations
Initial evaluation and testing

Hybrid Mode: The Best of Both Worlds

graph LR
    User --> WebUI
    WebUI --> Ollama
    WebUI --> BedrockGateway
    Ollama --> LocalModels
    BedrockGateway --> AWSBedrock

Hybrid mode represents our flexible approach to deployment, combining local processing power with cloud capabilities. This architecture excels in:

Production environments needing both local and cloud models
Scenarios requiring enhanced model variety
Deployments with specific data sovereignty requirements
Cost-optimized production environments

AWS Mode: Enterprise-Scale Architecture

graph LR
    User --> ALB
    ALB --> WebUI-ECS-Fargate
    WebUI-ECS-Fargate --> Ollama-ECS-EC2
    WebUI-ECS-Fargate --> BedrockGateway-ECS-Fargate
    BedrockGateway-ECS-Fargate --> AWSBedrock
    Ollama-ECS-EC2 --> InstalledModels

AWS mode delivers enterprise-grade scalability and reliability. This sophisticated architecture provides:

Automatic scaling capabilities
High availability configurations
Enterprise-grade security
Comprehensive monitoring and management
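The three diagrams above can also be read as plain data. The sketch below encodes each mode as an adjacency map, using the node names from the diagrams, and checks whether a mode can reach AWS Bedrock. The helper names are illustrative only.

```python
# Edges copied from the Local, Hybrid, and AWS mode diagrams.
TOPOLOGIES = {
    "local": {
        "User": ["WebUI"],
        "WebUI": ["Ollama"],
        "Ollama": ["LocalModels"],
    },
    "hybrid": {
        "User": ["WebUI"],
        "WebUI": ["Ollama", "BedrockGateway"],
        "Ollama": ["LocalModels"],
        "BedrockGateway": ["AWSBedrock"],
    },
    "aws": {
        "User": ["ALB"],
        "ALB": ["WebUI-ECS-Fargate"],
        "WebUI-ECS-Fargate": ["Ollama-ECS-EC2", "BedrockGateway-ECS-Fargate"],
        "BedrockGateway-ECS-Fargate": ["AWSBedrock"],
        "Ollama-ECS-EC2": ["InstalledModels"],
    },
}


def reachable(mode: str, start: str = "User") -> set[str]:
    """Return every node a request can reach from `start` in the given mode."""
    graph = TOPOLOGIES[mode]
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def uses_cloud_models(mode: str) -> bool:
    """True when the mode includes a path from the user to AWS Bedrock."""
    return "AWSBedrock" in reachable(mode)
```

A check like uses_cloud_models("local") makes the privacy difference between the modes explicit: only Hybrid and AWS modes have a path to Bedrock.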

Understanding Data Flows: How Information Moves Through Horizons

The true power of Horizons lies not just in its components, but in how they work together to process and manage information. Let’s explore the key data flows that make everything work:

Chat Request Flow: From User to AI and Back

When a user interacts with Horizons, a sophisticated sequence of events occurs:

The user’s message begins its journey through WebUI
Our validation layer ensures the request meets all security and format requirements
Smart routing directs the request to either Ollama or Bedrock based on the selected model
The AI model processes the request and generates a response
The interaction is securely stored in PostgreSQL for future reference

Validation, routing, and storage typically add only a small overhead, so overall response time is driven mainly by model inference. The result is a seamless experience that maintains security and reliability.
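The request flow above can be sketched in a few lines. This is a simplified illustration, not the Horizons implementation: the model-to-backend mapping, the length limit, and the in-memory history list (standing in for PostgreSQL) are all assumptions.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from model name to the backend that serves it.
MODEL_BACKENDS = {"llama2": "ollama", "mistral": "ollama", "claude": "bedrock"}
MAX_MESSAGE_CHARS = 8000  # assumed limit, for illustration only


@dataclass
class ChatService:
    backends: dict  # backend name -> callable(model, message) -> reply text
    history: list = field(default_factory=list)  # stands in for PostgreSQL

    def validate(self, model: str, message: str) -> None:
        """Reject requests with an unknown model or an unusable message."""
        if model not in MODEL_BACKENDS:
            raise ValueError(f"unknown model: {model}")
        if not message.strip():
            raise ValueError("empty message")
        if len(message) > MAX_MESSAGE_CHARS:
            raise ValueError("message too long")

    def handle(self, user: str, model: str, message: str) -> str:
        """Validate, route to the right backend, then store the exchange."""
        self.validate(model, message)
        backend = self.backends[MODEL_BACKENDS[model]]
        reply = backend(model, message)
        self.history.append(
            {"user": user, "model": model, "message": message, "reply": reply}
        )
        return reply
```

In a real deployment the two backend callables would wrap HTTP calls to Ollama and the Bedrock Gateway; passing them in keeps the routing logic easy to test.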

Model Management: Keeping AI Updated and Optimized

Our model management flow ensures you always have the right AI models ready when needed:

Administrators can easily select and manage models through the intuitive interface
Ollama handles the secure download and installation of models
Each model is automatically optimized for your specific hardware configuration
Detailed model metadata is maintained for optimal performance and management
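Behind the interface, installing a model comes down to a pull request against Ollama. The sketch below only builds that request so you can inspect it; it assumes Ollama's default address and the /api/pull endpoint with a `model` field (older Ollama releases used `name`). The helper name is illustrative.

```python
import json
from urllib.request import Request


def build_pull_request(model: str, base_url: str = "http://localhost:11434") -> Request:
    """Build (but do not send) a request asking Ollama to download a model."""
    # stream=False asks for a single final status object instead of progress events.
    body = json.dumps({"model": model, "stream": False}).encode("utf-8")
    return Request(
        f"{base_url}/api/pull",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with urllib.request.urlopen(build_pull_request("tinyllama")) would start the download; large models can take a while, so a generous timeout is advisable.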

Authentication: Keeping Your System Secure

Security is paramount in Horizons, and our authentication flow reflects this:

Every user request goes through robust authentication
Credentials are validated against your security policies
Secure session tokens are generated for ongoing interactions
All subsequent requests are validated using these tokens

This ensures that every interaction is secure while maintaining a smooth user experience.
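As an illustration of the token-based flow described above, here is a minimal in-memory sketch. It is not the Horizons authentication code: the `verify` callback stands in for your credential check, and the one-hour lifetime is an assumed default.

```python
import hashlib
import secrets
import time


class SessionStore:
    """Issues random session tokens and validates them on later requests."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self._ttl = ttl_seconds
        self._sessions = {}  # token digest -> (user, expiry timestamp)

    @staticmethod
    def _digest(token: str) -> str:
        # Store only a hash so a leaked session table does not expose tokens.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def login(self, user, password, verify, now=None):
        """Return a new token if verify(user, password) succeeds, else None."""
        if not verify(user, password):
            return None
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._sessions[self._digest(token)] = (user, now + self._ttl)
        return token

    def authenticate(self, token, now=None):
        """Return the user for a valid, unexpired token, else None."""
        now = time.time() if now is None else now
        key = self._digest(token)
        entry = self._sessions.get(key)
        if entry is None:
            return None
        user, expiry = entry
        if now >= expiry:
            del self._sessions[key]  # expired sessions are removed on access
            return None
        return user
```

The optional `now` argument exists only to make expiry easy to test; production code would rely on the system clock.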

Scaling for Growth: Adapting to Your Needs

As your usage grows, Horizons grows with you. Our scaling capabilities help your system maintain performance as load increases:

Local and Hybrid Scaling: Optimizing Local Resources

In Local and Hybrid modes, we focus on maximizing your infrastructure’s potential:

Intelligent vertical scaling of Ollama to leverage available resources
Sophisticated PostgreSQL connection pooling for optimal database performance
Resource-aware model loading and unloading
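To illustrate what resource-aware loading and unloading can look like, here is a small least-recently-used cache with a memory budget. This is a conceptual sketch only: Ollama manages model residency itself, and the class name, sizes, and eviction policy here are assumptions.

```python
from collections import OrderedDict


class ModelCache:
    """Tracks loaded models and evicts the least recently used to fit a budget."""

    def __init__(self, budget_mb: int):
        self.budget_mb = budget_mb
        self._loaded = OrderedDict()  # model name -> size in MB, oldest first

    @property
    def used_mb(self) -> int:
        return sum(self._loaded.values())

    def acquire(self, name: str, size_mb: int) -> list:
        """Mark a model as in use, loading it if needed.

        Returns the names of any models that were unloaded to make room.
        """
        if size_mb > self.budget_mb:
            raise MemoryError(f"{name} does not fit in {self.budget_mb} MB")
        if name in self._loaded:
            self._loaded.move_to_end(name)  # refresh recency, nothing to evict
            return []
        evicted = []
        while self.used_mb + size_mb > self.budget_mb:
            victim, _ = self._loaded.popitem(last=False)  # drop the oldest
            evicted.append(victim)
        self._loaded[name] = size_mb
        return evicted
```

With a 10,000 MB budget, loading a third large model unloads whichever of the first two was used least recently, which keeps frequently used models warm.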

AWS Mode: Enterprise-Grade Scalability

AWS mode unleashes the full power of cloud scaling:

Automatic ECS scaling based on demand
Intelligent RDS scaling for database operations
Smart request distribution through Application Load Balancers
Efficient model storage using EFS
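For a sense of what demand-based ECS scaling involves, the sketch below builds the parameters for a target-tracking policy on average CPU utilization. The cluster and service names, the 60% target, and the cooldowns are illustrative assumptions, not Horizons defaults.

```python
def ecs_cpu_scaling_policy(cluster: str, service: str, target_cpu_percent: float = 60.0) -> dict:
    """Build Application Auto Scaling parameters for an ECS service."""
    return {
        "PolicyName": f"{service}-cpu-target-tracking",
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Add or remove tasks to hold average CPU near this value.
            "TargetValue": target_cpu_percent,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
            "ScaleOutCooldown": 60,   # seconds; assumed value
            "ScaleInCooldown": 300,   # seconds; assumed value
        },
    }
```

With boto3 installed, these parameters could be passed as keyword arguments to put_scaling_policy on an "application-autoscaling" client, after the service has been registered as a scalable target.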

Keeping Everything Healthy: Monitoring and Maintenance

Maintaining a healthy system requires vigilant monitoring. Horizons provides comprehensive health monitoring capabilities:

Health Checks

Each component provides detailed health information:

WebUI status through the /health endpoint
Ollama health via the /api/tags endpoint
Bedrock Gateway status via the /health endpoint
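A simple probe script can poll all three endpoints. The paths below come from the list above, but the hosts and ports are assumptions (11434 is Ollama's default; the other two depend on how your deployment is configured), and the function names are illustrative.

```python
from urllib.request import urlopen

# Assumed local addresses; adjust hosts and ports to match your deployment.
HEALTH_ENDPOINTS = {
    "webui": "http://localhost:8080/health",
    "ollama": "http://localhost:11434/api/tags",
    "bedrock-gateway": "http://localhost:8000/health",
}


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True when the endpoint answers with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error responses.
        return False


def check_all(endpoints=HEALTH_ENDPOINTS, probe=http_ok) -> dict:
    """Probe every component and return a name -> healthy mapping."""
    return {name: probe(url) for name, url in endpoints.items()}
```

Running check_all() returns a dictionary such as one entry per component with a True or False value, which is easy to feed into an alerting job or a container health check.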

Performance Metrics: Understanding the System

We track crucial metrics to ensure optimal performance:

Real-time request latency monitoring
Model inference time tracking
Detailed memory usage analysis
GPU utilization metrics
Database connection management
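As an example of how request latency and inference time might be tracked, here is a small recorder that times code blocks and reports percentiles. It is a sketch of the idea rather than the Horizons metrics pipeline; the class and metric names are hypothetical.

```python
import math
import time
from collections import defaultdict
from contextlib import contextmanager


class LatencyRecorder:
    """Collects duration samples per metric name and reports percentiles."""

    def __init__(self):
        self._samples = defaultdict(list)

    def record(self, name: str, seconds: float) -> None:
        self._samples[name].append(seconds)

    @contextmanager
    def track(self, name: str):
        """Time the enclosed block, recording it even if the block raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(name, time.perf_counter() - start)

    def percentile(self, name: str, pct: float) -> float:
        """Nearest-rank percentile of the samples recorded under `name`."""
        data = sorted(self._samples[name])
        if not data:
            raise ValueError(f"no samples recorded for {name}")
        rank = max(1, math.ceil(pct * len(data) / 100))
        return data[rank - 1]
```

Wrapping a model call in `with recorder.track("inference"):` collects samples, and percentile("inference", 95) then gives a tail-latency figure that is usually more informative than an average.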

Securing Your Deployment: A Multi-Layered Approach

Security isn’t just a feature in Horizons: it’s a fundamental aspect of every component.

Component-Level Security

We implement multiple security layers:

Enterprise-grade TLS encryption between all components
Intelligent rate limiting to prevent abuse (ENTERPRISE)
Comprehensive input validation at all entry points (ENTERPRISE)
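Rate limiting is commonly built on a token bucket, which allows short bursts while capping the sustained request rate. The sketch below shows that general technique; it is not a description of how Horizons implements its rate limiter, and the parameter values are arbitrary.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and refills at `rate_per_sec` tokens."""

    def __init__(self, rate_per_sec: float, capacity: int, now=None):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)  # start full so early bursts succeed
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None) -> bool:
        """Consume one token if available; return False when the caller must wait."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that has passed, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Keeping one bucket per user or API key, and rejecting a request whenever allow() returns False, gives a simple per-client limit.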

Your Next Steps

Ready to dive deeper? Here’s where to go next:

Understand our complete Security Architecture
Learn about different Deployment Options
Master Operations for your deployment

Horizons OmniChat by evereven
