The Open Source
Hub for AI Models
World-class, self-hosted model repository designed for vLLM and SGLang. Secure your AI assets with RBAC, accelerate distribution with proxy caching, and manage TB-scale models effortlessly.
MatrixHub is to Hugging Face what Harbor is to Docker Hub.
Stop relying on the public internet for mission-critical AI. Control your assets, accelerate your pipelines.
Core Features
Infrastructure Designed for Scale
Built for the specific needs of SREs and algorithm engineers managing massive model weights.
HF Transparent Proxy
Drop-in replacement. Point your HF_ENDPOINT at MatrixHub and keep your existing training/inference code unchanged (see the example below the feature list).
Intranet Cache
Pull once, cache forever. Drastically reduce bandwidth costs and accelerate cluster-wide model distribution.
Enterprise RBAC
Fine-grained permissions, audit logs, and multi-tenant isolation for security-conscious organizations.
TB-Scale Transfer
Optimized for massive files: resumable uploads, chunked transfers, and P2P distribution keep TB-scale moves stable.
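A minimal sketch of the drop-in switch described above, assuming a MatrixHub instance at https://matrixhub.internal (the hostname is illustrative, not a published address). HF_ENDPOINT is read by huggingface_hub when it is imported, so set it first:

```python
import os

# Point the Hugging Face client at your MatrixHub instance.
# The hostname is an assumption -- substitute your deployment's URL.
# HF_ENDPOINT must be set before huggingface_hub is imported.
os.environ["HF_ENDPOINT"] = "https://matrixhub.internal"

from huggingface_hub import snapshot_download

# Existing download code runs unchanged; weights now resolve
# through the MatrixHub proxy cache instead of huggingface.co.
path = snapshot_download(repo_id="Qwen/Qwen2.5-7B-Instruct")
print(path)
```

The same environment variable works for huggingface-cli and any library that downloads through huggingface_hub, which is what makes the proxy transparent.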
Key Use Cases
How organizations use MatrixHub in production.
High-Speed Cluster Distribution
When deploying a 70B model to 100 GPUs, MatrixHub acts as a local pull-through cache, eliminating the internet bottleneck (see the launch command after these use cases).
Air-Gapped Environments
Securely ferry models from the public internet to isolated high-security networks with strict audit trails and scanning.
Asset & Version Management
Centralize your fine-tuned checkpoints (LoRA/Full) with immutable version tags, treating models as production artifacts.
Cross-Region Sync
Automatically synchronize model registries across different geographic data centers for low-latency inference.
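For the cluster-distribution case, a node can launch vLLM with its weight download routed through the local cache. A hedged sketch, assuming the same illustrative hostname and that the target model is hosted in HF format (vLLM fetches weights via huggingface_hub, which honors HF_ENDPOINT):

```bash
# Hostname and model are illustrative; any HF-format repo works.
# Redirecting HF_ENDPOINT routes the pull through the MatrixHub
# cache, so 100 nodes hit the intranet instead of the internet.
export HF_ENDPOINT=https://matrixhub.internal
vllm serve meta-llama/Llama-3.1-70B-Instruct --tensor-parallel-size 8
```

The first node to pull warms the cache; every subsequent node downloads at intranet speed.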
SEAMLESSLY INTEGRATED WITH
vLLM
SGLang
Kubernetes
MinIO
Ready to take control of your AI models?
Deploy MatrixHub in minutes using Docker Compose or Helm. Open source and free for the community.
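As a rough starting point, a single-node Docker Compose sketch might look like the following. The image name, port, and volume are assumptions for illustration, not the project's published values; consult the MatrixHub documentation for the real manifest:

```yaml
# All names below are illustrative placeholders, not the
# project's published values -- check the MatrixHub docs.
services:
  matrixhub:
    image: matrixhub/matrixhub:latest      # hypothetical image tag
    restart: unless-stopped
    ports:
      - "8080:8080"                        # assumed API/registry port
    volumes:
      - matrixhub-data:/var/lib/matrixhub  # persist cached model blobs
volumes:
  matrixhub-data:
```

Bring it up with `docker compose up -d`, then point HF_ENDPOINT at the instance as shown in the feature example above.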