Case study · AI Colosseum

Building AI Colosseum, a multi-AI orchestration platform.

How we built an AI orchestration platform that unifies 16 models (9 standard + 7 premium variants), Round Table multi-AI consensus, multi-agent Workflows, anti-detection Humanize, and image vision, all served by a FastAPI backend on Google Cloud Run with Neon Postgres for storage.

16

AI models

5

Subscription tiers

22

Postgres tables

10K+

Active users

Client

AI Colosseum

Industry

AI / SaaS Platform

Services

Full-stack development

The challenge

A fragmented AI landscape.

Users juggle multiple subscriptions, learn different interfaces, and manually compare outputs when they need the best answer. Enterprises need compliance, team features, and cost control, but existing solutions offer none of these.

Pain points identified

What's broken

Multiple AI subscriptions ($100+/month combined)
No way to compare model outputs side-by-side
AI-generated content flagged by detection tools
No collaboration features for teams
Context lost in long conversations

Goals defined

What we built

Single subscription for all major AI models
Multi-AI discussions for better answers
Humanize AI text to bypass detection
Enterprise-ready with team workspaces
Intelligent context management

Our solution

A unified AI orchestration platform.

We designed and built a full-stack SaaS platform from the ground up that orchestrates multiple AI providers through a single, unified interface.

Multi-Model AI Chat (16 Models)

Unified interface across 9 standard models (ChatGPT, Claude, Gemini, Mistral, Perplexity, Grok, DeepSeek, Llama, Cohere) plus 7 premium variants (GPT-4o, Claude 3.5 Sonnet, Gemini Premium, Mistral Premium, Perplexity sonar-pro, Llama 70B-Turbo, Cohere Command R+). Premium models never silently degrade.

Real-time WebSocket streaming with sticky model header
Image vision support across all providers
Multimodal file uploads — code, docs, images
Automatic failover with provider-specific cooldowns
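
As a rough illustration of failover with provider-specific cooldowns, the sketch below keeps a "blocked until" timestamp per provider and skips any provider still cooling down. The class name, the cooldown values, and the `allow_fallback` flag are all hypothetical. The flag is one possible reading of "premium models never silently degrade": a premium request fails loudly instead of being rerouted to another model.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


class ProviderError(Exception):
    """Raised by a provider call that should trigger failover."""


@dataclass
class FailoverRouter:
    # Hypothetical per-provider cooldowns in seconds, not the platform's real values.
    cooldowns: Dict[str, float]
    default_cooldown: float = 30.0
    clock: Callable[[], float] = time.monotonic
    _blocked_until: Dict[str, float] = field(default_factory=dict, init=False)

    def available(self, provider: str) -> bool:
        """A provider is usable once its cooldown window has passed."""
        return self.clock() >= self._blocked_until.get(provider, 0.0)

    def mark_failed(self, provider: str) -> None:
        """Start the provider's own cooldown after a failure."""
        wait = self.cooldowns.get(provider, self.default_cooldown)
        self._blocked_until[provider] = self.clock() + wait

    def call(
        self,
        prompt: str,
        candidates: List[str],
        backends: Dict[str, Callable[[str], str]],
        allow_fallback: bool = True,
    ) -> Tuple[str, str]:
        """Try candidates in order; return (provider_name, reply)."""
        # With allow_fallback=False only the first candidate is tried,
        # so a premium request is never quietly served by another model.
        to_try = candidates if allow_fallback else candidates[:1]
        for name in to_try:
            if not self.available(name):
                continue
            try:
                return name, backends[name](prompt)
            except ProviderError:
                self.mark_failed(name)
        raise ProviderError("no provider available for this request")
```

Injecting the clock keeps the cooldown logic testable without sleeping; in production the default `time.monotonic` would be used.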

Round Table + Workflows

Round Table runs models in parallel or sequence with consensus voting; tier-scaled (Pro 3×3, Colosseum 5×5). Workflows orchestrates multi-agent pipelines (sequential, parallel, conditional) with a public marketplace and templates.

Round Table presets: Code Review Panel, Essay Editor, Debate Club
Workflows marketplace with trending algorithms
Dynamic credit calculation by token usage
Team collaboration on shared workflows
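
The parallel Round Table mode can be pictured as a fan-out followed by a vote. The sketch below is a simplification under stated assumptions: each model is a hypothetical async callable, and "consensus" is reduced to a plain majority vote over normalized answers, which is almost certainly cruder than the platform's actual voting logic.

```python
import asyncio
from collections import Counter
from typing import Awaitable, Callable, Dict, Tuple

# Hypothetical model interface: an async function from prompt to answer.
ModelFn = Callable[[str], Awaitable[str]]


async def round_table(prompt: str, models: Dict[str, ModelFn]) -> Tuple[str, Dict[str, str]]:
    """Query all models concurrently and return (winning_answer, answers_by_model)."""
    names = list(models)
    # Parallel mode: every model receives the prompt at the same time.
    answers = await asyncio.gather(*(models[name](prompt) for name in names))
    by_model = dict(zip(names, answers))
    # Majority vote over normalized answers; ties go to the answer seen first.
    votes = Counter(answer.strip().lower() for answer in answers)
    winner, _count = votes.most_common(1)[0]
    return winner, by_model
```

A sequential mode would instead await each model in turn and pass earlier answers into later prompts.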

Humanize (Anti-Detection)

Production-grade anti-detection: GPTZero scoring + backtranslation + post-processing. Single-pass with Refine Again button. Tier-restricted: 5/month for Starter, unlimited for Pro+.

GPTZero API integration for live AI-detection scoring
Grade-based transformations (A+, A, B+, B, C, D)
4 tone options: Academic, Professional, Casual, Creative
Vision surcharge for image-input transformations
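
The tier restriction on Humanize (5 runs per month on Starter, unlimited on Pro and above) amounts to a small quota check. The sketch below is illustrative only: the function names are hypothetical, and because the text does not state a limit for the Free tier, any tier missing from the table is treated as having no access, which is an assumption.

```python
from typing import Dict, Optional

# Monthly Humanize limits from the tier description; None means unlimited.
HUMANIZE_MONTHLY_LIMIT: Dict[str, Optional[int]] = {
    "starter": 5,
    "pro": None,
    "colosseum": None,
    "enterprise": None,
}


def remaining_humanize_runs(tier: str, used_this_month: int) -> Optional[int]:
    """Return runs left this month, or None when the tier is unlimited."""
    key = tier.lower()
    if key not in HUMANIZE_MONTHLY_LIMIT:
        # Assumption: tiers not listed above (e.g. Free) get no Humanize runs.
        return 0
    limit = HUMANIZE_MONTHLY_LIMIT[key]
    if limit is None:
        return None
    return max(limit - used_this_month, 0)


def can_humanize(tier: str, used_this_month: int) -> bool:
    """True when the user may start another Humanize run."""
    remaining = remaining_humanize_runs(tier, used_this_month)
    return remaining is None or remaining > 0
```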

5-Tier Subscription System

Stripe-powered billing across Free (100 credits), Starter ($9.99 / 500), Pro (2,000), Colosseum (10,000), and Enterprise (unlimited). Model-specific credit multipliers, automatic renewal, and team-based access elevation.

5 tiers; Starter ($9.99) added below Pro in April 2026
Team-based model access elevation (members inherit owner perks)
Credit pooling across team members
Token-based team invitations with status tracking
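
To make the credit model concrete, here is a minimal sketch of token-based charging with model-specific multipliers. The tier allotments come from the tier description above; the multiplier values, the per-1K-token base rate, and the model keys are hypothetical placeholders, not the platform's real pricing.

```python
import math
from typing import Dict, Optional

# Credit allotments per tier; None stands for unlimited (Enterprise).
TIER_CREDITS: Dict[str, Optional[int]] = {
    "free": 100,
    "starter": 500,
    "pro": 2000,
    "colosseum": 10000,
    "enterprise": None,
}

# Hypothetical values for illustration only.
CREDITS_PER_1K_TOKENS = 1.0
MODEL_MULTIPLIER: Dict[str, float] = {
    "standard-model": 1.0,
    "premium-model": 3.0,
}


def credit_cost(model: str, tokens: int) -> int:
    """Credits for one request, scaled by token usage and rounded up."""
    return math.ceil(tokens / 1000 * CREDITS_PER_1K_TOKENS * MODEL_MULTIPLIER[model])


def charge(balance: Optional[int], model: str, tokens: int) -> Optional[int]:
    """Return the new balance; an unlimited balance (None) is never reduced."""
    if balance is None:
        return None
    cost = credit_cost(model, tokens)
    if cost > balance:
        raise ValueError("insufficient credits")
    return balance - cost
```

With credit pooling, `balance` would be the team's shared pool and not an individual allotment.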

Technical architecture

Stack and infrastructure.

Frontend stack

Next.js 15.5 with App Router & Turbopack
React 19.1 with concurrent features
TypeScript strict mode
Tailwind CSS 3.4
React Virtuoso 4 for 10K+ message virtualization
Sticky model header, view modes (Full / Compact / Tabbed)

Backend stack

FastAPI (Python) with async/await
WebSocket for real-time streaming
Neon Postgres via asyncpg (migrated from Firestore April 2026)
Google Cloud Run for serverless hosting
Stripe for 5-tier subscriptions and webhook ingestion
GPTZero API for live AI-detection scoring
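
A streaming reply over a WebSocket can be sketched as a loop that forwards each chunk as soon as the model produces it. The function below depends only on an object with an async `send_json` method, which FastAPI's `WebSocket` provides. The message shape (`type`, `model`, `text`) is a hypothetical wire format, not the platform's actual protocol.

```python
from typing import Any, AsyncIterator, List, Protocol


class JsonSocket(Protocol):
    """Anything with an async send_json method, e.g. a FastAPI WebSocket."""

    async def send_json(self, data: Any) -> None: ...


async def stream_reply(ws: JsonSocket, model: str, chunks: AsyncIterator[str]) -> str:
    """Forward model output chunk by chunk, then signal completion."""
    parts: List[str] = []
    async for chunk in chunks:
        parts.append(chunk)
        # Each delta carries the model name so the client can label the stream.
        await ws.send_json({"type": "delta", "model": model, "text": chunk})
    await ws.send_json({"type": "done", "model": model})
    # The full text is returned so the caller can persist it.
    return "".join(parts)
```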

Database migration · 2026-04

Firestore → Neon Postgres.

In April 2026 we completed a full migration from Firestore to Neon Postgres with asyncpg. The 22-table relational schema gives stronger ACID guarantees, cleaner team / credit / subscription joins, and serverless pricing aligned with our Cloud Run + Vercel deployment topology. In-memory conversation caching was removed in favor of direct DB queries to avoid staleness across multi-instance deployments.
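
Reading conversations straight from Postgres on every request, with no per-instance cache, might look like the sketch below. It relies only on an async `fetch(query, *args)` method, which asyncpg connections and pools both expose, along with asyncpg's `$1`-style placeholders. The table and column names are hypothetical and not taken from the actual 22-table schema.

```python
from typing import Any, Dict, List, Protocol, Sequence


class Fetcher(Protocol):
    """Matches the fetch method of an asyncpg connection or pool."""

    async def fetch(self, query: str, *args: Any) -> Sequence[Any]: ...


# Hypothetical schema: a messages table keyed by conversation_id.
RECENT_MESSAGES_SQL = """
    SELECT role, content
    FROM messages
    WHERE conversation_id = $1
    ORDER BY created_at DESC
    LIMIT $2
"""


async def load_recent_messages(
    db: Fetcher, conversation_id: str, limit: int = 50
) -> List[Dict[str, str]]:
    """Load the latest messages from the database on every call (no cache)."""
    rows = await db.fetch(RECENT_MESSAGES_SQL, conversation_id, limit)
    # Rows arrive newest first; reverse them into chronological order.
    return [{"role": row["role"], "content": row["content"]} for row in reversed(rows)]
```

Because every Cloud Run instance queries the same database, no instance can serve a conversation that another instance has since updated.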

Technical highlights

Built for scale.

Responsive design

Fully responsive interface that works on desktop, tablet, and mobile.

Security first

Strict CSP, JWT auth, Google OAuth, and data encrypted at rest and in transit.

Performance optimized

Turbopack builds, virtualized lists for 10K+ messages, optimistic UI updates.

Built to scale

Serverless Cloud Run auto-scales with demand, and Postgres handles millions of rows.

The results

Real impact, both sides.

For users

Single subscription replaces 5+ AI services
60% faster research with multi-AI collaboration
95%+ success rate bypassing AI detection
Unlimited conversation history & organization

For business

Recurring SaaS revenue model
Enterprise pipeline for large deals
Low operational costs with serverless
Scalable architecture handles 10x growth

Built with

Technology stack.

Next.js 15.5
React 19
TypeScript
Tailwind 3.4
FastAPI
Python
Google Cloud Run
Neon Postgres
asyncpg
Stripe
GPTZero
WebSocket Streaming

Try it live

Want to try AI Colosseum?

Compare 16 AI models in one platform with Round Table consensus, multi-agent Workflows, and Humanize. Or contact us to discuss building a similar platform for your business.

Visit AI Colosseum
Start a project