Aphrodite

App in the BluixApps catalog

What it is

Aphrodite Engine is a vLLM fork by Pygmalion AI that adds advanced sampling methods (top-a, min-p, Mirostat, smoothing factor), broader quantization support (EXL2, GGUF, AQLM, SqueezeLLM), and KoboldAI API compatibility. It is designed for roleplay, creative writing, and exploratory use cases that need finer sampling control than vanilla vLLM provides.

What it's for

Creative writing pipelines: advanced samplers for more varied output
Roleplay AI: preserving character voice across long conversations
GGUF / EXL2 quantization: more formats than vanilla vLLM supports
Triple-API compatibility: OpenAI, KoboldAI, and native endpoints
Karras schedulers: alternative sampling distributions
Mirostat / smoothing: target-perplexity sampling and logit smoothing

Who it's for

AI roleplay platforms (character.ai-style)
Interactive fiction creators needing varied LLM output
Pygmalion AI community members and their products
Power users wanting more sampler control than vLLM
Researchers exploring novel sampling methods

Why teams pick Aphrodite over alternatives

AGPL-3.0 — fully open
Advanced samplers not available in vanilla vLLM:
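- min_p (drops tokens below a fraction of the top token's probability; a modern alternative to top_p)
- top_a (truncation threshold scales with the square of the top token's probability)
- mirostat_tau / Mirostat (steers output toward a target perplexity)
- smoothing_factor (logit smoothing)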
GGUF + EXL2 quantization — broader than vLLM's GPTQ/AWQ
KoboldAI API — drop-in for SillyTavern, KoboldHorde, RisuAI
Pygmalion community models work natively

Integrations

OpenAI v1: /v1/chat/completions, /v1/completions
KoboldAI: /api/v1/generate (for SillyTavern, RisuAI, etc.)
Aphrodite native: /v1/internal/* for advanced samplers
Quantization: GGUF (llama.cpp), EXL2 (ExLlamaV2), AWQ, GPTQ, SqueezeLLM, Bitsandbytes, AQLM
Pair with: SillyTavern (canonical roleplay UI), Pygmalion-tuned models
Multi-GPU: --tensor-parallel-size N
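
As a rough sketch of how a client might call the OpenAI-compatible completions endpoint listed above, the snippet below builds a request with two of the extra sampler settings. It assumes the server listens on localhost port 2242 and that min_p and smoothing_factor are accepted as top-level JSON fields; the helper names are hypothetical, so check them against your deployment before relying on this.

```python
import json
import urllib.request

BASE_URL = "http://localhost:2242"  # assumption: local deployment on the default port


def build_completion_request(prompt, model, base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/completions."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": 200,
        "temperature": 0.8,
        # Assumption: extra sampler fields sit next to the standard OpenAI ones.
        "min_p": 0.05,
        "smoothing_factor": 0.3,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the first completion's text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


request = build_completion_request(
    "Once upon a time", "NousResearch/Meta-Llama-3.1-8B-Instruct"
)
# To call a running server: print(send(request))
```

Keeping request construction separate from sending makes it easy to inspect the payload before any network traffic happens.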

Notable users & community

1.5k+ GitHub stars
PygmalionAI community + commercial Pygmalion service
Used in roleplay AI platforms
Active development by Alpin + contributors
Featured in r/LocalLLaMA roleplay sub-communities

Tips & operations

GGUF for diverse hardware: runs on older consumer GPUs that lack newer hardware features
EXL2 for speed: typically one of the faster quantization formats, from the ExLlamaV2 lineage
Sampler combos for RP:
- min_p: 0.05, top_a: 0.0, temperature: 0.8, smoothing_factor: 0.3
Mirostat: samples toward a target perplexity; set mirostat_mode: 2, mirostat_tau: 5
Multi-shard: tensor parallelism works as in vLLM (--tensor-parallel-size N)
vs vLLM: same core engine, Aphrodite adds samplers + GGUF/EXL2
vs TGI: Aphrodite for RP/creative, TGI for HF integration
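
To make the Mirostat tip more concrete, here is a minimal sketch of one sampling step in the style of Mirostat v2. It is an illustration under stated assumptions, not Aphrodite's code: the function name is hypothetical, the learning rate eta of 0.1 is an assumed value, and only the tau of 5 comes from the tip above.

```python
import math
import random


def mirostat_v2_step(probs, mu, tau, eta, rng=random):
    """Run one Mirostat-v2-style step; return (token_index, updated_mu)."""
    # Keep tokens whose surprise (-log2 p, in bits) does not exceed mu.
    candidates = [
        (i, p) for i, p in enumerate(probs) if p > 0 and -math.log2(p) <= mu
    ]
    if not candidates:
        # Always keep at least the most likely token.
        best = max(range(len(probs)), key=lambda i: probs[i])
        candidates = [(best, probs[best])]

    # Renormalize the surviving tokens and sample one of them.
    total = sum(p for _, p in candidates)
    weights = [p / total for _, p in candidates]
    pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]

    # Move mu against the error between observed surprise and the target tau.
    observed = -math.log2(weights[pick])
    new_mu = mu - eta * (observed - tau)
    return candidates[pick][0], new_mu


tau, eta = 5.0, 0.1  # tau from the tip above; eta is an assumed value
token, mu = mirostat_v2_step([0.9, 0.05, 0.05], mu=1.0, tau=tau, eta=eta)
```

In a generation loop the returned mu is fed into the next step, so the cutoff loosens after predictable tokens and tightens after surprising ones. A common choice is to start mu at 2 * tau.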

What we ship in BluixApps

Docker (alpindale/aphrodite-engine:latest)
Default model: NousResearch/Meta-Llama-3.1-8B-Instruct (configurable)
Persistent volume: /opt/aphrodite/models (HF cache)
Port 2242 (Aphrodite default)
--launch-kobold-api for SillyTavern/RisuAI compatibility
Install report at /root/bluixapps/aphrodite.txt
Sample API call with advanced samplers (min_p, smoothing_factor)
Quantization format guide
HF_TOKEN environment variable
GPU pre-flight check via bluixapps_ensure_nvidia_runtime
Backup hook covers model cache
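
With --launch-kobold-api enabled, KoboldAI-style clients talk to /api/v1/generate. The sketch below shows roughly what such a call looks like, assuming the Kobold routes are served on the same host and port 2242; the helper names are hypothetical, and the field names follow the usual KoboldAI request and response shape.

```python
import json
import urllib.request


def build_kobold_request(prompt, base_url="http://localhost:2242"):
    """Build (but do not send) a KoboldAI-style generate request."""
    payload = {
        "prompt": prompt,
        "max_length": 120,   # number of tokens to generate
        "temperature": 0.8,
        "min_p": 0.05,
    }
    return urllib.request.Request(
        f"{base_url}/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response_body):
    """Pull the generated text out of a KoboldAI-style response dict."""
    return response_body["results"][0]["text"]


kobold_request = build_kobold_request("The tavern door creaks open.")
# To call a running server:
# with urllib.request.urlopen(kobold_request, timeout=60) as resp:
#     print(extract_text(json.load(resp)))
```

Frontends such as SillyTavern build this request themselves, so a manual call like this is mainly useful as a smoke test after deployment.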

Read this app's deep dive on bluix.app ↗

Get this app — pick a BluixApps plan

All plans draw from the same catalog; they differ in tenant count, white-label availability, and support tier.

Tier	Tenants	Catalog	Support	White-label	Monthly
Stacks	1	19 curated stacks	Standard	-	$19/mo
Starter	10	Full catalog	Standard	+$15–25/mo	$49/mo
Pro	25	Full catalog	Priority bugfix	+$15–25/mo	$149/mo
Growth	100	Full catalog	Priority bugfix	+$15–25/mo	$349/mo
Scale	500	Full catalog	7-day window	+$15–25/mo	$799/mo
Enterprise	Unlimited	Full catalog	Priority 7-day	Bundled	$1,499/mo
