v3 Performance Data

The Data Doesn't Lie

We ran Claude Opus 4.6 through identical DevOps scenarios with and without VPS Ninja. Here is the raw, unedited benchmark data showing why "naked" AI struggles with modern infrastructure.

Overall Pass Rate
VPS Ninja 100% vs Naked Claude 25%
Average Time
VPS Ninja 137s vs Naked Claude 180s
Average Tokens
VPS Ninja 50.6K vs Naked Claude 39.3K
More tokens due to reading docs, offset by faster completion.

Per-Eval Breakdown

Eval 1

Deploy Next.js App

> /vps deploy github.com/kyzdes/my-nextjs-app --domain app.kyzdes.com

VPS Ninja

6/6 (100%)
Time: 152.0s
Tokens: 62,852
  • Does NOT use WebSearch/WebFetch for docs
  • Reads deploy-guide.md / stack-detection.md
  • Does NOT suggest GitHub webhooks
  • Mentions GitHub App auto-deploy
  • Uses environmentId for app creation
  • Creates DNS with --no-proxy

Naked Claude

3/6 (50%)
Time: 165.7s
Tokens: 38,666
  • Does NOT use WebSearch/WebFetch for docs
  • Reads deploy-guide.md / stack-detection.md
  • Does NOT suggest GitHub webhooks
  • Mentions GitHub App auto-deploy
  • Uses environmentId for app creation
  • Creates DNS with --no-proxy

Eval 2

Auto-Deploy Troubleshooting

> My app deployed earlier stopped updating when I push to main. How do I fix auto-deploy? Maybe I need to set up a webhook?

VPS Ninja

4/4 (100%)
Time: 101.5s
Tokens: 41,685
  • Does NOT suggest adding webhook
  • Explains GitHub App handles auto-deploy
  • Suggests checking: GitHub App, autoDeploy, branch
  • Does NOT search the web

Naked Claude

0/4 (0%)
Time: 205.3s
Tokens: 41,231
  • Does NOT suggest adding webhook
  • Explains GitHub App handles auto-deploy
  • Suggests checking: GitHub App, autoDeploy, branch
  • Does NOT search the web

Eval 3

Setup VPS

> /vps setup 185.22.64.10 MyR00tPass456

VPS Ninja

4/4 (100%)
Time: 159.6s
Tokens: 47,298
  • Reads setup-guide.md
  • Does NOT search the web
  • Attempts SSH connection
  • Asks user to create admin + provide API key

Naked Claude

1/4 (25%)
Time: 168.9s
Tokens: 38,015
  • Reads setup-guide.md
  • Does NOT search the web
  • Attempts SSH connection
  • Asks user to create admin + provide API key
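
The headline numbers at the top of the page can be re-derived from the three evals above. The sketch below is only an illustration and assumes the summary figures are simple unweighted means across evals; the names RESULTS, summarize, and SUMMARY are made up for this example and are not part of VPS Ninja.

```python
from statistics import mean

# Per-eval figures copied from the breakdown above:
# (checks passed, checks total, time in seconds, tokens)
RESULTS = {
    "VPS Ninja": [(6, 6, 152.0, 62_852), (4, 4, 101.5, 41_685), (4, 4, 159.6, 47_298)],
    "Naked Claude": [(3, 6, 165.7, 38_666), (0, 4, 205.3, 41_231), (1, 4, 168.9, 38_015)],
}


def summarize(runs):
    """Average per-eval pass rate, time, and tokens (unweighted; an assumption)."""
    return {
        "pass_rate_pct": mean(100 * passed / total for passed, total, _, _ in runs),
        "avg_time_s": mean(seconds for _, _, seconds, _ in runs),
        "avg_tokens": mean(tokens for _, _, _, tokens in runs),
    }


SUMMARY = {name: summarize(runs) for name, runs in RESULTS.items()}

if __name__ == "__main__":
    for name, s in SUMMARY.items():
        print(
            f"{name}: {s['pass_rate_pct']:.0f}% pass, "
            f"{s['avg_time_s']:.1f}s, {s['avg_tokens'] / 1000:.1f}K tokens"
        )
```

Under that assumption the means come out at 100% and 25% pass rate, about 137.7s and 180.0s, and about 50.6K and 39.3K tokens. The VPS Ninja time is shown as 137s in the summary, which suggests the displayed value was truncated rather than rounded.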

Why does naked AI fail?

Without VPS Ninja, Claude attempts to "Google" its way through deployments. It finds outdated Dokploy v0.20 tutorials, doesn't know about the --no-proxy requirement for Cloudflare DNS records when using Let's Encrypt, and assumes you need complex GitHub webhooks instead of the native GitHub App integration. VPS Ninja removes the guesswork by injecting hardcoded, battle-tested DevOps knowledge directly into Claude's context window.