Short answer: CheapAI gives you OpenAI-compatible access to Gemini 1.5 Pro and Gemini 1.5 Flash at up to 65% below Google's official per-token pricing. Change one base URL in any OpenAI SDK; no Google Cloud account or Google API key is needed. You get native multimodal input and up to 2M tokens of context on Gemini 1.5 Pro. Keys are delivered within 24 hours and paid for with crypto.
Gemini API pricing — CheapAI vs Google official
Official Google AI Studio prices sourced from ai.google.dev/pricing. Checked March 2026. CheapAI prices are fixed per the universal token pack — see pricing page for exact figures.
| Model | Official Input | Official Output | CheapAI Discount |
|---|---|---|---|
| Gemini 1.5 Pro | $1.25/1M | $5.00/1M | Up to 65% off |
| Gemini 1.5 Flash | $0.075/1M | $0.30/1M | Up to 65% off |
Official rates are Google AI Studio prices for prompts up to 128k tokens, as of March 2026. CheapAI discounts apply via volume aggregation; see the pricing page for exact plan figures.
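To see what the table means for a real workload, the arithmetic can be sketched as below. This is a minimal illustration, not CheapAI's billing logic: the rates are copied from the table above, the helper names (`official_cost`, `best_case_cost`) are hypothetical, and the 65% figure is treated as a best case because the page only promises "up to" that discount.

```python
# Official Google AI Studio rates from the table above, in USD per 1M tokens
# (prompts up to 128k tokens).
OFFICIAL_RATES = {
    "Gemini 1.5 Pro": {"input": 1.25, "output": 5.00},
    "Gemini 1.5 Flash": {"input": 0.075, "output": 0.30},
}

# Assumption: 65% is the maximum discount; your actual pack may give less.
MAX_DISCOUNT = 0.65


def official_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD at Google's official per-token rates."""
    rates = OFFICIAL_RATES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


def best_case_cost(model: str, input_tokens: int, output_tokens: int,
                   discount: float = MAX_DISCOUNT) -> float:
    """Lower bound on cost if the full discount applied to every token."""
    return official_cost(model, input_tokens, output_tokens) * (1 - discount)


# Example workload: 10M input tokens and 2M output tokens on Gemini 1.5 Pro.
print(f"official:  ${official_cost('Gemini 1.5 Pro', 10_000_000, 2_000_000):.3f}")
print(f"best case: ${best_case_cost('Gemini 1.5 Pro', 10_000_000, 2_000_000):.3f}")
```

Treat the second number as a floor rather than a quote; the pricing page has the exact pack figures.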
Who is this for?
Gemini 1.5 Pro handles text, images, audio, and video natively — no separate vision model needed. Cheaper at scale than o1-preview for image-heavy workloads.
2M token context on Gemini 1.5 Pro — sufficient to process an entire codebase, a full research paper library, or a large customer support knowledge base in one call.
Gemini 1.5 Flash combined with CheapAI's discount makes classification, summarisation, and extraction tasks viable at a per-token cost below that of almost any other frontier model.
Native Gemini API billing requires a Google Cloud project. With CheapAI you get Gemini access with no GCP setup and no credit card; you pay with crypto.
Gemini 1.5 Pro vs Gemini 1.5 Flash — which to use?
Gemini 1.5 Pro — use when:
- Your input requires visual reasoning
- You need very long context (up to 2M tokens)
- Task quality is the primary concern
- You send complex or multi-part instructions
- You need audio/video understanding
Gemini 1.5 Flash — use when:
- Volume is high (>1M calls/day)
- Tasks are simple: classify, summarise, extract
- Latency is more important than depth
- You want the absolute lowest cost per call
- Chatbot or autocomplete-style workloads
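If you route requests in code, the two checklists above can be folded into a small helper. The sketch below is one possible encoding, not an official rule: `pick_model` and its parameters are hypothetical names, the model IDs are the ones this page lists in its FAQ, and the 1M-token threshold for "long context" is an assumption you should adjust to your own workload.

```python
# Model IDs as listed on this page for Gemini 1.5 Pro and Gemini 1.5 Flash.
PRO = "google/gemini-3-pro-preview"
FLASH = "google/gemini-3-flash"

# Assumption: prompts above this size are treated as "long context" and sent to Pro.
LONG_CONTEXT_THRESHOLD = 1_000_000


def pick_model(*, needs_media: bool = False, context_tokens: int = 0,
               quality_first: bool = False,
               complex_instructions: bool = False) -> str:
    """Return Pro if any Pro criterion from the checklist applies, else Flash."""
    if (needs_media                      # visual reasoning, audio, or video
            or context_tokens > LONG_CONTEXT_THRESHOLD
            or quality_first
            or complex_instructions):
        return PRO
    # Default: simple, high-volume, latency- or cost-sensitive work.
    return FLASH


# A short classification prompt falls through to Flash.
print(pick_model(context_tokens=2_000))
# A video-understanding request goes to Pro.
print(pick_model(needs_media=True))
```

Defaulting to Flash keeps the cheap path as the common case, so a request only costs more when it explicitly asks for a Pro capability.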
For a full comparison with Claude and GPT costs, see the models directory.
When to choose Gemini — and when not to
✓ Choose Gemini when…
- You need multimodal (image/audio/video)
- You need 1M–2M token context
- Flash gives better price/quality for bulk tasks
- You already use the OpenAI SDK and only want to swap the base URL
✗ Consider Claude or GPT instead when…
- You need Cursor AI or Claude Code integration
- You need best-in-class code generation
- Your task needs Anthropic’s safety tuning
- You rely on function-calling-heavy agentic flows
Quick setup
Python (OpenAI SDK)
from openai import OpenAI

# Point the standard OpenAI client at the CheapAI endpoint.
client = OpenAI(
    api_key="your-cheapai-key",
    base_url="https://cheapai-netifly-app.up.railway.app",
)

response = client.chat.completions.create(
    model="google/gemini-3-pro-preview",
    messages=[{"role": "user", "content": "Analyze this document..."}],
)
print(response.choices[0].message.content)
cURL
curl https://cheapai-netifly-app.up.railway.app/chat/completions \
  -H "Authorization: Bearer your-cheapai-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "google/gemini-3-pro-preview", "messages": [{"role": "user", "content": "Hello!"}]}'
Open WebUI
In Open WebUI Settings → Connections → OpenAI-Compatible, set the API Base URL to https://cheapai-netifly-app.up.railway.app and paste your CheapAI key. Select google/gemini-3-pro-preview or google/gemini-3-flash from the model dropdown. See the OpenAI-compatible API guide for full tool setup instructions.
Limitations & tradeoffs
- Token-pack billing, not per-call: Gemini access is available through universal token packs. Large multimodal inputs (images, audio) consume tokens faster — plan accordingly.
- Not Google Cloud native: This is a proxy endpoint. You do not get Google Cloud SLAs, Data Loss Prevention, or VPC Service Controls.
- Crypto payment only: Bitcoin, Ethereum, USDT, USDC, and others — no credit card or PayPal.
- Not for regulated use cases: Not HIPAA- or SOC 2-certified. Avoid sensitive personal data.
- Delivery within 24 hours: Usually faster, but not instant. The 24-hour window starts once your payment is confirmed on the blockchain.
Gemini API FAQ
Is this the real Gemini 1.5 Pro?
Yes. CheapAI proxies your request to the real Google Gemini models. There is no custom or modified version — you get identical outputs to calling the Google AI API directly.
Do I need a Google Cloud account?
No. CheapAI provides an API key and base URL. No Google account, no GCP project, no billing setup required.
Can I send images to Gemini 1.5 Pro through CheapAI?
Yes. Gemini 1.5 Pro's multimodal capability is available. Send images as base64 or image URLs using the standard OpenAI vision message format — the proxy passes them through unchanged.
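As a rough illustration of the base64 route, the helper below builds a user message in the standard OpenAI vision format (a `text` part plus an `image_url` part carrying a data URL). The function name `image_message` and the default MIME type are assumptions made for this sketch; only the message shape is the standard format.

```python
import base64


def image_message(prompt: str, image_bytes: bytes,
                  mime_type: str = "image/png") -> dict:
    """Build one user message holding a text prompt and an inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                # Data URL: the image travels inside the request body.
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


# Placeholder bytes stand in for a real file read with open(path, "rb").read().
message = image_message("Describe this chart.", b"abc")
print(message["content"][1]["image_url"]["url"])
```

Pass the result as `messages=[message]` in the same `client.chat.completions.create(...)` call shown in the quick setup. Keep in mind that large images consume tokens from your pack faster than plain text.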
Does streaming work with Gemini?
Yes. Set stream: true in your request and you will receive server-sent events in real time, identical to the OpenAI streaming format.
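If you consume the stream over raw HTTP rather than through an SDK, each event arrives as a `data: {...}` line and the stream ends with `data: [DONE]`. The sketch below shows one way to pull the text out of those lines; `iter_stream_text` is a hypothetical helper, and the sample lines are hand-written stand-ins for a real response, not captured output.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample events, for illustration only.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

With the OpenAI Python SDK you normally do not need this: passing `stream=True` to `client.chat.completions.create(...)` returns an iterator of chunks, and the text is in `chunk.choices[0].delta.content`.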
How does billing work for Gemini? Per token or flat plan?
Gemini access through CheapAI uses the universal token pack model. You buy a credit pack and use it across any supported model — including Gemini 1.5 Pro, Gemini 1.5 Flash, Claude, and GPT-4o. See the token pack pricing.
What if my key stops working?
Every plan is covered by a full service guarantee. If your key stops working during your plan period, contact @cheapai1sell on Telegram and we will replace it immediately or issue a full refund.
What model ID should I use in my code?
Use google/gemini-3-pro-preview for Gemini 1.5 Pro or google/gemini-3-flash for Gemini 1.5 Flash. See the full list of model IDs in the models directory.
Pricing data sourced from ai.google.dev/pricing. Official rates checked March 2026. Page maintained by the CheapAI team.
Access Gemini at up to 65% off
Choose a universal token pack. Pay with crypto. Get your key within 24 hours. Full service guarantee.
Get Cheap Gemini API Access