How to Run Ollama on VPS India — Host AI Models Without GPU

Mayank Sharma
June 23, 2026 · 7 min read

Direct answer

Run Ollama on a GigaNodes VPS in India, with plans starting at ₹800/mo. Cloud XS (4GB RAM) handles Phi-3 Mini. Cloud S (8GB RAM, ₹1,440/mo) runs Llama 3 8B and Mistral 7B without a GPU. Installation takes under 5 minutes with a single command.

Ollama makes it possible to run large language models like Llama 3, Mistral, and Phi-3 entirely on your own server — no API fees, no rate limits, no data leaving your infrastructure. The catch is that most laptops don’t have the RAM or uptime to run a model 24/7 for an app or API. That’s where a VPS comes in.

This guide covers exactly which Ollama models fit which VPS plan, the install commands, and how to expose Ollama as an API for your own apps.

Why Run Ollama on a VPS Instead of Your Local Machine

Running Ollama locally works for testing, but it stops working the moment you need the model available 24/7, accessible from multiple devices, or callable from a production app. A laptop sleeps, reboots, and loses its IP address. A VPS in India stays online continuously, gets a static IP, and sits close to users in Delhi NCR and across India for low latency.

Running Ollama on a GigaNodes VPS in India also means you control your data completely: prompts and outputs never leave your own server, which matters for businesses handling sensitive customer data under India’s data protection rules.

Ollama Model RAM Requirements vs GigaNodes Plans

| Model | Parameters | RAM Needed | GigaNodes Plan | Price |
|---|---|---|---|---|
| Phi-3 Mini | 3.8B | 3-4GB | Cloud XS | ₹800/mo |
| Llama 3 8B | 8B | 6-8GB | Cloud S | ₹1,440/mo |
| Mistral 7B | 7B | 6-8GB | Cloud S | ₹1,440/mo |
| Llama 2 13B | 13B | 14-16GB | Cloud M | ₹2,880/mo |
| Llama 3 70B | 70B | 48-64GB | Cloud XL | ₹11,520/mo |
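
To check which row of the table a given server falls into, a small helper along these lines may be useful. It is only a sketch: suggest_model and total_ram_mb are hypothetical names, the thresholds are the lower bounds of the RAM ranges in the table, and the model tags (phi3, mistral, llama2:13b, llama3:70b) are assumed choices from the Ollama library for each tier.

```shell
# suggest_model: hypothetical helper, not part of Ollama.
# Prints the largest model tier that fits in the given RAM (in MB).
# Thresholds are the lower bounds of the table's "RAM Needed" ranges (assumption).
suggest_model() (
    ram_mb=$1
    if   [ "$ram_mb" -ge 49152 ]; then echo "llama3:70b"   # 48GB and up
    elif [ "$ram_mb" -ge 14336 ]; then echo "llama2:13b"   # 14GB and up (assumed tag for the 13B row)
    elif [ "$ram_mb" -ge 6144 ];  then echo "mistral"      # 6GB and up
    elif [ "$ram_mb" -ge 3072 ];  then echo "phi3"         # 3GB and up
    else echo "none"
    fi
)

# total_ram_mb: total RAM in MB as reported by the Linux kernel.
total_ram_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Usage on the VPS:
#   suggest_model "$(total_ram_mb)"
```

Using the lower bound of each range is deliberate: a plan sold as 8GB typically reports slightly less than 8192 MB to the operating system, so an exact upper-bound check would reject it.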

Step by Step — Installing Ollama on Your VPS

These steps work on any Ubuntu 24.04 VPS, including every GigaNodes plan.

1. Connect to your VPS via SSH

ssh root@your-vps-ip

2. Install Ollama

curl -fsSL https://ollama.com/install.sh | sh

3. Start the Ollama service

systemctl start ollama
systemctl enable ollama

4. Pull a model

ollama pull llama3

5. Run the model

ollama run llama3

6. Optional — add a web UI with Docker

Open WebUI gives you a ChatGPT-style interface for your self-hosted models.

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway ghcr.io/open-webui/open-webui:main

Using the Ollama API

Once running, Ollama exposes a local REST API on port 11434 that any app can call.

curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt": "Explain VPS hosting in one sentence"
}'
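
By default /api/generate streams its answer as a series of JSON lines. For scripts it is often easier to ask for a single JSON object by sending "stream": false. The sketch below builds such a request body; json_escape and generate_payload are hypothetical helper names, and the escaping only covers backslashes and double quotes, so it assumes the prompt contains no newlines or other control characters.

```shell
# json_escape and generate_payload are hypothetical helpers, not part of Ollama.

# Escape backslashes and double quotes so the text is safe inside a JSON string.
# Assumption: the input has no newlines or other control characters.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build a request body for /api/generate with streaming turned off,
# so the server answers with one JSON object instead of a stream of lines.
# $1 = model name, $2 = prompt
generate_payload() {
    printf '{"model": "%s", "prompt": "%s", "stream": false}' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# Usage on the VPS (needs a running Ollama with the model already pulled):
#   curl -s http://localhost:11434/api/generate \
#        -d "$(generate_payload llama3 'Explain VPS hosting in one sentence')"
```

The generated text is returned in the "response" field of the JSON object.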

By default Ollama listens on localhost only. To call the API from another machine, add Environment="OLLAMA_HOST=0.0.0.0" under the [Service] section of a systemd drop-in file at /etc/systemd/system/ollama.service.d/override.conf, then run systemctl daemon-reload and restart the service. The Ollama API has no built-in authentication, so put Nginx with SSL in front of it before exposing it publicly.
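
The override file can be written by hand or with a short script. The following is a minimal sketch, assuming a systemd-based server and root access; write_ollama_override is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a scratch directory first.

```shell
# write_ollama_override: hypothetical helper, not part of Ollama.
# Writes a systemd drop-in that makes Ollama listen on all interfaces.
# $1 = drop-in directory (defaults to the path named in the text above).
write_ollama_override() {
    dir=${1:-/etc/systemd/system/ollama.service.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
EOF
}

# Usage on the VPS, as root:
#   write_ollama_override
#   systemctl daemon-reload
#   systemctl restart ollama
#   ss -ltn | grep 11434    # check which address the service is listening on
```

Until the reverse proxy is in place, it is worth keeping port 11434 closed in the firewall, since anyone who can reach it can use the models.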

Performance — AMD EPYC CPU vs GPU

CPU inference on AMD EPYC 7C13 is slower than a dedicated GPU but is fully usable for personal assistants, internal tools, and low-traffic APIs. Expect roughly 5-12 tokens per second on a 7B model on Cloud S: fast enough for chat-style interaction, though long generations take noticeably longer. For production workloads serving many concurrent users, a GPU-backed setup is recommended; for personal use, side projects, and small team tools, CPU inference on a GigaNodes VPS in India is more cost-effective than renting GPU instances.
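
To put the 5-12 tokens per second range in concrete terms, the arithmetic can be sketched as below. The helper names are hypothetical. The second function relies on the eval_count and eval_duration (nanoseconds) fields that Ollama reports in the final /api/generate response, which lets you measure the real rate on your own plan instead of relying on the estimate.

```shell
# generation_seconds and tokens_per_second are hypothetical helpers
# for back-of-envelope throughput math.

# Seconds needed to generate $1 tokens at $2 tokens per second.
generation_seconds() {
    awk -v tokens="$1" -v rate="$2" 'BEGIN { printf "%.1f", tokens / rate }'
}

# Tokens per second from the eval_count ($1) and eval_duration in
# nanoseconds ($2) fields of the final /api/generate response.
tokens_per_second() {
    awk -v count="$1" -v ns="$2" 'BEGIN { printf "%.1f", count / ns * 1e9 }'
}

# Examples:
#   generation_seconds 500 5      # prints 100.0 (a 500-token answer at the low end)
#   generation_seconds 500 12     # prints 41.7  (the same answer at the high end)
#   tokens_per_second 120 15000000000   # prints 8.0 (120 tokens in 15 seconds)
```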

Why GigaNodes for VPS India

AMD EPYC 7C13 · 141,108 IOPS (2.4x faster than DigitalOcean) · Cloudflare Magic Transit DDoS included free · UPI accepted · GST invoice · Yotta DC Noida · First Indian hosting company with Cloudflare Magic Transit · VPS from ₹400/mo

Frequently Asked Questions

Can I run Ollama on VPS India without a GPU?
Yes. Ollama runs on a CPU-only VPS. On GigaNodes Cloud S (8GB RAM, AMD EPYC 7C13, ₹1,440/mo), Llama 3 8B and Mistral 7B run without any GPU. Responses are slower than on a GPU, but the setup works for personal use and API access.

Which Ollama model runs on 8GB RAM VPS?
Llama 3 8B and Mistral 7B both run on 8GB RAM. GigaNodes Cloud S (₹1,440/mo) has 8GB DDR4 RAM on AMD EPYC 7C13, which is sufficient for these models with room for the OS and Docker.

What is the best VPS for Ollama in India?
GigaNodes is the best VPS for Ollama in India: AMD EPYC 7C13 with 141,108 IOPS for fast model loading, Cloudflare Magic Transit DDoS protection, and plans from ₹800/mo. Cloud S (8GB RAM, ₹1,440/mo) is the recommended plan for 7B models.

How do I access Ollama from outside my VPS?
By default Ollama listens on localhost only. To access it from outside, set OLLAMA_HOST=0.0.0.0 in a systemd override for the Ollama service and restart it. Then use Nginx as a reverse proxy with SSL via Certbot for secure public access.

Get the Best VPS Hosting in India

AMD EPYC 7C13 · Cloudflare Magic Transit · Yotta DC Noida · UPI accepted · GST invoice

⭐ 4.9/5 from 500+ reviews · No setup fee · Deploy in 60 seconds

Written by

Mayank Sharma

Part of the GigaNodes team, bringing you insights on game hosting and cloud infrastructure.