0%
PRAXIUM LABS

Namaste! 🇳🇵

You found our hidden gem! Something incredible is brewing in the heart of the Himalayas. We might have something special here for you soon.

Stay curious. Jay Nepal!

Share

Building AI Chatbots for Nepali Customer Support (2026 Engineering Guide)

Building AI Chatbots for Nepali Customer Support (2026 Engineering Guide)

TL;DR. A production AI chatbot for a Nepali business needs three things: bilingual handling (Devanagari + Romanised Nepali + English code-switching), RAG over your own knowledge base, and human-handoff for anything it cannot confidently answer. Build cost runs NPR 100,000–250,000; ongoing API costs are NPR 5,000–25,000/month for SME volumes. The wins: 50–70% of inbound support tickets answered without human involvement, response times under 5 seconds 24/7.

This is the Praxium Labs view from real engagements with Nepali businesses on the ground. Customer support is the most predictable AI use case for Nepali SMEs — high volume, repetitive questions, available training data. The hard part is not the AI; it is making the bot bilingual, accurate against your real product data, and graceful when it does not know.

The "Nepali chatbot" requirement, decoded

Real Nepali support conversations are not pure Nepali. They are Devanagari, Romanised Nepali ("Kya kura cha hajur"), English, and a mid-sentence mix of all three. Any chatbot that only handles one of these fails 30 seconds into the first conversation. The bot needs to (a) detect the user's preferred form, (b) reply in matching form, (c) ideally maintain that preference across the conversation.

The architecture we ship

Six components. The order matters.

  • Channel adapter: WhatsApp, web widget, Telegram, Facebook Messenger — all normalised into a common event format
  • Language detector: identifies Devanagari / Romanised Nepali / English from the message
  • RAG retriever: embeds the user message, finds the top 5 relevant chunks from your knowledge base (product docs, FAQ, policy)
  • LLM call: system prompt + retrieved context + conversation history → response (Claude or GPT-4 class model)
  • Safety layer: filters for hallucination, off-topic, prompt injection
  • Handoff layer: if confidence is low, escalate to a human agent (in-app, WhatsApp, email)

Why RAG (not fine-tuning)

Fine-tuning a model on your data takes weeks and gets stale immediately. RAG (Retrieval-Augmented Generation) embeds your knowledge base once, then retrieves and injects the most relevant chunks into every conversation. New product? Add a document. Updated policy? Re-embed one file. We have not fine-tuned a customer chatbot model in three years; RAG is good enough for 99% of Nepali support use cases. See our RAG implementation guide.

Which model: GPT-4o, Claude, Gemini, or local?

For Nepali language quality: Claude (3.5 Sonnet and later) and GPT-4o are roughly tied for Devanagari fluency and the best on Romanised Nepali. Gemini Pro lags slightly on code-switched text. Open-source models (Llama 3.1 70B, Qwen 2.5) are usable but require either expensive GPU hosting or a hosted provider (Together, Groq). For most Nepali SMEs we default to Claude or GPT-4o on cost-per-conversation grounds. Full cost breakdown: ChatGPT API pricing in NPR.

Safety and hallucination control

AI chatbots fail safely or fail dangerously. The pattern that fails safely:

  • Grounding: system prompt instructs the model to answer only from retrieved context; if context is silent, say "I don't know — let me get a human"
  • Confidence threshold: below a similarity score, escalate to human (do not guess)
  • Forbidden topics: never quote prices, refund amounts, or policy without a verified source
  • Prompt-injection defence: strip / escape user input that looks like instructions ("ignore previous prompt...")

Channels for Nepal

WhatsApp drives 70–80% of customer chatbot traffic in our deployments — it is where Nepalis actually message businesses. See our WhatsApp Business setup guide. In-page widget (your website) is a distant second, mostly for first-time visitors. Telegram matters for tech-forward niches (crypto, gaming). Viber is fading but still relevant for older customer bases.

Costs to budget (NPR)

  • Build (Praxium Labs): NPR 100,000 starter / 250,000 advanced / 500,000+ enterprise
  • LLM API: NPR 5,000–25,000 / month for SME volumes (~5,000–30,000 conversations)
  • Vector database (Pinecone / Qdrant / pgvector): NPR 0–5,000 / month
  • Hosting (VPS for orchestrator): NPR 1,500–3,000 / month
  • WhatsApp messaging: NPR 1.4–8.5 per conversation (see pricing detail)

Frequently asked questions

Can a chatbot really handle 70% of Nepali support inquiries?

For most product-focused businesses: yes, if your knowledge base is good and you set up RAG correctly. Categories where we consistently hit 60–80% deflection: e-commerce (order status, returns, sizing), edtech (course questions, schedule), fintech (account questions, transaction help). Categories where deflection is harder: legal advice, complex troubleshooting, anything emotional (complaints).

Does the bot reply in pure Devanagari or Romanised Nepali?

It mirrors the user. If the user writes in Devanagari, the bot replies in Devanagari. Romanised Nepali user gets Romanised Nepali back. Code-switched users (Nepali + English mid-sentence) get code-switched responses. The system prompt explicitly instructs the model to maintain user-preferred form.

How long does a build take?

A focused MVP (single channel, single knowledge base, ~50 FAQ topics): 2–3 weeks. Production-grade across WhatsApp + web with handoff and analytics: 6–8 weeks. We always start small and ship; expansion happens after launch.

What's the failure mode I should worry about?

Hallucination in confident-sounding language. A bot saying "Refunds processed in 3 days" when policy is 7 days creates churn and trust problems. Mitigation: never let the bot make up specifics — every numeric answer must come from a retrieved source, otherwise escalate to human.

Can the bot integrate with our CRM (Zoho, Bitrix24, HubSpot)?

Yes — n8n is the integration layer between the bot and the CRM. Bot extracts intent ("customer wants to return order #12345"), n8n looks up the order in your ERP, replies with the return policy, and creates a CRM activity automatically.

What about data privacy?

Anthropic and OpenAI do not train on API traffic by default (only on opt-in plans). For sensitive sectors (banking, healthcare), use the enterprise tier or self-host an open-source model. We outline the threat model in our FinTech compliance post.

Who can build this in Nepal?

Praxium Labs — Nepal's AI and automation consultancy, based in Lalitpur — designs and builds the systems described in this guide for Nepali businesses and for international teams hiring from Nepal. Start a project or see all services.