Building a Local Chatbot with TypeScript, Express, LangChain, and Ollama (Qwen)
In the world of AI-powered applications, large language models (LLMs) like OpenAI’s GPT models get most of the attention. However, running your own model locally offers several advantages: no API costs, no rate limits, and complete control over your data. In this article, we'll walk through building a local chatbot API in TypeScript using Express and LangChain, with Ollama serving a Qwen3 model.
The result will be a flexible, local-first chatbot you can integrate into any frontend or automation pipeline.
Section 1: Installing Dependencies and Initial Project Setup
We start by creating our Node.js project and installing the required dependencies.
Make sure you have the latest LTS version of Node and Yarn installed. I use nvm to manage this.
nvm install lts/*
npm install -g strip-json-comments-cli
npm install -g yarn
Now, create the directory for the chatbot and run yarn init
mkdir chatbot-express-example
cd chatbot-express-example
node --version > .nvmrc
yarn init -y
This will give you a package.json like:
{
  "name": "chatbot-express-example",
  "version": "0.1.0",
  "main": "index.js",
  "author": "Mark C Allen <mark@markcallen.com>",
  "license": "MIT"
}
Now set up Express, TypeScript, and the LangChain packages.
yarn add express
yarn add -D typescript @types/node @types/express tsx rimraf
yarn add langchain @langchain/core @langchain/community
And configure the tsconfig.json
npx tsc --init --rootDir src --outDir dist \
--esModuleInterop --target es2020 --module commonjs \
--verbatimModuleSyntax false --allowJs true --noImplicitAny true
# clean up tsconfig
cat tsconfig.json \
| strip-json-comments --no-whitespace \
| jq -r . > tsconfig.pretty.json && mv tsconfig.pretty.json tsconfig.json
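After the cleanup, tsconfig.json holds the options passed on the command line plus whatever defaults your TypeScript version's tsc --init adds (strict mode, for example), so it should look roughly like this:
{
  "compilerOptions": {
    "rootDir": "./src",
    "outDir": "./dist",
    "target": "es2020",
    "module": "commonjs",
    "esModuleInterop": true,
    "verbatimModuleSyntax": false,
    "allowJs": true,
    "noImplicitAny": true,
    "strict": true
  }
}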
Here’s what’s happening:
- express — Minimal web framework for building the API.
- typescript, tsx, @types/node, @types/express — TypeScript support and types.
- rimraf — Utility to clear the build folder.
- langchain, @langchain/core & @langchain/community — Abstractions for prompt management, model calls, and streaming with Ollama.
Section 2: Writing the Express API with LangChain
We now create our chatbot server at src/index.ts:
import express from "express";
import type { Request, Response } from "express";
import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";
import { RunnableSequence } from "@langchain/core/runnables";
import { ChatOllama } from "@langchain/community/chat_models/ollama";
import { HumanMessage, SystemMessage } from "@langchain/core/messages";

const app = express();
app.use(express.json());

// Ollama connection details, overridable via environment variables.
const OLLAMA_URL = process.env.OLLAMA_URL ?? "http://localhost:11434";
const MODEL = process.env.OLLAMA_MODEL ?? "qwen3:0.6b";

const llm = new ChatOllama({
  baseUrl: OLLAMA_URL,
  model: MODEL,
  temperature: 0.7,
});

// System prompt that sets the assistant's tone and scope.
const system = new SystemMessage(
  "You are everyday devops bot, a concise DevOps assistant. Answer directly, with examples when useful."
);

const prompt = ChatPromptTemplate.fromMessages([
  system,
  new MessagesPlaceholder("messages"),
]);

// Pipe the prompt into the model: prompt -> llm.
const chain = RunnableSequence.from([
  prompt,
  llm,
]);

// Single-response endpoint.
app.post("/chat", async (req: Request, res: Response) => {
  const { question } = req.body ?? {};
  if (!question) return res.status(400).json({ error: "Missing 'question'." });

  try {
    const aiMsg = await chain.invoke({
      messages: [new HumanMessage(question)],
    });

    // Model content can be a plain string or an array of content parts.
    const content =
      typeof aiMsg.content === "string"
        ? aiMsg.content
        : aiMsg.content.map((c: any) => c?.text ?? "").join("");

    res.json({ answer: content });
  } catch (err) {
    // Express 4 doesn't catch rejected promises from async handlers,
    // so handle failures (e.g. Ollama being unreachable) explicitly.
    console.error(err);
    res.status(500).json({ error: "Model call failed." });
  }
});

// Streaming endpoint: emits SSE-style `data:` events as tokens arrive.
app.post("/chat/stream", async (req: Request, res: Response) => {
  const { question } = req.body ?? {};
  if (!question) return res.status(400).json({ error: "Missing 'question'." });

  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");

  try {
    const stream = await chain.stream({
      messages: [new HumanMessage(question)],
    });

    for await (const chunk of stream) {
      const piece =
        typeof chunk.content === "string"
          ? chunk.content
          : Array.isArray(chunk.content)
            ? chunk.content.map((c: any) => c?.text ?? "").join("")
            : "";
      if (piece) res.write(`data: ${JSON.stringify(piece)}\n\n`);
    }
  } catch (err) {
    console.error(err);
  } finally {
    res.end();
  }
});

const port = Number(process.env.PORT ?? 3000);
app.listen(port, () => {
  console.log(`API listening on :${port}`);
});
What’s happening here:
- SystemMessage defines the assistant’s tone and scope.
- ChatPromptTemplate structures the input for the model.
- ChatOllama connects LangChain to the local Ollama server.
- /chat handles single-response interactions.
- /chat/stream handles streamed responses (a client-side sketch for consuming the stream follows below).
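Here's a rough sketch of how a client could consume that stream. The file name stream-client.ts and the streamChat helper are hypothetical, and it assumes Node 18+ for the built-in fetch, pointed at the port the server uses below:

// stream-client.ts - hypothetical consumer for /chat/stream (assumes Node 18+ fetch)
async function streamChat(question: string): Promise<void> {
  const res = await fetch("http://localhost:3000/chat/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question }),
  });
  if (!res.ok || !res.body) throw new Error(`Request failed: ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // The server writes `data: <JSON-encoded text>\n\n`, so split on blank lines.
    const events = buffer.split("\n\n");
    buffer = events.pop() ?? "";
    for (const event of events) {
      if (event.startsWith("data: ")) {
        process.stdout.write(JSON.parse(event.slice(6)));
      }
    }
  }
  process.stdout.write("\n");
}

streamChat("List three feature flag tips.").catch(console.error);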
To build and run locally, add scripts for dev, build, and start:
npm pkg set "scripts.dev"="tsx watch src/index.ts"
npm pkg set "scripts.build"="rimraf ./dist && tsc"
npm pkg set "scripts.start"="node dist/index.js"
Section 3: Ollama
It's best to run Ollama directly on the OS so it can make use of your GPUs. See Supercharge Your Local AI for more details on how to do this.
Start the server
ollama serve
Pull and run the model
ollama pull qwen3:0.6b
ollama run qwen3:0.6b
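Before hitting the API, it's worth confirming Ollama is reachable and the model is available (assuming the default port 11434, which matches OLLAMA_URL above):
curl http://localhost:11434/api/tags
This lists the locally installed models and should include qwen3:0.6b.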
Section 4: Building and Running
To build and start:
yarn build
yarn start
Test the chatbot:
curl -sS -X POST http://localhost:3000/chat -H "Content-Type: application/json" -d '{"question":"Give me one CI/CD best practice."}'
Streaming test:
curl -N -X POST http://localhost:3000/chat/stream -H "Content-Type: application/json" -d '{"question":"List three feature flag tips."}'
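Because the server JSON-encodes each fragment, the stream arrives as SSE-style data: lines separated by blank lines. The exact tokens depend on the model, but the shape looks roughly like:
data: "Use"

data: " short-lived"

data: " flags."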
Add everything to git:
git init
cat << EOF > .gitignore
.env
yarn-error.log
dist/
node_modules/
EOF
git add .
git commit -m "First checkin" -a
I've published this to GitHub at: https://github.com/markcallen/chatbot-express-example
Next Steps
Natural next steps are to add linting, Docker, and dev containers to streamline your development process.
Conclusion
We’ve built a local-first chatbot API that uses LangChain to manage prompts and connect to Ollama, running the Qwen model locally.
By combining TypeScript, Express, and LangChain, we gain:
- Cost-free inference (no API tokens)
- Total control over model updates and behaviour
- Easy extensibility for new endpoints, memory, and tools
You can now integrate this backend into a web UI, CLI, or automation workflow, keeping everything under your control while still leveraging powerful LLM capabilities.
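For instance, a small automation script could call the /chat endpoint directly. This is a hypothetical sketch (the file name ask.ts is made up) that assumes Node 18+ for the built-in fetch and the default port used above:

// ask.ts - hypothetical one-shot client for the /chat endpoint (assumes Node 18+ fetch)
async function ask(question: string): Promise<string> {
  const res = await fetch("http://localhost:3000/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question }),
  });
  if (!res.ok) throw new Error(`Request failed: ${res.status}`);
  const { answer } = (await res.json()) as { answer: string };
  return answer;
}

ask("Give me one CI/CD best practice.").then(console.log).catch(console.error);

Run it with npx tsx ask.ts while the server and Ollama are running.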