# ollama-d 0.4.0

D language bindings for the Ollama REST API, for seamless integration with local AI models.

To use this package, run the following command in your project's root directory:

```sh
dub add ollama-d
```

Manual usage: put the following dependency into your project's `dependencies` section (`dub.json`):

```json
"ollama-d": "~>0.4.0"
```
## Features

- Text generation with the native Ollama API
- Chat interactions with conversation history
- Embeddings, single and batch (`/api/embed`)
- Tool calling: pass function definitions to `chat()`
- Multimodal: base64 image input for vision models
- Structured output: JSON schema format enforcement
- Typed `OllamaOptions`: temperature, top_k, num_ctx, stop sequences, and more
- Model management: list, create, show, pull, push, copy, delete
- Running-models inspection (`/api/ps`)
- Configurable timeout settings
- Built-in unit test suite (`dub test`)
- OpenAI-compatible API endpoints (`/v1/…`)
- Agentic tool-calling loop: execute local functions and feed results back to the model
- Docker-based local development environment
- Zero external dependencies: only `std.net.curl` and `std.json`
## Prerequisites

- D compiler (v2.110.0 stable or compatible)
- Ollama server running locally (default: `http://127.0.0.1:11434`)
- An installed model (e.g. `ollama pull llama3.2`)
## Quick Start

```d
import ollama;
import std.stdio;

void main() {
    auto client = new OllamaClient();

    // Text generation
    auto gen = client.generate("llama3.2", "Why is the sky blue?");
    writeln(gen["response"].str);

    // Chat
    auto resp = client.chat("llama3.2", [Message("user", "Hello!")]);
    writeln(resp["message"]["content"].str);

    // Embeddings
    auto emb = client.embed("nomic-embed-text", "The quick brown fox");
    writeln("Vector length: ", emb["embeddings"][0].array.length);

    // Server version
    writeln("Ollama: ", client.getVersion());
}
```
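The no-argument constructor targets the default server at `http://127.0.0.1:11434`. A minimal sketch for pointing the client at a non-default server, assuming the `OllamaClient` constructor accepts a base-URL string (not confirmed by this README; check the package source for the exact signature, and note the host below is illustrative):

```d
import ollama;

void main() {
    // Assumption: the constructor takes the server's base URL.
    auto client = new OllamaClient("http://192.168.1.50:11434");
    // ... use client exactly as in the Quick Start above
}
```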
## API Reference

### Generation

```d
// Basic generation
JSONValue generate(string model, string prompt,
    JSONValue options = JSONValue.init, bool stream = false,
    string system = null, string[] images = null,
    JSONValue format = JSONValue.init, string suffix = null,
    string keepAlive = null, OllamaOptions opts = OllamaOptions.init)
```

```d
// With typed options and system prompt
OllamaOptions opts;
opts.temperature = 0.7f;
opts.num_ctx = 4096;
opts.stop = ["<|end|>"];
auto r = client.generate("llama3.2", "Hello",
    JSONValue.init, false, "You are a helpful assistant.",
    null, JSONValue.init, null, null, opts);
writeln(r["response"].str);
```

```d
// Structured JSON output
auto r = client.generate("llama3.2", "Capital of France as JSON",
    JSONValue.init, false, null, null, JSONValue("json"));
```
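The `images` parameter takes base64-encoded image data for vision models. A minimal sketch, assuming a vision-capable model such as `llava` is pulled locally (the model name and file path here are illustrative):

```d
import ollama;
import std.base64 : Base64;
import std.file : read;
import std.json : JSONValue;
import std.stdio : writeln;

void main() {
    auto client = new OllamaClient();

    // Base64-encode the image file for the `images` parameter.
    auto bytes = cast(ubyte[]) read("photo.png");
    string b64 = Base64.encode(bytes).idup;

    auto r = client.generate("llava", "Describe this image.",
        JSONValue.init, false, null, [b64]);
    writeln(r["response"].str);
}
```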
### Chat

```d
// Basic chat
JSONValue chat(string model, Message[] messages,
    JSONValue options = JSONValue.init, bool stream = false,
    Tool[] tools = null, JSONValue format = JSONValue.init,
    string keepAlive = null, OllamaOptions opts = OllamaOptions.init)
```

```d
// Tool calling
import std.json : parseJSON;

auto schema = parseJSON(`{
    "type": "object",
    "properties": {"location": {"type": "string"}},
    "required": ["location"]
}`);
auto tools = [Tool("function",
    ToolFunction("get_weather", "Get current weather", schema))];
auto r = client.chat("llama3.2",
    [Message("user", "Weather in Paris?")],
    JSONValue.init, false, tools);
// Check r["message"]["tool_calls"] for the model's tool call request
```
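Because `chat()` takes the full message history, a multi-turn conversation is built by appending each exchange to the `Message[]` array. A short sketch using only the `Message` struct shown above (the model name is the one used throughout this README):

```d
import ollama;
import std.stdio : writeln;

void main() {
    auto client = new OllamaClient();
    Message[] history = [
        Message("system", "You are a concise assistant."),
        Message("user", "Name one D compiler."),
    ];

    auto first = client.chat("llama3.2", history);
    writeln(first["message"]["content"].str);

    // Append the assistant's reply, then the follow-up question.
    history ~= Message("assistant", first["message"]["content"].str);
    history ~= Message("user", "What company originally developed it?");

    auto second = client.chat("llama3.2", history);
    writeln(second["message"]["content"].str);
}
```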
### Embeddings

```d
// Single text
JSONValue embed(string model, string input, string keepAlive = null)
// Batch
JSONValue embed(string model, string[] inputs, string keepAlive = null)
```

```d
auto r = client.embed("nomic-embed-text", "Hello, world!");
writeln(r["embeddings"][0].array.length); // vector dimension

auto batch = client.embed("nomic-embed-text", ["text one", "text two"]);
writeln(batch["embeddings"].array.length); // 2 vectors
```
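Embedding vectors come back as JSON arrays of numbers, and a common use is comparing two texts by cosine similarity. A minimal sketch built only on the response shape shown above; the `cosine` helper is not part of the package:

```d
import ollama;
import std.algorithm : map;
import std.array : array;
import std.math : sqrt;
import std.stdio : writeln;

// Hypothetical helper: cosine similarity of two equal-length vectors.
double cosine(double[] a, double[] b) {
    double dot = 0, na = 0, nb = 0;
    foreach (i; 0 .. a.length) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (sqrt(na) * sqrt(nb));
}

void main() {
    auto client = new OllamaClient();
    auto r = client.embed("nomic-embed-text", ["cat", "kitten"]);

    // Convert each JSON vector to double[].
    auto vecs = r["embeddings"].array
        .map!(v => v.array.map!(x => x.get!double).array).array;
    writeln("similarity: ", cosine(vecs[0], vecs[1]));
}
```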
### Model Management

```d
string listModels()                              // GET /api/tags
string showModel(string model)                   // POST /api/show
JSONValue createModel(string name, string file)  // POST /api/create
JSONValue copy(string src, string dst)           // POST /api/copy
JSONValue deleteModel(string name)               // DELETE /api/delete
JSONValue pull(string name, bool stream = false) // POST /api/pull
JSONValue push(string name, bool stream = false) // POST /api/push (requires registry auth)
```
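`listModels()` returns the raw JSON string from `/api/tags`, so checking whether a model is installed means parsing it first. A sketch of a pull-if-missing helper, assuming the standard Ollama `/api/tags` response shape (`{"models": [{"name": "..."}]}`):

```d
import ollama;
import std.algorithm : any, startsWith;
import std.json : parseJSON;
import std.stdio : writeln;

void ensureModel(OllamaClient client, string name) {
    // Assumption: /api/tags returns {"models":[{"name":"llama3.2:latest"}, ...]}.
    auto tags = parseJSON(client.listModels());
    bool present = tags["models"].array
        .any!(m => m["name"].str.startsWith(name));
    if (!present) {
        writeln("Pulling ", name, " ...");
        client.pull(name);
    }
}
```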
### Server

```d
string getVersion() // GET /api/version → "0.6.2"
string ps()         // GET /api/ps → running models JSON
```
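`ps()` likewise returns a raw JSON string. A sketch that prints the currently loaded models, assuming the standard `/api/ps` response shape (`{"models": [{"name": "..."}]}`):

```d
import ollama;
import std.json : parseJSON;
import std.stdio : writeln;

void main() {
    auto client = new OllamaClient();
    auto running = parseJSON(client.ps());
    foreach (m; running["models"].array)
        writeln("loaded: ", m["name"].str);
}
```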
### OpenAI-Compatible

```d
JSONValue chatCompletions(string model, Message[] messages,
    int maxTokens = 0, float temperature = 1.0, bool stream = false)
JSONValue completions(string model, string prompt,
    int maxTokens = 0, float temperature = 1.0, bool stream = false)
string getModels() // GET /v1/models
```
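These endpoints mirror OpenAI's wire format, so replies should follow the familiar `choices` layout. A sketch, assuming the standard OpenAI-style response shape (`choices[0].message.content`):

```d
import ollama;
import std.stdio : writeln;

void main() {
    auto client = new OllamaClient();
    auto r = client.chatCompletions("llama3.2",
        [Message("user", "Say hi in one word.")], 50, 0.2f);
    // Assumption: OpenAI-compatible response shape.
    writeln(r["choices"][0]["message"]["content"].str);
}
```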
### Agentic Tool-Calling Loop

```d
import ollama;
import std.json : JSONValue, parseJSON;
import std.stdio : writeln;

// Define tools
auto tools = [
    Tool("function", ToolFunction("add", "Add two numbers",
        parseJSON(`{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}},"required":["a","b"]}`))),
];

// Dispatch tool calls locally
JSONValue callTool(string name, JSONValue args) @safe
{
    if (name == "add") return JSONValue(args["a"].get!double + args["b"].get!double);
    return JSONValue("Unknown tool");
}

// Agentic loop: keep chatting until the model stops requesting tool calls
Message[] history = [Message("user", "What is 7 plus 8?")];
for (;;)
{
    auto r = client.chat("llama3.1:8b", history, JSONValue.init, false, tools);
    auto msg = r["message"];
    auto tcs = "tool_calls" in msg.objectNoRef;
    if (!tcs || (*tcs).arrayNoRef.length == 0) { writeln(msg["content"].str); break; }

    // Record the assistant turn, then execute each requested tool
    history ~= Message("assistant", "content" in msg ? msg["content"].str : "");
    foreach (tc; (*tcs).arrayNoRef)
    {
        auto result = callTool(tc["function"]["name"].str, tc["function"]["arguments"]);
        history ~= Message("tool", result.toString);
    }
}
```
### OllamaOptions

```d
OllamaOptions opts;
opts.temperature = 0.8f;      // Creativity (0.0 = deterministic)
opts.top_k = 40;              // Top-K sampling
opts.top_p = 0.9f;            // Nucleus sampling
opts.repeat_penalty = 1.1f;   // Penalize repeated tokens
opts.num_predict = 200;       // Max tokens to generate
opts.num_ctx = 8192;          // Context window size
opts.seed = 42;               // Reproducible output
opts.stop = ["</s>", "\n\n"]; // Stop sequences
```
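Per the signatures above, the typed options struct is the final parameter of both `generate()` and `chat()`, so the positional arguments before it must be supplied (their defaults work fine). For example, passing the options above to `chat()`:

```d
// Chat reply constrained by the typed options above
auto r = client.chat("llama3.2", [Message("user", "Hello!")],
    JSONValue.init, false, null, JSONValue.init, null, opts);
writeln(r["message"]["content"].str);
```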
## Samples

| Sample | Description |
|---|---|
| `simple` | Full SDK tour: generation, chat, embeddings, tool calling, streaming, OpenAI-compatible endpoints |
| `chat` | Interactive multi-turn chatbot with streaming output and Qwen3 thinking mode |
| `coder` | CLI code generator: streams a chat response and saves the result to a file |
| `agent` | Agentic tool-calling loop: the model calls local functions until it has a final answer |
## Running Tests

```sh
# Unit tests (no Ollama server required)
dub test

# Build and run samples against a live server
dub build -b release
dub run -b release :simple
dub run -b release :agent -- "What time is it and what is 7 plus 8?"
dub run -b release :agent -- --model llama3.1:8b "Convert 'hello world' to upper case"
dub run -b release :coder -- --prompt "Sort a list in D" --model llama3.1:8b
dub run -b release :chat -- --model qwen3:0.6b
```
## Docker

```sh
# Unit tests only (no Ollama needed)
docker build --target builder -t ollama-d .
docker run --rm ollama-d dub test

# Full integration tests with Ollama
docker compose up --build

# Tear down
docker compose down -v
```
## License

MIT License
Repository: kassane/ollama-d