ollama-d 0.4.0

D bindings for the Ollama API


To use this package, run the following command in your project's root directory:

dub add ollama-d

Manual usage
Put the following dependency into your project's dependencies section:

"ollama-d": "~>0.4.0"

ollama-d


D language bindings for the Ollama REST API — seamless integration with local AI models.

Features

  • Text generation with native Ollama API
  • Chat interactions with conversation history
  • Embeddings — single and batch (/api/embed)
  • Tool calling — pass function definitions to chat()
  • Multimodal — base64 image input for vision models
  • Structured output — JSON schema format enforcement
  • Typed `OllamaOptions` — temperature, top_k, num_ctx, stop sequences, and more
  • Model management — list, create, show, pull, push, copy, delete
  • Running models inspection (/api/ps)
  • Configurable timeout settings
  • Built-in unit test suite (dub test)
  • OpenAI-compatible API endpoints (/v1/…)
  • Agentic tool-calling loop — execute local functions and feed results back to the model
  • Docker-based local development environment
  • Zero external dependencies — only std.net.curl and std.json

Prerequisites

  • D compiler (v2.110.0 stable or compatible)
  • Ollama server running locally (default: http://127.0.0.1:11434)
  • An installed model (e.g. ollama pull llama3.2)

Quick Start

import ollama;
import std.stdio;

void main() {
    auto client = new OllamaClient();

    // Text generation
    auto gen = client.generate("llama3.2", "Why is the sky blue?");
    writeln(gen["response"].str);

    // Chat
    auto resp = client.chat("llama3.2", [Message("user", "Hello!")]);
    writeln(resp["message"]["content"].str);

    // Embeddings
    auto emb = client.embed("nomic-embed-text", "The quick brown fox");
    writeln("Vector length: ", emb["embeddings"][0].array.length);

    // Server version
    writeln("Ollama: ", client.getVersion());
}

API Reference

Generation

// Basic generation
JSONValue generate(string model, string prompt,
    JSONValue options = JSONValue.init, bool stream = false,
    string system = null, string[] images = null,
    JSONValue format = JSONValue.init, string suffix = null,
    string keepAlive = null, OllamaOptions opts = OllamaOptions.init)

// With typed options and system prompt
OllamaOptions opts;
opts.temperature = 0.7f;
opts.num_ctx     = 4096;
opts.stop        = ["<|end|>"];
auto r = client.generate("llama3.2", "Hello",
    JSONValue.init, false, "You are a helpful assistant.",
    null, JSONValue.init, null, null, opts);
writeln(r["response"].str);

// Structured JSON output
auto r = client.generate("llama3.2", "Capital of France as JSON",
    JSONValue.init, false, null, null, JSONValue("json"));
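Multimodal input goes through the same signature: base64-encode the image bytes and pass them via the images parameter. A minimal sketch, assuming a vision-capable model such as llava is pulled locally and photo.png exists:

// Multimodal: describe a local image
import std.base64 : Base64;
import std.file : read;

auto raw = cast(ubyte[]) read("photo.png");   // raw image bytes
string img = Base64.encode(raw).idup;         // Ollama expects base64 strings
auto r = client.generate("llava", "Describe this image.",
    JSONValue.init, false, null, [img]);
writeln(r["response"].str);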

Chat

// Basic chat
JSONValue chat(string model, Message[] messages,
    JSONValue options = JSONValue.init, bool stream = false,
    Tool[] tools = null, JSONValue format = JSONValue.init,
    string keepAlive = null, OllamaOptions opts = OllamaOptions.init)

// Tool calling
import std.json : parseJSON;
auto schema = parseJSON(`{
    "type": "object",
    "properties": {"location": {"type": "string"}},
    "required": ["location"]
}`);
auto tools = [Tool("function",
    ToolFunction("get_weather", "Get current weather", schema))];
auto r = client.chat("llama3.2",
    [Message("user", "Weather in Paris?")],
    JSONValue.init, false, tools);
// Check r["message"]["tool_calls"] for model's tool call request
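Extracting the requested call from the response follows the same shape used in the agentic loop below; a minimal sketch:

auto msg = r["message"];
if (auto tcs = "tool_calls" in msg.objectNoRef)
    foreach (tc; (*tcs).arrayNoRef)
        writeln(tc["function"]["name"].str, " <- ",
            tc["function"]["arguments"].toString);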

Embeddings

// Single text
JSONValue embed(string model, string input, string keepAlive = null)

// Batch
JSONValue embed(string model, string[] inputs, string keepAlive = null)

auto r = client.embed("nomic-embed-text", "Hello, world!");
writeln(r["embeddings"][0].array.length); // vector dimension

auto batch = client.embed("nomic-embed-text", ["text one", "text two"]);
writeln(batch["embeddings"].array.length); // 2 vectors
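A common next step is comparing vectors. This sketch computes cosine similarity with std.numeric; the response shape matches the batch call above:

import std.algorithm : map;
import std.array : array;
import std.math : sqrt;
import std.numeric : dotProduct;

// Convert one JSON embedding into a native double[]
double[] vec(JSONValue v) { return v.arrayNoRef.map!(e => e.get!double).array; }

auto e = client.embed("nomic-embed-text", ["cat", "kitten"]);
auto a = vec(e["embeddings"][0]);
auto b = vec(e["embeddings"][1]);
writeln("cosine: ",
    dotProduct(a, b) / (sqrt(dotProduct(a, a)) * sqrt(dotProduct(b, b))));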

Model Management

string    listModels()                            // GET    /api/tags
string    showModel(string model)                 // POST   /api/show
JSONValue createModel(string name, string file)   // POST   /api/create
JSONValue copy(string src, string dst)            // POST   /api/copy
JSONValue deleteModel(string name)                // DELETE /api/delete
JSONValue pull(string name, bool stream = false)  // POST   /api/pull
JSONValue push(string name, bool stream = false)  // POST   /api/push (requires registry auth)

Server

string getVersion()  // GET /api/version  → "0.6.2"
string ps()          // GET /api/ps       → running models JSON
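ps() also returns a raw JSON string; the /api/ps payload carries a models array describing what is currently loaded:

auto running = parseJSON(client.ps());
writeln(running["models"].arrayNoRef.length, " model(s) loaded");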

OpenAI-Compatible

JSONValue chatCompletions(string model, Message[] messages,
    int maxTokens = 0, float temperature = 1.0, bool stream = false)

JSONValue completions(string model, string prompt,
    int maxTokens = 0, float temperature = 1.0, bool stream = false)

string getModels()  // GET /v1/models
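Responses follow the OpenAI schema, so answers live under choices. A short sketch using the documented parameters:

auto r = client.chatCompletions("llama3.2",
    [Message("user", "Say hello in one word.")], 50, 0.7f);
writeln(r["choices"][0]["message"]["content"].str);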

Agentic Tool-Calling Loop

// Define tools
auto tools = [
    Tool("function", ToolFunction("add", "Add two numbers",
        parseJSON(`{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}},"required":["a","b"]}`))),
];

// Dispatch tool calls locally
JSONValue callTool(string name, JSONValue args) @safe
{
    if (name == "add") return JSONValue(args["a"].get!double + args["b"].get!double);
    return JSONValue("Unknown tool");
}

// Agentic loop
Message[] history = [Message("user", "What is 7 plus 8?")];
for (;;)
{
    auto r  = client.chat("llama3.1:8b", history, JSONValue.init, false, tools);
    auto msg = r["message"];
    auto tcs = "tool_calls" in msg.objectNoRef;
    if (!tcs || (*tcs).arrayNoRef.length == 0) { writeln(msg["content"].str); break; }

    history ~= Message("assistant", "content" in msg ? msg["content"].str : "");
    foreach (tc; (*tcs).arrayNoRef)
    {
        auto result = callTool(tc["function"]["name"].str, tc["function"]["arguments"]);
        history ~= Message("tool", result.toString);
    }
}

OllamaOptions

OllamaOptions opts;
opts.temperature   = 0.8f;   // Creativity (0.0 = deterministic)
opts.top_k         = 40;     // Top-K sampling
opts.top_p         = 0.9f;   // Nucleus sampling
opts.repeat_penalty = 1.1f;  // Penalize repeated tokens
opts.num_predict   = 200;    // Max tokens to generate
opts.num_ctx       = 8192;   // Context window size
opts.seed          = 42;     // Reproducible output
opts.stop          = ["</s>", "\n\n"];  // Stop sequences
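The struct plugs into the trailing opts parameter of generate() and chat(); the documented defaults fill the positional arguments in between:

auto r = client.chat("llama3.2", [Message("user", "Hello!")],
    JSONValue.init, false, null, JSONValue.init, null, opts);
writeln(r["message"]["content"].str);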

Samples

Sample  Description
simple  Full SDK tour — generation, chat, embeddings, tool calling, streaming, OpenAI-compatible endpoints
chat    Interactive multi-turn chatbot with streaming output and Qwen3 thinking mode
coder   CLI code generator — streams a chat response and saves the result to a file
agent   Agentic tool-calling loop — the model calls local functions until it has a final answer

Running Tests

# Unit tests (no Ollama server required)
dub test

# Build and run samples against a live server
dub build -b release
dub run -b release :simple
dub run -b release :agent -- "What time is it and what is 7 plus 8?"
dub run -b release :agent -- --model llama3.1:8b "Convert 'hello world' to upper case"
dub run -b release :coder -- --prompt "Sort a list in D" --model llama3.1:8b
dub run -b release :chat  -- --model qwen3:0.6b

Docker

# Unit tests only (no Ollama needed)
docker build --target builder -t ollama-d .
docker run --rm ollama-d dub test

# Full integration tests with Ollama
docker compose up --build

# Tear down
docker compose down -v

License

MIT License

Authors:
  • Matheus Catarino França
Sub packages:
ollama-d:simple, ollama-d:coder, ollama-d:chat, ollama-d:agent
Dependencies:
none
Versions:
0.4.0 2026-Mar-07
0.3.2 2025-Mar-22
0.3.1 2025-Mar-21
0.3.0 2025-Mar-21
0.2.0 2025-Mar-20
Short URL:
ollama-d.dub.pm