# AI Plugin
Client-side AI plugin providing Text Inference (LLM) capabilities. All processing runs locally on-device without server dependencies.
## Features
- **TextInference**: local LLM chat with streaming responses
- **AiModelManager**: model download, caching, and registry management
- **Cross-platform**: Desktop (macOS/Windows/Linux), Browser (WASM), Mobile (iOS/Android)
- **Automatic model downloads** from HuggingFace
## Quick Start
```qml
import QtQuick.Controls  // for Button
import Clayground.Ai

TextInference {
    id: llm
    modelId: "smollm2-1.7b"
    systemPrompt: "You are a helpful assistant."
    onToken: (tok) => console.log(tok)
    onResponse: (full) => console.log("Done:", full)
}

Button {
    text: "Ask"
    onClicked: llm.send("Hello, what can you do?")
}
```
## Components
### TextInference

Local LLM text generation with automatic model management.

**Properties:**

- `modelId`: model to use (triggers auto-download)
- `systemPrompt`: system prompt for the conversation
- `maxTokens`: maximum tokens per response
- `temperature`: sampling temperature (0.0-2.0)
- `modelReady`: whether the model is loaded
- `generating`: whether generation is in progress
- `downloading`: whether the model is downloading
- `downloadProgress`: download progress (0.0-1.0)

**Methods:**

- `send(message)`: send a user message
- `stop()`: stop generation
- `clear()`: clear the conversation
- `unload()`: unload the model

**Signals:**

- `token(string)`: emitted per token (streaming)
- `response(string)`: emitted when the response is complete
- `error(string)`: emitted on error
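The state properties above can drive a small status UI. Below is a minimal sketch using only the documented properties; the model id and UI layout are illustrative, not prescribed:

```qml
import QtQuick
import QtQuick.Controls
import Clayground.Ai

Column {
    spacing: 4

    TextInference {
        id: llm
        modelId: "smollm2-360m"
    }

    // downloadProgress is only meaningful while `downloading` is true
    ProgressBar {
        visible: llm.downloading
        value: llm.downloadProgress  // 0.0-1.0
    }

    Label {
        text: llm.modelReady ? "Ready"
              : llm.downloading ? "Downloading…"
              : "Loading…"
    }

    Button {
        text: llm.generating ? "Stop" : "Send"
        onClicked: llm.generating ? llm.stop()
                                  : llm.send("Hi!")
    }
}
```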
### AiModelManager

Manages model downloads and caching.

**Properties:**

- `registryUrl`: custom model registry URL
- `hasWebGPU`: WebGPU availability (browser only)
- `platform`: current platform
- `activeDownloads`: in-progress downloads

**Methods:**

- `isAvailable(modelId)`: check whether a model is cached
- `modelInfo(modelId)`: get model metadata
- `availableModels(type)`: list models ("llm", "stt", "tts")
- `download(modelId)`: start a download
- `cancelDownload(modelId)`: cancel a download
- `checkMemory(modelId)`: check memory requirements
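A pre-flight check before downloading might look like the sketch below. It assumes `AiModelManager` is exposed as a QML singleton (if it is instantiable instead, declare an instance with an `id`), and that `checkMemory` returns true when the device has enough memory:

```qml
import QtQuick
import Clayground.Ai

Item {
    Component.onCompleted: {
        const modelId = "smollm2-1.7b"
        if (AiModelManager.isAvailable(modelId)) {
            console.log("Already cached:", modelId)
        } else if (AiModelManager.checkMemory(modelId)) {
            AiModelManager.download(modelId)
        } else {
            console.log("Not enough memory for", modelId)
        }
    }

    // Track download lifecycle via the documented signals
    Connections {
        target: AiModelManager
        function onDownloadProgress(modelId, progress, bytesDownloaded, totalBytes) {
            console.log(modelId, Math.round(progress * 100) + "%")
        }
        function onDownloadComplete(modelId) { console.log("Done:", modelId) }
        function onDownloadError(modelId, message) { console.warn(modelId, message) }
    }
}
```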
## Available Models

| Model | Size | Platform | Use Case |
|-------|------|----------|----------|
| smollm2-1.7b | ~1 GB | Desktop, WebGPU | Best quality for size |
| smollm2-360m | ~230 MB | All | Lightweight, fast |
| qwen2.5-1.5b | ~986 MB | Desktop, WebGPU | Better reasoning |
| llama3.2-1b | ~776 MB | All | Meta optimized |
## Desktop (macOS)

- Uses llama.cpp with Metal acceleration
- Models cached in `~/.cache/clayground_ai/models/`
## Browser (WASM)

- Uses wllama (llama.cpp WASM binding)
- Models cached in IndexedDB
- WebGPU is auto-detected for faster inference
## Mobile

- CPU inference only
- Use smaller models (e.g. smollm2-360m) for better performance
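Putting the platform notes together, model choice can be made at runtime. A sketch, assuming `AiModelManager` is available as a singleton and `platform`/`hasWebGPU` behave as documented above:

```qml
import QtQuick
import Clayground.Ai

TextInference {
    id: llm
    // Mobile and non-WebGPU browsers run CPU-only inference,
    // so prefer the lightweight model there.
    modelId: {
        const p = AiModelManager.platform
        if (p === "ios" || p === "android")
            return "smollm2-360m"
        if (p === "wasm" && !AiModelManager.hasWebGPU)
            return "smollm2-360m"
        return "smollm2-1.7b"
    }
}
```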
## Future Ideas

- **TextToSpeech**: client-side TTS using sherpa-onnx
- **SpeechToText**: client-side STT using whisper.cpp
## API Reference
### AiModelManager

Manages AI model downloads, caching, and the model registry.
#### Properties

| Name | Type | Description |
|------|------|-------------|
| `activeDownloads` (readonly) | list | List of currently active downloads |
| `hasWebGPU` (readonly) | bool | Whether WebGPU is available (browser only) |
| `platform` (readonly) | string | Current platform: `"desktop"`, `"wasm"`, `"ios"`, or `"android"` |
| `registryReady` (readonly) | bool | Whether the model registry has been loaded |
| `registryUrl` | url | Custom model registry URL |
#### Methods

| Method | Returns | Description |
|--------|---------|-------------|
| `availableModels(string type)` | list | List registry models of the given type (`"llm"`, `"stt"`, `"tts"`) |
| `cachedModels()` | list | List locally cached models |
| `cancelDownload(string modelId)` | void | Cancel an in-progress download |
| `checkMemory(string modelId)` | bool | Check memory requirements for the model |
| `download(string modelId)` | void | Start downloading the model |
| `isAvailable(string modelId)` | bool | Whether the model is already cached |
| `modelInfo(string modelId)` | object | Get model metadata |
| `modelPath(string modelId)` | string | Local path of the cached model |
| `refreshRegistry()` | void | Re-fetch the model registry |
| `remove(string modelId)` | void | Delete the cached model |
#### Signals

| Signal | Description |
|--------|-------------|
| `downloadCancelled(string modelId)` | Emitted when a download is cancelled |
| `downloadComplete(string modelId)` | Emitted when a download finishes |
| `downloadError(string modelId, string message)` | Emitted when a download fails |
| `downloadProgress(string modelId, real progress, int bytesDownloaded, int totalBytes)` | Emitted as download progress updates |
| `downloadStarted(string modelId, int totalBytes)` | Emitted when a download starts |
| `registryUpdated()` | Emitted when the model registry has been loaded |
### AiModelManagerBackend

C++ backend for downloading and managing AI models.
#### Properties

| Name | Type | Description |
|------|------|-------------|
| `activeDownloads` (readonly) | list | List of currently active downloads |
| `hasWebGPU` (readonly) | bool | Whether WebGPU is available (browser only) |
| `platform` (readonly) | string | Current platform identifier |
| `registryReady` (readonly) | bool | Whether the model registry has been loaded |
| `registryUrl` | url | Custom model registry URL |
#### Methods

| Method | Returns | Description |
|--------|---------|-------------|
| `availableModels(string type)` | list | List registry models of the given type (`"llm"`, `"stt"`, `"tts"`) |
| `cachedModels()` | list | List locally cached models |
| `cancelDownload(string modelId)` | void | Cancel an in-progress download |
| `checkMemory(string modelId)` | bool | Check memory requirements for the model |
| `download(string modelId)` | void | Start downloading the model |
| `isAvailable(string modelId)` | bool | Whether the model is already cached |
| `modelInfo(string modelId)` | object | Get model metadata |
| `modelPath(string modelId)` | string | Local path of the cached model |
| `refreshRegistry()` | void | Re-fetch the model registry |
| `remove(string modelId)` | void | Delete the cached model |
#### Signals

| Signal | Description |
|--------|-------------|
| `downloadCancelled(string modelId)` | Emitted when a download is cancelled |
| `downloadComplete(string modelId)` | Emitted when a download finishes |
| `downloadError(string modelId, string message)` | Emitted when a download fails |
| `downloadProgress(string modelId, real progress, int bytesDownloaded, int totalBytes)` | Emitted as download progress updates |
| `downloadStarted(string modelId, int totalBytes)` | Emitted when a download starts |
| `registryUpdated()` | Emitted when the model registry has been loaded |
### LlmEngineBackend

C++ backend for local LLM inference using llama.cpp.
#### Properties

| Name | Type | Description |
|------|------|-------------|
| `currentResponse` (readonly) | string | Response text accumulated so far during generation |
| `generating` (readonly) | bool | Whether text generation is in progress |
| `loadProgress` (readonly) | real | Model loading progress (0.0 to 1.0) |
| `maxTokens` | int | Maximum number of tokens to generate per response |
| `modelLoading` (readonly) | bool | Whether the model is currently being loaded |
| `modelPath` | string | Path to the GGUF model file |
| `modelReady` (readonly) | bool | Whether the model is loaded and ready for inference |
| `systemPrompt` | string | System prompt prepended to every conversation |
| `temperature` | real | Sampling temperature (0.0 to 2.0) |
#### Methods

| Method | Returns | Description |
|--------|---------|-------------|
| `clear()` | void | Clear the conversation |
| `send(string message)` | void | Send a user message and start generation |
| `stop()` | void | Stop generation |
| `unload()` | void | Unload the model |
#### Signals

| Signal | Description |
|--------|-------------|
| `error(string message)` | Emitted on error |
| `response(string fullText)` | Emitted with the full text when generation completes |
| `token(string token)` | Emitted for each generated token (streaming) |
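Since the backend takes a file path rather than a registry id, it can in principle be wired up directly. A sketch based only on the properties and signals listed above (`TextInference` remains the intended high-level entry point, and `AiModelManager.modelPath` is assumed to return the cached GGUF location):

```qml
import QtQuick
import Clayground.Ai

LlmEngineBackend {
    id: engine
    modelPath: AiModelManager.modelPath("smollm2-360m")  // path to a cached GGUF file
    systemPrompt: "You are concise."
    maxTokens: 256
    temperature: 0.7

    onToken: (token) => console.log(token)
    onResponse: (fullText) => console.log("Full:", fullText)
    onError: (message) => console.warn(message)

    // Wait until the model is loaded before sending
    onModelReadyChanged: if (modelReady) send("Hello")
}
```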
### Sandbox

Test sandbox for AI plugin components.
### TextInference

Client-side LLM text generation.
#### Properties

| Name | Type | Description |
|------|------|-------------|
| `currentResponse` (readonly) | string | Current response being generated |
| `downloadProgress` (readonly) | real | Download progress (0.0 to 1.0) |
| `downloadedBytes` (readonly) | int | Bytes downloaded so far |
| `downloading` (readonly) | bool | Whether the model is being downloaded |
| `generating` (readonly) | bool | Whether text generation is in progress |
| `loadProgress` (readonly) | real | Model loading progress (0.0 to 1.0) |
| `maxTokens` | int | Maximum tokens to generate per response |
| `modelId` | string | Model to use for inference |
| `modelLoading` (readonly) | bool | Whether the model is being loaded into memory |
| `modelReady` (readonly) | bool | Whether the model is loaded and ready for inference |
| `noModel` (readonly) | string | Special value assigned to `modelId` to cancel a download or unload the model |
| `systemPrompt` | string | System prompt for the conversation |
| `temperature` | real | Sampling temperature (0.0 to 2.0) |
| `totalBytes` (readonly) | int | Total bytes to download |
#### Methods

| Method | Returns | Description |
|--------|---------|-------------|
| `clear()` | void | Clear the conversation |
| `send(string message)` | void | Send a user message and start generation |
| `stop()` | void | Stop generation |
| `unload()` | void | Unload the model |
#### Signals

| Signal | Description |
|--------|-------------|
| `downloadCancelled()` | Emitted when the model download is cancelled |
| `downloadStarted(int totalBytes)` | Emitted when the model download starts |
| `error(string message)` | Emitted on error |
| `modelDownloaded()` | Emitted when the model download completes |
| `modelReadySignal()` | Emitted when the model is loaded and ready |
| `response(string fullText)` | Emitted with the full response when generation completes |
| `token(string token)` | Emitted for each generated token (streaming) |
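Per the `noModel` property above, assigning it to `modelId` cancels an in-progress download or unloads the current model. A sketch of a cancel button built on that (the surrounding layout is illustrative):

```qml
import QtQuick
import QtQuick.Controls
import Clayground.Ai

Row {
    TextInference { id: llm; modelId: "qwen2.5-1.5b" }

    Button {
        text: "Cancel"
        visible: llm.downloading
        // Assigning the sentinel value cancels the download
        onClicked: llm.modelId = llm.noModel
    }
}
```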