AI Plugin

Client-side AI plugin providing Text Inference (LLM) capabilities. All processing runs locally on-device without server dependencies.

Features

  • TextInference: Local LLM chat with streaming responses
  • AiModelManager: Model download, caching, and registry management
  • Cross-platform: Desktop (macOS/Windows/Linux), Browser (WASM), Mobile (iOS/Android)
  • Automatic model downloads from HuggingFace

Quick Start

```qml
import Clayground.Ai

TextInference {
    id: llm
    modelId: "smollm2-1.7b"
    systemPrompt: "You are a helpful assistant."

    onToken: (tok) => console.log(tok)
    onResponse: (full) => console.log("Done:", full)
}

Button {
    text: "Ask"
    onClicked: llm.send("Hello, what can you do?")
}
```

Components

TextInference

Local LLM text generation with automatic model management.

Properties:

  • modelId: Model to use (triggers auto-download)
  • systemPrompt: System prompt for conversation
  • maxTokens: Maximum tokens per response
  • temperature: Sampling temperature (0.0-2.0)
  • modelReady: Whether model is loaded
  • generating: Whether generation is in progress
  • downloading: Whether model is downloading
  • downloadProgress: Download progress (0.0-1.0)

Methods:

  • send(message): Send user message
  • stop(): Stop generation
  • clear(): Clear conversation
  • unload(): Unload model

Signals:

  • token(string): Emitted per token (streaming)
  • response(string): Emitted when complete
  • error(string): Emitted on error
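
The download and state properties above can drive a simple loading UI. A minimal sketch (the layout, ids, and prompt text are illustrative, not part of the plugin API):

```qml
import QtQuick
import QtQuick.Controls
import Clayground.Ai

Column {
    TextInference {
        id: llm
        modelId: "smollm2-360m"
        onError: (msg) => console.warn("LLM error:", msg)
    }

    // Visible only while the model file is being fetched
    ProgressBar {
        visible: llm.downloading
        value: llm.downloadProgress   // 0.0 to 1.0
    }

    // Disabled until the model is loaded, and while a response streams in
    Button {
        enabled: llm.modelReady && !llm.generating
        text: llm.generating ? "Generating..." : "Ask"
        onClicked: llm.send("Summarize this plugin in one sentence.")
    }
}
```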

AiModelManager

Manages model downloads and caching.

Properties:

  • registryUrl: Custom model registry URL
  • hasWebGPU: WebGPU availability (browser)
  • platform: Current platform
  • activeDownloads: In-progress downloads

Methods:

  • isAvailable(modelId): Check if cached
  • modelInfo(modelId): Get model metadata
  • availableModels(type): List models ("llm", "stt", "tts")
  • download(modelId): Start download
  • cancelDownload(modelId): Cancel download
  • checkMemory(modelId): Check memory requirements
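
A typical pre-fetch flow checks the cache before starting a download and reacts to the download signals. A sketch, assuming AiModelManager can be instantiated directly as in the Quick Start style:

```qml
import Clayground.Ai

AiModelManager {
    id: models

    Component.onCompleted: {
        // Only download if the model is not cached yet
        if (!models.isAvailable("smollm2-360m"))
            models.download("smollm2-360m")
    }

    onDownloadProgress: (modelId, progress, bytesDownloaded, totalBytes) =>
        console.log(modelId, Math.round(progress * 100) + "%")
    onDownloadComplete: (modelId) => console.log("Ready:", modelId)
    onDownloadError: (modelId, message) => console.warn(modelId, message)
}
```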

Available Models

| Model        | Size    | Platform        | Use Case              |
|--------------|---------|-----------------|-----------------------|
| smollm2-1.7b | ~1 GB   | Desktop, WebGPU | Best quality for size |
| smollm2-360m | ~230 MB | All             | Lightweight, fast     |
| qwen2.5-1.5b | ~986 MB | Desktop, WebGPU | Better reasoning      |
| llama3.2-1b  | ~776 MB | All             | Meta optimized        |

Platform Notes

Desktop (macOS)

  • Uses llama.cpp with Metal acceleration
  • Models cached in ~/.cache/clayground_ai/models/

Browser (WASM)

  • Uses wllama (llama.cpp WASM binding)
  • Models cached in IndexedDB
  • WebGPU auto-detected for faster inference

Mobile

  • CPU inference only
  • Use smaller models (smollm2-360m) for better performance
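
Since mobile is CPU-only, one option is to pick the model at runtime from the platform property. A sketch (the binding and ids are illustrative):

```qml
import Clayground.Ai

AiModelManager { id: models }

TextInference {
    // Prefer the lightweight model on mobile (CPU-only inference)
    modelId: (models.platform === "ios" || models.platform === "android")
             ? "smollm2-360m" : "smollm2-1.7b"
}
```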

Future Ideas

  • TextToSpeech: Client-side TTS using sherpa-onnx
  • SpeechToText: Client-side STT using whisper.cpp

API Reference

AiModelManager

Manages AI model downloads, caching, and registry.

Properties

| Name | Type | Description |
|------|------|-------------|
| activeDownloads | list (readonly) | List of currently active downloads |
| hasWebGPU | bool (readonly) | Whether WebGPU is available (browser only) |
| platform | string (readonly) | Current platform: "desktop", "wasm", "ios", or "android" |
| registryReady | bool (readonly) | Whether the model registry has been loaded |
| registryUrl | url | Custom model registry URL |

Methods

| Method | Returns | Description |
|--------|---------|-------------|
| availableModels(string type) | list | List models of the given type ("llm", "stt", "tts") |
| cachedModels() | list | List locally cached models |
| cancelDownload(string modelId) | void | Cancel an in-progress download |
| checkMemory(string modelId) | bool | Check whether the model's memory requirements can be met |
| download(string modelId) | void | Start downloading a model |
| isAvailable(string modelId) | bool | Whether the model is already cached |
| modelInfo(string modelId) | object | Get model metadata |
| modelPath(string modelId) | string | Local path of a cached model file |
| refreshRegistry() | void | Reload the model registry |
| remove(string modelId) | void | Remove a cached model |

Signals

| Signal | Description |
|--------|-------------|
| downloadCancelled(string modelId) | Download was cancelled |
| downloadComplete(string modelId) | Download finished successfully |
| downloadError(string modelId, string message) | Download failed |
| downloadProgress(string modelId, real progress, int bytesDownloaded, int totalBytes) | Periodic download progress |
| downloadStarted(string modelId, int totalBytes) | Download started |
| registryUpdated() | Model registry was (re)loaded |
AiModelManagerBackend

C++ backend for downloading and managing AI models.

Properties

| Name | Type | Description |
|------|------|-------------|
| activeDownloads | list (readonly) | List of currently active downloads |
| hasWebGPU | bool (readonly) | Whether WebGPU is available (browser only) |
| platform | string (readonly) | Current platform identifier |
| registryReady | bool (readonly) | Whether the model registry has been loaded |
| registryUrl | url | Custom model registry URL |

Methods

| Method | Returns | Description |
|--------|---------|-------------|
| availableModels(string type) | list | List models of the given type |
| cachedModels() | list | List locally cached models |
| cancelDownload(string modelId) | void | Cancel an in-progress download |
| checkMemory(string modelId) | bool | Check whether the model's memory requirements can be met |
| download(string modelId) | void | Start downloading a model |
| isAvailable(string modelId) | bool | Whether the model is already cached |
| modelInfo(string modelId) | object | Get model metadata |
| modelPath(string modelId) | string | Local path of a cached model file |
| refreshRegistry() | void | Reload the model registry |
| remove(string modelId) | void | Remove a cached model |

Signals

| Signal | Description |
|--------|-------------|
| downloadCancelled(string modelId) | Download was cancelled |
| downloadComplete(string modelId) | Download finished successfully |
| downloadError(string modelId, string message) | Download failed |
| downloadProgress(string modelId, real progress, int bytesDownloaded, int totalBytes) | Periodic download progress |
| downloadStarted(string modelId, int totalBytes) | Download started |
| registryUpdated() | Model registry was (re)loaded |
LlmEngineBackend

C++ backend for local LLM inference using llama.cpp.

Properties

| Name | Type | Description |
|------|------|-------------|
| currentResponse | string (readonly) | Response text accumulated so far during generation |
| generating | bool (readonly) | Whether text generation is in progress |
| loadProgress | real (readonly) | Model loading progress (0.0 to 1.0) |
| maxTokens | int | Maximum number of tokens to generate per response |
| modelLoading | bool (readonly) | Whether the model is currently being loaded |
| modelPath | string | Path to the GGUF model file |
| modelReady | bool (readonly) | Whether the model is loaded and ready for inference |
| systemPrompt | string | System prompt prepended to every conversation |
| temperature | real | Sampling temperature (0.0 to 2.0) |

Methods

| Method | Returns | Description |
|--------|---------|-------------|
| clear() | void | Clear the conversation |
| send(string message) | void | Send a user message |
| stop() | void | Stop generation |
| unload() | void | Unload the model |

Signals

| Signal | Description |
|--------|-------------|
| error(string message) | Emitted on error |
| response(string fullText) | Emitted when a response is complete |
| token(string token) | Emitted for each streamed token |

Sandbox

Test sandbox for AI plugin components.

TextInference

Client-side LLM text generation.

Properties

| Name | Type | Description |
|------|------|-------------|
| currentResponse | string (readonly) | Current response being generated |
| downloadProgress | real (readonly) | Download progress (0.0 to 1.0) |
| downloadedBytes | int (readonly) | Bytes downloaded so far |
| downloading | bool (readonly) | Whether the model is being downloaded |
| generating | bool (readonly) | Whether text generation is in progress |
| loadProgress | real (readonly) | Model loading progress (0.0 to 1.0) |
| maxTokens | int | Maximum tokens to generate per response |
| modelId | string | Model to use for inference |
| modelLoading | bool (readonly) | Whether the model is being loaded into memory |
| modelReady | bool (readonly) | Whether the model is loaded and ready for inference |
| noModel | string (readonly) | Special value to cancel download/unload model |
| systemPrompt | string | System prompt for the conversation |
| temperature | real | Sampling temperature (0.0 to 2.0) |
| totalBytes | int (readonly) | Total bytes to download |

Methods

| Method | Returns | Description |
|--------|---------|-------------|
| clear() | void | Clear the conversation |
| send(string message) | void | Send a user message |
| stop() | void | Stop generation |
| unload() | void | Unload the model |

Signals

| Signal | Description |
|--------|-------------|
| downloadCancelled() | Model download was cancelled |
| downloadStarted(int totalBytes) | Model download started |
| error(string message) | Emitted on error |
| modelDownloaded() | Model download finished |
| modelReadySignal() | Model is loaded and ready |
| response(string fullText) | Emitted when a response is complete |
| token(string token) | Emitted for each streamed token |