C++ backend for local LLM inference using llama.cpp.
Import Statement: import Clayground.Ai
LlmEngineBackend loads GGUF models and runs text generation on a background thread. It is typically used internally by the TextInference QML component.
See also TextInference.
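For illustration, a minimal sketch of driving the backend directly from QML. The element, property, and signal names follow the reference below; the model path and prompt text are placeholders:

```qml
import QtQuick
import Clayground.Ai

LlmEngineBackend {
    id: llm
    modelPath: "/path/to/model.gguf"   // setting this starts model loading
    systemPrompt: "You are a helpful assistant."
    temperature: 0.7
    maxTokens: 256

    // Once the model is ready, kick off a first generation.
    onModelReadyChanged: if (modelReady) send("Hello!")
    onResponse: (fullText) => console.log("Response:", fullText)
    onError: (message) => console.warn("LLM error:", message)
}
```

In typical applications the TextInference component wraps this setup, so instantiating LlmEngineBackend directly is only needed for custom integrations.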
currentResponse : string
The response text accumulated so far during generation.
generating : bool
Whether text generation is in progress.
loadProgress : real
Model loading progress (0.0 to 1.0).
maxTokens : int
Maximum number of tokens to generate per response. Defaults to 256.
modelLoading : bool
Whether the model is currently being loaded.
modelPath : string
Path to the GGUF model file.
Setting this property triggers model loading; set it to an empty string to unload the model.
modelReady : bool
Whether the model is loaded and ready for inference.
systemPrompt : string
The system prompt prepended to every conversation.
temperature : real
Sampling temperature (0.0 to 2.0). Lower values produce more deterministic output. Defaults to 0.7.
error(string message)
Emitted when an error occurs during model loading or generation.
Note: The corresponding handler is onError.
response(string fullText)
Emitted when generation completes, with the full response text.
Note: The corresponding handler is onResponse.
token(string token)
Emitted for each generated token as it is produced (streaming).
Note: The corresponding handler is onToken.
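The token signal enables incremental display of the response as it is generated, while response delivers the final text once generation finishes. A sketch of a streaming handler (the Text element and its id are illustrative):

```qml
LlmEngineBackend {
    id: engine
    // Append each token to the UI as it streams in.
    onToken: (token) => output.text += token
    // Replace the streamed text with the final response, in case
    // the accumulated tokens and the full text differ.
    onResponse: (fullText) => output.text = fullText
    onError: (message) => console.warn("Generation failed:", message)
}

Text { id: output; wrapMode: Text.Wrap }
```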
void clear()
Clears the conversation history.
void send(string message)
Sends a message and starts generating a response.
void stop()
Stops the current generation.
void unload()
Unloads the model from memory.
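A sketch of wiring these methods to a simple UI, using the generating property to toggle between sending and stopping (the TextField and Button ids are illustrative):

```qml
import QtQuick
import QtQuick.Controls

Row {
    TextField { id: input; placeholderText: "Ask something..." }
    Button {
        // While a response is being generated, the button stops it;
        // otherwise it sends the typed message.
        text: engine.generating ? "Stop" : "Send"
        onClicked: engine.generating ? engine.stop()
                                     : engine.send(input.text)
    }
    Button {
        text: "Reset"
        // Clear the conversation history and free the model's memory.
        onClicked: { engine.clear(); engine.unload() }
    }
}
```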