Hugging Face Profile
The Hugging Face Profile supports both hosted models and custom endpoints, offering the most flexible model configuration in the library. This page describes how to select and configure Hugging Face models for use within the UnO Agentic AI Builder. To ensure seamless interaction within the chat interface and agentic workflows, users must select models specifically optimized for instruction following.
Before you begin
- You must have a valid Hugging Face Credential configured in the Credential Library to authenticate this connection.
- Ensure that all mandatory fields (marked with *) are completed accurately.
Tested Models
While the UnO Agentic AI Builder can connect to various models hosted on Hugging Face, only the following models currently support the Chat functionality required for interactive agents:
- Qwen/Qwen2.5-7B-Instruct
- Qwen/Qwen2.5-1.5B-Instruct
- openai/gpt-oss-20b
The following models have been tested but are not supported for Chat functionality in the current version:
- Incompatible Instruct Models: Qwen/Qwen2.5-3B-Instruct, meta-llama/Llama-3.1-8B-Instruct, and dphn/dolphin-2.9.1-yi-1.5-34b.
- Base Models: Qwen/Qwen3-4B, Qwen/Qwen3-0.6B, and openai-community/gpt2. These models lack the instruction tuning required to follow conversational logic and will not function correctly within the Agentic AI Builder.
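The model lists above can be checked programmatically before a profile is saved. The sketch below is illustrative only: the `chat_support_status` helper is hypothetical (not part of the UnO Agentic AI Builder), but the model ids are taken directly from this page.

```python
# Model ids copied from the Tested Models lists on this page.
SUPPORTED_CHAT_MODELS = {
    "Qwen/Qwen2.5-7B-Instruct",
    "Qwen/Qwen2.5-1.5B-Instruct",
    "openai/gpt-oss-20b",
}

UNSUPPORTED_INSTRUCT_MODELS = {
    "Qwen/Qwen2.5-3B-Instruct",
    "meta-llama/Llama-3.1-8B-Instruct",
    "dphn/dolphin-2.9.1-yi-1.5-34b",
}

BASE_MODELS = {
    "Qwen/Qwen3-4B",
    "Qwen/Qwen3-0.6B",
    "openai-community/gpt2",
}


def chat_support_status(model_id: str) -> str:
    """Return a human-readable Chat support status for a model id."""
    if model_id in SUPPORTED_CHAT_MODELS:
        return "supported"
    if model_id in UNSUPPORTED_INSTRUCT_MODELS:
        return "instruct model, not supported for Chat"
    if model_id in BASE_MODELS:
        return "base model, lacks instruction tuning"
    return "untested"
```

For example, `chat_support_status("openai-community/gpt2")` reports that the model is a base model, so it should not be used for interactive agents.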
| Option | Description |
|---|---|
| Profile name | A unique identifier for this configuration instance. This name will be used to reference this specific Hugging Face model setup in the Agentic AI Builder. |
| Models available | Select a supported model from the dropdown list. |
| Options | Description |
|---|---|
| Other model (overrides mandatory selection above) | Type a specific model name if it does not appear in the standard dropdown list. This overrides the "Models available" selection. |
| Temperature | Controls the randomness of the output (range: 0.0 to 1.0). Lower values make the output more deterministic. |
| Max New Tokens | The maximum number of tokens to generate in the response. |
| Top P | Nucleus sampling: the model considers the smallest set of tokens whose cumulative probability exceeds the threshold top_p. |
| Top K | Samples from the k tokens with the highest probability. |
| Repetition Penalty | Penalizes the model for repeating the same text. A value of 1.0 means no penalty. |
| Do Sample | (Checkbox) Enables sampling mode (checked by default). |
| Streaming | (Checkbox) Enables real-time token streaming for faster perceived response times. |
| Endpoint URL | (Optional) Provide a custom inference endpoint URL if you are using a dedicated Hugging Face Inference Endpoint. |
| Task | Defines the specific task type. The default is usually text-generation. |
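The options in the table above correspond to the standard Hugging Face text-generation parameters. The sketch below shows how they might be collected and validated before being sent to a model; the `build_generation_params` helper and its default values are hypothetical (they are not part of the UnO Agentic AI Builder), but the parameter names match the Hugging Face text-generation API.

```python
def build_generation_params(
    temperature: float = 0.7,
    max_new_tokens: int = 512,
    top_p: float = 0.9,
    top_k: int = 50,
    repetition_penalty: float = 1.0,
    do_sample: bool = True,
) -> dict:
    """Validate and collect generation settings for a profile.

    Raises ValueError if temperature falls outside the documented
    0.0 to 1.0 range.
    """
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("Temperature must be in the range 0.0 to 1.0")
    return {
        "temperature": temperature,
        "max_new_tokens": max_new_tokens,
        "top_p": top_p,
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        "do_sample": do_sample,
    }
```

With the `huggingface_hub` library, a dictionary like this could be passed to `InferenceClient.text_generation()` as keyword arguments, optionally targeting a dedicated Inference Endpoint via the client's endpoint URL.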