In the ServisBOT CLI, llm-config is used to manage LLM configurations. These configurations can be used with the LLM Prompt V2 node in flows.
A number of commands are available for managing LLM configurations:
sb-cli llm-config list - Lists the existing LLM configurations in the organization.
sb-cli llm-config set - Sets the provided LLM configuration, creating it if it does not exist.
sb-cli llm-config describe - Describes the provided LLM configuration.
sb-cli llm-config delete - Deletes the provided LLM configuration.
sb-cli llm-config execute-prompt - Executes the referenced LLM configuration with the provided prompt, context, and redaction keys.
Currently the only supported providers are OpenAI, Azure, and Amazon Bedrock. A config is created under an alias, which defines which provider and model are being used. It also contains what is expected in the request, in the llmOptions object. To find out what each model expects in the request and what is returned in the response, please refer to the documentation of the specific provider and model.
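A sketch of a typical workflow is shown below. How arguments are passed to each subcommand (positional or flag) is an assumption, so check the CLI help for the authoritative syntax:
# List all LLM configurations in the organization
sb-cli llm-config list

# Describe or delete a configuration; passing the alias as a
# positional argument is an assumption - check the CLI help
sb-cli llm-config describe openai-alias
sb-cli llm-config delete openai-alias

# Create or update a configuration from a JSON document such as
# the provider examples below (the input mechanism is an assumption)
sb-cli llm-config set openai-config.json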
Required configuration fields by provider:
Parameter | OpenAI | Azure | Bedrock | Notes
---|---|---|---|---
organization | ✅ | ✅ | ✅ | Identifier for the organization using the model
alias | ✅ | ✅ | ✅ | Alias for referencing the configuration
provider | ✅ | ✅ | ✅ | One of: Open_AI, Azure, Bedrock
config | ✅ | ✅ | ✅ | Object containing provider-specific fields
config.secretSRN | ✅ | ✅ | | Secret reference for API keys / tokens
config.endpoint | | ✅ | | Endpoint to use; required only for Azure OpenAI deployments
config.model | ✅ | ✅ | ✅ | Model name or ID, depending on the provider
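As a minimal sketch, a configuration combining the required fields might look like the following (placeholder values; the full provider examples below show realistic llmOptions):
{
  "organization": "your-org",
  "alias": "my-llm-config",
  "provider": "Open_AI",
  "config": {
    "model": "gpt-4o-mini",
    "secretSRN": "srn:vault::your-org:secret:openai"
  }
}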
See the provider examples below for more details on each provider.
When executing a prompt, the context object is passed in the request. This object can contain arbitrary attributes, but only certain value types, such as strings and arrays of objects, are processed for redaction.
If the context object contains an array of objects, the prompt must be formatted using Mustache templating, with a section tag that iterates over the array. Example prompt format:
"prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}"
Example context with object array (messages):
"context": {
"messages": [
{
"actor": "agent:::Lisa",
"message": "Hello, how can I help you today?"
},
{
"actor": "user",
"message": "My name is John Doe and my social security number is XXX-XX-XXXX. I need to check my account balance."
},
{
"actor": "agent:::Lisa",
"message": "I can help you with that. Your current balance is $1,500."
},
{
"actor": "user",
"message": "Great, thank you. That is all I needed."
}
]
}
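For illustration, Mustache expands the {{#messages}}…{{/messages}} section once per element of the messages array, while {{{randomContext}}} is an unescaped interpolation of a randomContext string attribute (not present in the example context, so here it renders as empty). Since the template contains no newlines, the prompt above renders as a single line:
 Please summarize this conversation:-agent:::Lisa: Hello, how can I help you today?-user: My name is John Doe and my social security number is XXX-XX-XXXX. I need to check my account balance.-agent:::Lisa: I can help you with that. Your current balance is $1,500.-user: Great, thank you. That is all I needed.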
If any attributes within the context object are specified in the redact array in the request, they will be redacted; attributes not listed in redact are left unchanged.
Example redact array in request:
"redact": ["messages"]
Redacted output:
"context": {
"messages": [
{
"actor": "agent:::Lisa",
"message": "Hello, how can I help you today?"
},
{
"actor": "user",
"message": "My name is [REDACTED_NAME] and my social security number is [REDACTED_SSN]. I need to check my account balance."
},
{
"actor": "agent:::Lisa",
"message": "I can help you with that. Your current balance is $1,500."
},
{
"actor": "user",
"message": "Great, thank you. That is all I needed."
}
]
}
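Putting the pieces together, the payload for execute-prompt combines prompt, context, and redact. The following is a minimal sketch, assuming the three fields are passed together as one JSON document (the exact envelope the CLI expects is not shown above, and the randomContext value is purely illustrative):
{
  "prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}",
  "context": {
    "randomContext": "The user is a retail banking customer.",
    "messages": [
      { "actor": "user", "message": "My name is John Doe. I need to check my balance." }
    ]
  },
  "redact": ["messages"]
}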
Secrets must be stored using Token Authentication, with the API key set as the Bearer token. All other configuration details required to communicate with the LLM provider are stored in the config object.
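Example OpenAI configuration: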
{
"organization": "your-org",
"alias": "openai-alias",
"provider": "Open_AI",
"config": {
"backOffMs": 248,
"llmOptions": {
"frequency_penalty": 0,
"max_tokens": 400,
"presence_penalty": 0,
"seed": 12345,
"stop": null,
"temperature": 0
},
"maxRetries": 3,
"model": "gpt-4o-mini",
"prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}",
"secretSRN": "srn:vault::your-org:secret:openai"
}
}
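Example Azure OpenAI configuration; note the additional endpoint field: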
{
"organization": "some-org",
"alias": "some-alias",
"provider": "Azure",
"config": {
"backOffMs": 248,
"endpoint": "some-endpoint",
"llmOptions": {
"frequency_penalty": 0,
"max_tokens": 400,
"presence_penalty": 0,
"seed": 12345,
"stop": null,
"temperature": 0
},
"maxRetries": 3,
"model": "gpt-35-turbo-default",
"prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}",
"secretSRN": "srn:vault::some-org:secret:azure"
}
}
Note that not all Bedrock models are available in every AWS Region. Some models are reached through a cross-Region inference profile (a model ID prefixed with a Region grouping such as eu.), in which case the Lambda policy must allow both the inference profile in the source Region and the foundation model in each destination Region. For example, for Meta's Llama 3.2 1B Instruct model, where the source Region is eu-west-1 and the destination Regions also include eu-central-1 and eu-west-3, the ARNs that need to be added to the Lambda policy are as follows:
- !Sub arn:aws:bedrock:${AWS::Region}:${AWS::AccountId}:inference-profile/*
- arn:aws:bedrock:eu-central-1::foundation-model/meta.llama3*
- arn:aws:bedrock:eu-west-3::foundation-model/meta.llama3*
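Example Amazon Bedrock configuration (Amazon Titan Text Lite). Bedrock authenticates through the caller's IAM role rather than an API key, so no secretSRN is set: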
{
"organization": "some-org",
"alias": "titanLite",
"config": {
"backOffMs": 248,
"llmOptions": {
"textGenerationConfig": {
"temperature": 0.5,
"topP": 0.5,
"maxTokenCount": 200,
"stopSequences": []
}
},
"maxRetries": 3,
"model": "amazon.titan-text-lite-v1",
"prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}"
},
"provider": "Bedrock"
}
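Example Amazon Bedrock configuration (Meta Llama 3.2 1B Instruct, invoked through the eu. cross-Region inference profile):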
{
"organization": "some-org",
"alias": "llama3_2instruct",
"config": {
"backOffMs": 248,
"llmOptions": {
"max_gen_len": 200,
"temperature": 0.5,
"top_p": 0.5
},
"maxRetries": 3,
"model": "eu.meta.llama3-2-1b-instruct-v1:0",
"prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}"
},
"provider": "Bedrock"
}