LLM Configs

In the CLI

In the ServisBOT CLI, the llm-config command is used to manage LLM configurations. These configurations can be used with the LLM Prompt V2 node in flows.

Commands

A number of commands are available for managing LLM configurations:

  • sb-cli llm-config list - Lists the existing LLM configurations in the organization.
  • sb-cli llm-config set - Sets the provided LLM configuration, creating it if it does not exist.
  • sb-cli llm-config describe - Describes the provided LLM configuration.
  • sb-cli llm-config delete - Deletes the provided LLM configuration.
  • sb-cli llm-config execute-prompt - Executes the referenced LLM configuration with the provided prompt, context, and redaction keys.

Providers

The supported providers are currently OpenAI, Azure, and Amazon Bedrock. A config is created under an alias, which defines the provider and model being used. The config also specifies, in its llmOptions object, what is expected in the request. To find out what each model expects in the request and what is returned in the response, refer to the documentation for the specific provider and model.

Configuration

Required configuration fields by provider:

| Parameter | OpenAI | Azure | Bedrock | Notes |
| --- | --- | --- | --- | --- |
| organization | ✅ | ✅ | ✅ | Identifier for the organization using the model |
| alias | ✅ | ✅ | ✅ | Alias for referencing the configuration |
| provider | ✅ | ✅ | ✅ | One of: Open_AI, Azure, Bedrock |
| config | ✅ | ✅ | ✅ | Object containing provider-specific fields |
| config.secretSRN | ✅ | ✅ | | Secret reference for API keys / tokens |
| config.endpoint | | ✅ | | Endpoint to use; required only for Azure OpenAI deployments |
| config.model | ✅ | ✅ | ✅ | Model name or ID, depending on the provider |

See the provider examples below for more details on each provider.
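
As a hedged orientation sketch, a configuration containing only the required fields from the table might look like the following (all values, including the alias, are placeholders for an OpenAI-style setup); the full provider examples below add optional fields such as backOffMs, maxRetries, llmOptions, and prompt.

{
 "organization": "your-org",
 "alias": "my-llm-config",
 "provider": "Open_AI",
 "config": {
  "model": "gpt-4o-mini",
  "secretSRN": "srn:vault::your-org:secret:openai"
 }
}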

Context Object and Redaction

When executing a prompt, the context object is passed in the request. This object can contain arbitrary attributes, but only the following types will be processed for redaction:

  • An array of objects
  • An object
  • A string
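
For instance, alongside an object array, a context object can carry a plain string attribute and a nested object; the attribute names below are illustrative only:

"context": {
  "randomContext": "The customer is a premium account holder.",
  "customer": {
    "name": "John Doe",
    "plan": "premium"
  }
}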

Handling Object Arrays

If the context object contains an array of objects, the prompt must iterate over the array using a Mustache section, as in the following example.

Example prompt format:

"prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}"

Example context with object array (messages):

"context": {
  "messages": [
    {
      "actor": "agent:::Lisa",
      "message": "Hello, how can I help you today?"
    },
    {
      "actor": "user",
      "message": "My name is John Doe and my social security number is XXX-XX-XXXX. I need to check my account balance."
    },
    {
      "actor": "agent:::Lisa",
      "message": "I can help you with that. Your current balance is $1,500."
    },
    {
      "actor": "user",
      "message": "Great, thank you. That is all I needed."
    }
  ]
}
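
With the example prompt above, Mustache expands the {{#messages}} section once per element of the array. Assuming randomContext is not set in this context (so the {{{randomContext}}} placeholder renders empty), the text sent to the model would look roughly like this:

Please summarize this conversation:-agent:::Lisa: Hello, how can I help you today?-user: My name is John Doe and my social security number is XXX-XX-XXXX. I need to check my account balance.-agent:::Lisa: I can help you with that. Your current balance is $1,500.-user: Great, thank you. That is all I needed.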

Redaction Behavior

If any attributes within the context object are specified in the redact array in the request, their values will be redacted. Attributes not listed in redact are left unchanged.

Example redact array in request:

"redact": ["messages"]

Redacted output:

"context": {
  "messages": [
    {
      "actor": "agent:::Lisa",
      "message": "Hello, how can I help you today?"
    },
    {
      "actor": "user",
      "message": "My name is [REDACTED_NAME] and my social security number is [REDACTED_SSN]. I need to check my account balance."
    },
    {
      "actor": "agent:::Lisa",
      "message": "I can help you with that. Your current balance is $1,500."
    },
    {
      "actor": "user",
      "message": "Great, thank you. That is all I needed."
    }
  ]
}
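
Putting these pieces together, a hedged sketch of the request body for execute-prompt combines the prompt, context, and redact fields shown in this section (any additional fields the command may require, such as the alias of the configuration to run, are not shown here):

{
  "prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}",
  "context": {
    "messages": [
      {
        "actor": "user",
        "message": "My name is John Doe and my social security number is XXX-XX-XXXX. I need to check my account balance."
      }
    ]
  },
  "redact": ["messages"]
}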

Secrets and Authentication

Secrets must be stored using Token Authentication, with the API key set as the Bearer token.

All other configuration details required to communicate with the LLM provider are stored in the config object.

Provider Examples

OpenAI Example

{
 "organization": "your-org",
 "alias": "openai-alias",
 "provider": "Open_AI",
 "config": {
  "backOffMs": 248,
  "llmOptions": {
   "frequency_penalty": 0,
   "max_tokens": 400,
   "presence_penalty": 0,
   "seed": 12345,
   "stop": null,
   "temperature": 0
  },
  "maxRetries": 3,
  "model": "gpt-4o-mini",
  "prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}",
  "secretSRN": "srn:vault::your-org:secret:openai"
 }
}

Azure Example

{
 "organization": "some-org",
 "alias": "some-alias",
 "provider": "Azure",
 "config": {
  "backOffMs": 248,
  "endpoint": "some-endpoint",
  "llmOptions": {
   "frequency_penalty": 0,
   "max_tokens": 400,
   "presence_penalty": 0,
   "seed": 12345,
   "stop": null,
   "temperature": 0
  },
  "maxRetries": 3,
  "model": "gpt-35-turbo-default",
  "prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}",
  "secretSRN": "srn:vault::some-org:secret:azure"
 }
}

Amazon Bedrock Examples

Note that not all Bedrock models are available in every AWS Region. To use a specific model:

  1. Confirm its availability in the AWS Region where the account is located (see Models Support).
  2. Provision the model in the account: some models may need to be explicitly provisioned within an AWS account before they can be used.
  3. If the model is only available in an AWS Region through an inference profile, such as Meta’s Llama 3.2 in eu-west-1, then the following steps need to be taken:
  • Use the inference profile ID instead of the model ID in the config object (see Inference Profiles).
  • Add the ARNs for the inference profile to the Lambda policy for the invoke model command.

For example, for Meta’s Llama 3.2 1B Instruct model, where the source Region is eu-west-1 and the destination Regions also include eu-central-1 and eu-west-3, the ARNs that need to be added to the Lambda policy are as follows:

- !Sub arn:aws:bedrock:${AWS::Region}:${AWS::AccountId}:inference-profile/*
- arn:aws:bedrock:eu-central-1::foundation-model/meta.llama3*
- arn:aws:bedrock:eu-west-3::foundation-model/meta.llama3*
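
These ARNs belong in the Resource list of the policy statement that allows the invoke model call. A minimal sketch of such an IAM statement, assuming a placeholder account ID 123456789012 and the eu-west-1 Region in place of the !Sub variables above, is:

{
 "Effect": "Allow",
 "Action": ["bedrock:InvokeModel"],
 "Resource": [
  "arn:aws:bedrock:eu-west-1:123456789012:inference-profile/*",
  "arn:aws:bedrock:eu-central-1::foundation-model/meta.llama3*",
  "arn:aws:bedrock:eu-west-3::foundation-model/meta.llama3*"
 ]
}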

Sample Config for provider Bedrock and Amazon Titan Text models

{
 "organization": "some-org",
 "alias": "titanLite",
 "config": {
  "backOffMs": 248,
  "llmOptions": {
   "textGenerationConfig": {
    "temperature": 0.5,
    "topP": 0.5,
    "maxTokenCount": 200,
    "stopSequences": []
   }
  },
  "maxRetries": 3,
  "model": "amazon.titan-text-lite-v1",
  "prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}"
 },
 "provider": "Bedrock"
}
Sample Config for provider Bedrock and Meta’s Llama 3.2 1B Instruct v1.0 model

{
 "organization": "some-org",
 "alias": "llama3_2instruct",
 "config": {
  "backOffMs": 248,
  "llmOptions": {
   "max_gen_len": 200,
   "temperature": 0.5,
   "top_p": 0.5
  },
  "maxRetries": 3,
  "model": "eu.meta.llama3-2-1b-instruct-v1:0",
  "prompt": "{{{randomContext}}} Please summarize this conversation:{{#messages}}-{{actor}}: {{message}}{{/messages}}"
 },
 "provider": "Bedrock"
}