Generative AI Module Reference

This documentation provides a reference guide for the Generative AI Module.
The Generative AI Module enables you to perform various operations such as chat, RAG chat, agent-based interactions, retrieving information from a knowledge base, and adding new entries to the knowledge base.

Connection Configurations

LLM Connection

LLM (Large Language Model) connections are used to connect to various LLM providers. These connections are essential for enabling the Generative AI Module to interact with different LLM services.

LLM Provider Connections

The WSO2 MI Generative AI Module supports the following LLM providers, allowing seamless integration with their models.

  • Anthropic

  • Mistral AI

  • OpenAI

  • Azure OpenAI

  • DeepSeek

Connection Configuration Parameters

The connection configuration parameters are used to establish a connection with the LLM provider. The parameters you must provide depend on the provider you are using.

Anthropic

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes
apiKey | Anthropic Key | API key used to authenticate with the Anthropic service. | Yes

Mistral AI

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes
apiKey | MistralAI Key | API key used to authenticate with the Mistral AI service. | Yes

OpenAI

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes
apiKey | OpenAI Key | API key used to authenticate with the OpenAI service. | Yes
baseUrl | Base Url | Base URL of the OpenAI service. | Yes

Azure OpenAI

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes
apiKey | Azure OpenAI Key | API key used to authenticate with the Azure OpenAI service. | Yes
deploymentName | Deployment Name | Name of the deployment in Azure OpenAI. | Yes
endpoint | Endpoint | Endpoint URL of the Azure OpenAI service. | Yes

DeepSeek

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes
apiKey | DeepSeek API Key | API key used to authenticate with the DeepSeek service. | Yes
baseUrl | Base Url | Base URL of the DeepSeek service. | Yes
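In an integration project, a connection created from these parameters is saved as a local entry and referenced by its connection name (for example, LLM_CONN in the operation samples below). The following is a minimal sketch of an OpenAI LLM connection; the ai.init wrapper follows the generic WSO2 connector local-entry convention, and the connectionType value shown is an assumption, so verify both against a connection generated by your MI tooling.

<!-- Minimal sketch of an LLM connection saved as a local entry.
     The ai.init wrapper and the connectionType value are assumptions
     based on the generic WSO2 connector convention; verify against a
     connection generated by your MI tooling. -->
<localEntry key="LLM_CONN" xmlns="http://ws.apache.org/ns/synapse">
    <ai.init>
        <connectionType>OPEN_AI</connectionType>
        <name>LLM_CONN</name>
        <apiKey>REPLACE_WITH_YOUR_OPENAI_API_KEY</apiKey>
        <baseUrl>https://api.openai.com/v1</baseUrl>
    </ai.init>
</localEntry>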

Embedding Model Connection

Embedding model connections are used to connect to various embedding model providers. These connections are essential for enabling the Generative AI Module to interact with different embedding model services.

Embedding Model Provider Connections

The WSO2 MI Generative AI Module supports the OpenAI embedding model provider.

Connection Configuration Parameters

The connection configuration parameters are used to establish a connection with the embedding model provider. The parameters you must provide depend on the provider you are using.

OpenAI

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes
apiKey | OpenAI Key | API key used to authenticate with the OpenAI service. | Yes

Vector Database (Knowledgebase) Connection

Vector database connections are used to connect to various vector database providers. These connections are essential for enabling the Generative AI Module to interact with different vector database services.

Vector Store Connections

The WSO2 MI Generative AI Module supports multiple vector stores.

  • Chroma DB
    Chroma is an open-source vector database designed for AI applications, enabling efficient storage and retrieval of embeddings.

  • Postgres DB
    Postgres DB is a powerful, open-source relational database that can be used as a vector store for managing embeddings.

    Note

    To use Postgres DB as a vector store, ensure the pgvector extension is installed on the machine where the Postgres DB is running.
    After installation, execute the following SQL command in your Postgres database to enable the creation of vector columns:

    CREATE EXTENSION vector;
    
  • Pinecone DB
    Pinecone is a managed vector database service that provides high-performance and scalable storage for embeddings.

  • MI Vector Store
    MI Vector Store is an in-memory vector database offered by WSO2 MI that persists its data in the MI resources.

    Info

    MI Vector Store is designed specifically for testing purposes and is not recommended for production use. Please use other vector stores for production scenarios.

Connection Configuration Parameters

The connection configuration parameters are used to establish a connection with the vector store. The parameters you must provide depend on the store you are using.

Chroma DB

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes
url | Base URL | Base URL of the Chroma service. | Yes
collection | Collection | Name of the collection in Chroma. | Yes

Postgres DB

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes
host | Host | Host name of the Postgres DB. | Yes
port | Port | Port number of the Postgres DB. | Yes
database | Database | Name of the database in Postgres DB. | Yes
user | User | User name to connect to the Postgres DB. | Yes
password | Password | Password to connect to the Postgres DB. | Yes
table | Table | Name of the table in Postgres DB. | Yes
dimension | Dimension of embeddings | Dimension of the embeddings used in the Postgres DB. | Yes
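For illustration, a Postgres vector store connection using the parameters above might be saved as the following local entry. This is a sketch under the same assumptions as the earlier connection example (the ai.init wrapper and connectionType value are not confirmed by this reference); note that the dimension must match the output size of your embedding model, for example 1536 for OpenAI's text-embedding-3-small.

<!-- Hypothetical Postgres vector store connection as a local entry.
     Element names mirror the parameter table above; the ai.init wrapper
     and connectionType value are assumptions. -->
<localEntry key="VECTOR_STORE_CONN" xmlns="http://ws.apache.org/ns/synapse">
    <ai.init>
        <connectionType>POSTGRES</connectionType>
        <name>VECTOR_STORE_CONN</name>
        <host>localhost</host>
        <port>5432</port>
        <database>vectordb</database>
        <user>postgres</user>
        <password>REPLACE_WITH_PASSWORD</password>
        <table>embeddings</table>
        <!-- Must match the embedding model's vector size -->
        <dimension>1536</dimension>
    </ai.init>
</localEntry>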

Pinecone DB

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes
apiKey | API Key | API key used to authenticate with the Pinecone service. | Yes
index | Index | Name of the index in Pinecone. | Yes
namespace | Namespace | Namespace of the index in Pinecone. | Yes
cloud | Cloud | Cloud provider of the Pinecone service. | Yes
region | Region | Region of the Pinecone service. | Yes
dimension | Dimension of embeddings | Dimension of the embeddings used in the Pinecone service. | Yes

MI Vector Store

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes

Chat Memory Connection

Chat memory connections are used to connect to various chat memory providers. Chat memory is essential for enabling the Generative AI Module to manage chat history and context.

Chat Memory Connections

The WSO2 MI Generative AI Module supports multiple chat memory providers.

  • Postgres DB
    Postgres DB is a powerful, open-source relational database that can be used as a chat memory provider for managing chat history.

  • File Memory
    File memory is a simple file-based storage solution for managing chat history. It persists chat data within the MI resources, storing the memory file at <MI_SERVER_HOME>/registry/governance/ai/chat-memory/<CONNECTION_NAME>.json.

    Info

    File memory is designed specifically for testing purposes and is not recommended for production use. Please use other chat memory providers for production scenarios.

Connection Configuration Parameters

The connection configuration parameters are used to establish a connection with the chat memory database. The parameters you must provide depend on the database you are using.

Postgres DB

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes
host | Host | Host name of the Postgres DB. | Yes
port | Port | Port number of the Postgres DB. | Yes
database | Database | Name of the database in Postgres DB. | Yes
user | User | User name to connect to the Postgres DB. | Yes
password | Password | Password to connect to the Postgres DB. | Yes
table | Table | Name of the table in Postgres DB. | Yes

File Memory

Parameter Name | Display Name | Description | Required
name | Connection Name | Name of the connection. | Yes

Operations

chat

The chat operation sends a message to the LLM and receives a response.
Connection Parameters:

Parameter Name | Connection Category | Description | Required
llmConfigKey | LLM Connection | Connection to the LLM provider. Refer to the LLM Connection section for more details. | Yes
memoryConfigKey | Chat Memory Connection | Connection to the chat memory provider. Refer to the Chat Memory Connection section for more details. | No

Operation Parameters:

Parameter Name | Display Name | Description | Required
modelName | Model Name | Name of the model to use for the chat operation. | Yes
sessionId | Session ID | Unique identifier for the chat session. | No
prompt | User Query/Prompt | Message to send to the LLM. | Yes
attachments | Attachments | Attachments to send along with the message to the LLM, in the format [{"type":"application/pdf", "content":"base64 content"}]. Supported types: application/pdf, image/png, image/jpeg, text/plain, text/html, text/csv, text/xml. The content must be base64 encoded. | No
outputType | Output Type | Type of output to return. The supported types are: string, integer, float, boolean. The default is string. | No
maxHistory | Max chat history | Maximum number of chat history entries to send to the LLM. | No
system | System Prompt | System prompt to set the context for the conversation. | No
maxTokens | Max Tokens | Maximum number of tokens to generate in the response. | No
temperature | Temperature | Sampling temperature to use for the response. The default is 0.7. | No
topP | Top P | Top P sampling to use for the response. The default is 1. | No
frequencyPenalty | Frequency Penalty | Frequency penalty to use for the response. The default is 0. | No
seed | Seed | Specifies the random seed value to ensure reproducibility of the response. The default value is 0. | No
responseVariable | Output Variable Name | Name of the variable to store the response. | Yes
overwriteBody | Overwrite Body | Whether to overwrite the message body with the response. The default is false. | No

Sample configuration

<ai.chat>
    <connections>
        <llmConfigKey>LLM_CONN</llmConfigKey>
        <memoryConfigKey>MEMORY_CONN</memoryConfigKey>
    </connections>
    <sessionId>{${payload.sessionId}}</sessionId>
    <system>You are a helpful AI assistant</system>
    <prompt>${payload.content}</prompt>
    <outputType>string</outputType>
    <responseVariable>ai_chat_1</responseVariable>
    <overwriteBody>false</overwriteBody>
    <modelName>gpt-4o</modelName>
    <temperature>0.7</temperature>
    <maxTokens>4069</maxTokens>
    <topP>1</topP>
    <frequencyPenalty>0</frequencyPenalty>
    <maxHistory>10</maxHistory>
</ai.chat>

Sample response

The response received will be stored in the variable ai_chat_1 as a JSON object. The following is a sample response.

{
"content": "WSO2 Micro Integrator is a comprehensive integration solution designed to simplify digital transformation. It facilitates connectivity among applications, services, data, and the cloud through a user-friendly, low-code graphical design experience. The Micro Integrator offers deployment options in both microservices and ESB styles, providing greater flexibility.",
"tokenUsage": {
    "inputTokensDetails": {
    "cachedTokens": 0
    },
    "outputTokensDetails": {
    "reasoningTokens": 0
    },
    "inputTokenCount": 41,
    "outputTokenCount": 459,
    "totalTokenCount": 500
},
"sources": [],
"finishReason": "STOP",
"toolExecutions": []
}
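
Downstream mediators can then read individual fields of this JSON object from the output variable. The sketch below logs the generated text using the ${vars.<name>} Synapse expression syntax; the exact path into the stored JSON is an assumption and may differ across MI versions.

<!-- Sketch: log the generated answer from the chat output variable.
     The ${vars...} path into the stored JSON is an assumption. -->
<log category="INFO">
    <message>LLM answer: ${vars.ai_chat_1.content}</message>
</log>
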
ragChat

The ragChat operation sends a message to the LLM and retrieves a response, leveraging a Retrieval-Augmented Generation (RAG) approach.
Connection Parameters:

Parameter Name | Connection Category | Description | Required
embeddingConfigKey | Embedding Model Connection | Connection to the embedding model provider. Refer to the Embedding Model Connection section for more details. | Yes
vectorStoreConfigKey | Vector Store Connection | Connection to the vector store provider. Refer to the Vector Store Connection section for more details. | Yes
llmConfigKey | LLM Connection | Connection to the LLM provider. Refer to the LLM Connection section for more details. | Yes
memoryConfigKey | Memory Connection | Connection to the chat memory provider. Refer to the Chat Memory Connection section for more details. | No

Operation Parameters:

Parameter Name | Display Name | Description | Required
sessionId | Session ID | Unique identifier for the chat session. | No
prompt | User Query/Prompt | Message to send to the LLM. | Yes
embeddingModel | Embedding Model | Name of the embedding model to use for the RAG operation. | Yes
maxResults | Max Results | Maximum number of results to retrieve from the vector store. | No
minScore | Min Score | Minimum score threshold for the results retrieved from the vector store. | No
modelName | Model Name | Name of the model to use for the chat operation. | Yes
outputType | Output Type | Type of output to return. The supported types are: string, integer, float, boolean. The default is string. | No
maxHistory | Max chat history | Maximum number of chat history entries to send to the LLM. | No
system | System Prompt | System prompt to set the context for the conversation. | No
maxTokens | Max Tokens | Maximum number of tokens to generate in the response. | No
temperature | Temperature | Sampling temperature to use for the response. The default is 0.7. | No
topP | Top P | Top P sampling to use for the response. The default is 1. | No
frequencyPenalty | Frequency Penalty | Frequency penalty to use for the response. The default is 0. | No
seed | Seed | Specifies the random seed value to ensure reproducibility of the response. The default value is 0. | No
responseVariable | Output Variable Name | Name of the variable to store the response. | Yes
overwriteBody | Overwrite Body | Whether to overwrite the message body with the response. The default is false. | No

Sample configuration

<ai.ragChat>
    <connections>
        <llmConfigKey>LLM_CONN</llmConfigKey>
        <memoryConfigKey>MEMORY_CONN</memoryConfigKey>
        <embeddingConfigKey>EMBEDDING_CONN</embeddingConfigKey>
        <vectorStoreConfigKey>VECTOR_STORE_CONN</vectorStoreConfigKey>
    </connections>
    <sessionId>{${payload.sessionId}}</sessionId>
    <prompt>${payload.content}</prompt>
    <outputType>string</outputType>
    <responseVariable>ai_ragChat_1</responseVariable>
    <overwriteBody>false</overwriteBody>
    <embeddingModel>text-embedding-3-small</embeddingModel>
    <maxResults>5</maxResults>
    <minScore>0.75</minScore>
    <modelName>gpt-4o</modelName>
    <temperature>0.7</temperature>
    <maxTokens>4069</maxTokens>
    <topP>1</topP>
    <frequencyPenalty>0</frequencyPenalty>
    <seed>0</seed>
    <system></system>
    <maxHistory>10</maxHistory>
</ai.ragChat>

Sample response

The response received will be stored in the variable ai_ragChat_1 as a JSON object. The following is a sample response.

{
"content": "WSO2 Micro Integrator is a comprehensive integration solution designed to simplify digital transformation. It facilitates connectivity among applications, services, data, and the cloud through a user-friendly, low-code graphical design experience. The Micro Integrator offers deployment options in both microservices and ESB styles, providing greater flexibility.",
"tokenUsage": {
    "inputTokensDetails": {
    "cachedTokens": 0
    },
    "outputTokensDetails": {
    "reasoningTokens": 0
    },
    "inputTokenCount": 99,
    "outputTokenCount": 62,
    "totalTokenCount": 161
},
"sources": [
    {
    "textSegment": {
        "text": "WSO2 Micro Integrator is a comprehensive integration solution that simplifies your digital transformation journey. The Micro Integrator streamlines connectivity among applications, services, data, and the cloud using a user-friendly, low-code graphical design experience. Deployment options include both microservices and ESB styles for greater flexibility.",
        "metadata": {
        "index": "0"
        }
    },
    "metadata": {}
    }
],
"finishReason": "STOP",
"toolExecutions": []
}

agent

The agent operation sends a message to an AI agent, which uses the configured LLM together with its assigned role, instructions, and any available tools to generate a response.
Connection Parameters:

Parameter Name | Connection Category | Description | Required
llmConfigKey | LLM Connection | Connection to the LLM provider. Refer to the LLM Connection section for more details. | Yes
memoryConfigKey | Chat Memory Connection | Connection to the chat memory provider. Refer to the Chat Memory Connection section for more details. | No

Operation Parameters:

Parameter Name | Display Name | Description | Required
sessionId | Session ID | Unique identifier for the chat session. | No
modelName | Model Name | Name of the model to use for the chat operation. | Yes
role | Role | The specific function or responsibility assigned to the agent, defining its purpose and scope of operation. | No
instructions | Instructions | Specific instructions or guidelines for the agent to follow during its operation. | No
prompt | User Query/Prompt | Message to send to the LLM. | Yes
attachments | Attachments | Attachments to send along with the message to the LLM, in the format [{"type":"application/pdf", "content":"base64 content"}]. Supported types: application/pdf, image/png, image/jpeg, text/plain, text/html, text/csv, text/xml. The content must be base64 encoded. | No
outputType | Output Type | Type of output to return. The supported types are: string, integer, float, boolean. The default is string. | No
maxHistory | Max chat history | Maximum number of chat history entries to send to the LLM. | No
system | System Prompt | System prompt to set the context for the conversation. | No
maxTokens | Max Tokens | Maximum number of tokens to generate in the response. | No
temperature | Temperature | Sampling temperature to use for the response. The default is 0.7. | No
topP | Top P | Top P sampling to use for the response. The default is 1. | No
frequencyPenalty | Frequency Penalty | Frequency penalty to use for the response. The default is 0. | No
seed | Seed | Specifies the random seed value to ensure reproducibility of the response. The default value is 0. | No
responseVariable | Output Variable Name | Name of the variable to store the response. | Yes
overwriteBody | Overwrite Body | Whether to overwrite the message body with the response. The default is false. | No

Sample configuration

<ai.agent>
    <connections>
        <llmConfigKey>LLM_CONN</llmConfigKey>
        <memoryConfigKey>MEMORY_CONN</memoryConfigKey>
    </connections>
    <sessionId>{${payload.sessionId}}</sessionId>
    <role>Customer Assistance Agent</role>
    <instructions>Assist customers by providing accurate and helpful responses to their queries, ensuring a positive user experience.</instructions>
    <prompt>${payload.content}</prompt>
    <outputType>string</outputType>
    <responseVariable>ai_agent_1</responseVariable>
    <overwriteBody>false</overwriteBody>
    <modelName>gpt-4o</modelName>
    <temperature>0.7</temperature>
    <maxTokens>4069</maxTokens>
    <topP>1</topP>
    <frequencyPenalty>0</frequencyPenalty>
    <maxHistory>10</maxHistory>
</ai.agent>

Sample response

The response received will be stored in the variable ai_agent_1 as a JSON object. The following is a sample response.

{
"content": "Hello John Doe! I can help you explore various investment options offered by PineValley Bank. Since your investment goal is long-term growth, let's consider some of the products that might align with your objectives:\n\n1. **Qtrade Guided Portfolios**: These are professionally managed portfolios designed to help achieve long-term growth with a balanced risk approach. They might be suitable for you if you prefer a hands-off approach but still want expert management.\n\n2. **TFSAs (Tax-Free Savings Accounts)**: A TFSA allows your investments to grow tax-free, which is beneficial for long-term growth. You can invest in various assets within a TFSA, including stocks and ETFs.\n\n3. **RRSPs (Registered Retirement Savings Plans)**: An RRSP provides tax-deferred growth, which can be advantageous for long-term retirement planning. Contributions are tax-deductible, and the funds grow without being taxed until withdrawal.\n\n4. **Direct Investing (Stocks, ETFs)**: If you prefer more control over your investments, you can consider direct investing in stocks and ETFs. This approach requires more involvement but can be rewarding if you want to actively manage your portfolio.\n\n5. **GICs (Guaranteed Investment Certificates)**: While generally more conservative, GICs offer guaranteed returns, making them a stable component of a diversified portfolio.\n\nSince I couldn't retrieve detailed brochures or documents this time, I recommend considering these options and contacting PineValley Bank for specific product brochures and more detailed guidance. If you have any questions about these products or need help with the onboarding process, feel free to ask!",
"tokenUsage": {
    "inputTokenCount": 1995,
    "outputTokenCount": 374,
    "totalTokenCount": 2369
},
"finishReason": "STOP",
"toolExecutions": [
    {
    "request": {
        "id": "call_4MZwbJiAOJPUuuJFpcKSXneV",
        "name": "http_post_tool_0",
        "arguments": "{\"requestBodyJson\":\"{\\\"userID\\\": \\\"C567\\\"}\"}"
    },
    "result": "{\"userID\":\"C567\",\"name\":\"John Doe\",\"age\":30,\"investmentGoal\":\"Long-term growth\"}"
    },
    {
    "request": {
        "id": "call_XBFZADsSZsSzl3UBArT60H1O",
        "name": "ai_getFromKnowledge_tool_0",
        "arguments": "{\"input\":\"investment products offered by PineValley Bank\"}"
    },
    "result": "[]"
    }
]
}

addToKnowledge

The addToKnowledge operation adds new entries to the knowledge base. The input can optionally be parsed, split into smaller segments, and embedded before it is stored in the vector store.
Connection Parameters:

Parameter Name | Connection Category | Description | Required
embeddingConfigKey | Embedding Model Connection | Connection to the embedding model provider. Refer to the Embedding Model Connection section for more details. | Yes
vectorStoreConfigKey | Vector Store Connection | Connection to the vector store provider. Refer to the Vector Store Connection section for more details. | Yes

Operation Parameters:

Parameter Name | Display Name | Description | Required
input | Input | Input to be added to the knowledge base. | Yes
needParse | Parse | Whether to parse the input. | No
parseType | Type | Type of parsing to be done. The supported types are: pdf-to-text, markdown-to-text, html-to-text, doc-to-text, docx-to-text, xls-to-text, xlsx-to-text, ppt-to-text, pptx-to-text. The default is pdf-to-text. | No
needSplit | Split | Whether to split the input into smaller chunks. | No
splitStrategy | Strategy | Strategy to be used for splitting the input. The supported strategies are: Recursive, ByParagraph, BySentence. The default is Recursive. | No
maxSegmentSize | Max Segment Size | Maximum size of each segment after splitting. The default is 1000. | No
maxOverlapSize | Max Overlap Size | Maximum overlap size between segments. The default is 200. | No
needEmbedding | Embedding | Whether to generate embeddings for the input. | No
embeddingModel | Embedding Model | Name of the embedding model to use for generating embeddings. | Yes
responseVariable | Output Variable Name | Name of the variable to store the response. | Yes
overwriteBody | Overwrite Body | Whether to overwrite the message body with the response. The default is false. | No

Sample configuration

<ai.addToKnowledge>
    <connections>
        <embeddingConfigKey>EMBEDDING_CONN</embeddingConfigKey>
        <vectorStoreConfigKey>VECTOR_STORE_CONN</vectorStoreConfigKey>
    </connections>
    <input>{${payload.input}}</input>
    <needParse>true</needParse>
    <parseType>pdf-to-text</parseType>
    <needSplit>true</needSplit>
    <splitStrategy>Recursive</splitStrategy>
    <maxSegmentSize>1000</maxSegmentSize>
    <maxOverlapSize>200</maxOverlapSize>
    <needEmbedding>true</needEmbedding>
    <embeddingModel>text-embedding-3-small</embeddingModel>
    <responseVariable>ai_addToKnowledge_1</responseVariable>
    <overwriteBody>false</overwriteBody>
</ai.addToKnowledge>

Sample response

The response received will be stored in the variable ai_addToKnowledge_1 as a JSON object. The following is a sample response.

{
"success": true
}
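
A mediation flow can branch on this flag before continuing. The following sketch uses the Filter mediator; the expression path into the output variable is an assumption, consistent with the earlier examples.

<!-- Sketch: continue only if the document was indexed successfully.
     The ${vars...} expression path is an assumption. -->
<filter source="${vars.ai_addToKnowledge_1.success}" regex="true">
    <then>
        <log category="INFO">
            <message>Document added to the knowledge base</message>
        </log>
    </then>
    <else>
        <log category="ERROR">
            <message>Failed to add the document to the knowledge base</message>
        </log>
    </else>
</filter>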

getFromKnowledge

The getFromKnowledge operation retrieves the most relevant entries from the knowledge base by performing a similarity search against the vector store.
Connection Parameters:

Parameter Name | Connection Category | Description | Required
embeddingConfigKey | Embedding Model Connection | Connection to the embedding model provider. Refer to the Embedding Model Connection section for more details. | Yes
vectorStoreConfigKey | Vector Store Connection | Connection to the vector store provider. Refer to the Vector Store Connection section for more details. | Yes

Operation Parameters:

Parameter Name | Display Name | Description | Required
input | Input | Input to be searched from the knowledge base. | Yes
needEmbedding | Embed Input | Specifies whether to generate embeddings for the input. If the embedding vector is already provided by the user, this can be disabled. However, if the input is a raw string, embeddings must be generated. | No
embeddingModel | Embedding Model | Name of the embedding model to use for generating the query embedding. | Yes
maxResults | Max Results | Maximum number of results to retrieve from the vector store. | No
minScore | Min Score | Minimum score threshold for the results retrieved from the vector store. | No
responseVariable | Output Variable Name | Name of the variable to store the response. | Yes
overwriteBody | Overwrite Body | Whether to overwrite the message body with the response. The default is false. | No

Sample configuration

<ai.getFromKnowledge>
    <connections>
        <embeddingConfigKey>EMBEDDING_CONN</embeddingConfigKey>
        <vectorStoreConfigKey>VECTOR_STORE_CONN</vectorStoreConfigKey>
    </connections>
    <input>{${payload.input}}</input>
    <needEmbedding>true</needEmbedding>
    <embeddingModel>text-embedding-3-small</embeddingModel>
    <maxResults>5</maxResults>
    <minScore>0.75</minScore>
    <responseVariable>ai_getFromKnowledge_1</responseVariable>
    <overwriteBody>false</overwriteBody>
</ai.getFromKnowledge>

Sample response

The response received will be stored in the variable ai_getFromKnowledge_1 as a JSON array. The following is a sample response.

[
{
    "score": 0.9066269765264976,
    "embeddingId": "1fd2ccb4-4317-4c3e-a598-1cbcce4ae5ab",
    "embedding": [
        ...
    ],
    "embedded": {
    "text": "WSO2 Micro Integrator is a comprehensive integration solution that simplifies your digital transformation journey. The Micro Integrator streamlines connectivity among applications, services, data, and the cloud using a user-friendly, low-code graphical design experience. Deployment options include both microservices and ESB styles for greater flexibility.",
    "metadata": {
        "index": "0"
    }
    }
}
]
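
Because the results are returned as an array, a downstream mediator can pick out the first match (typically the highest-scoring one). The sketch below logs its text; array indexing inside a ${vars...} expression is an assumption, so verify it on your MI version.

<!-- Sketch: log the text of the first knowledge base match.
     Array indexing inside ${vars...} is an assumption. -->
<log category="INFO">
    <message>Top match: ${vars.ai_getFromKnowledge_1[0].embedded.text}</message>
</log>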

Click on the Go to Tutorial button below to learn how to build your first AI integration using the above operations. The tutorial will guide you through the process of creating a simple integration that utilizes the AI capabilities of WSO2 Micro Integrator.