Custom Metadata Filtering

By default, Caddie uses the Agent Access Controls to grant an Agent access to the files within a particular folder. As a result, by granting a user access to an Agent, you grant them the ability to ask questions about any document the Agent has access to.

However, you might have access controls in place to restrict which documents certain users can access. Proper access controls can be complicated and tedious to implement, but they are essential for data privacy and user trust. To keep your existing access controls consistently applied, Caddie supports applying them directly within an AI chat session.

If you want to restrict which files can be used to answer a specific user's AI chat request, use Custom Metadata Filtering. It lets you configure an HTTP endpoint that Caddie sends requests to. This endpoint should return properly formatted document filters (see Requirements for your access control endpoint), and Caddie applies those filters at chat time when the Agent searches for relevant documents, ensuring that your access controls are respected.

Making chat requests using Custom Metadata Filtering

To apply your own access controls to the Agent's document search, include in the chat request whatever fields or data your endpoint needs to determine which documents should be available in the chat.

There are two general approaches to applying Custom Metadata Filtering.

First, you can provide user-specific data in the metadata field of your chat request, and use this data to generate a list of document names or IDs to filter on.
Second, you can create custom file metadata when adding files, and use these data fields to apply filters.

Let's explore these two approaches with a hypothetical example. Say you are making a chat request on behalf of the user John Doe. You have uploaded two documents: "GeneralTestDoc.pdf" and "TeamTestDoc.pdf". You want all users to be able to access GeneralTestDoc.pdf, but you only want users who belong to the team "TestTeam" to be able to access TeamTestDoc.pdf.

Approach 1: Generating a document access control list for the chat user

You can provide the team that John Doe belongs to as a metadata field in the chat request, and it will be forwarded directly to your access control endpoint. Here's an example request:

fetch('http://localhost:3000/api/chat', {
  method: "POST",
  headers: {
    'X-API-KEY': '',
    'X-Organization-Id': ''
  },
  body: JSON.stringify({
    agentId: '',
    message: {
      content: "chat prompt from John Doe"
    },
    impersonatedUser: {
      email: "john-doe@email.com"
    },
    metadata: { team: 'team-john-doe-belongs-to' }
  })
});

If you make this request and have configured your endpoint (see Configuring your access control endpoint), Caddie will send the following request to your endpoint:

fetch('url/of/your/acl/endpoint', {
  method: 'POST',
  body: JSON.stringify({
    metadata: {
      team: 'team-john-doe-belongs-to',
    },
  }),
});

In response, your endpoint could evaluate John Doe's team and respond with a filter that limits the search to a list of files: either ["GeneralTestDoc.pdf", "TeamTestDoc.pdf"] (if he belongs to TestTeam) or just ["GeneralTestDoc.pdf"] (if he does not).
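As a rough sketch of the endpoint logic for this approach, the function below maps the team from the forwarded metadata to a list of permitted file names and wraps that list in a RetrievalFilter `in` clause. The `TEAM_FILES` table, the `PUBLIC_FILES` list, the `buildFileListFilter` name, and especially the `"name"` filter key are assumptions made for illustration; confirm which key your documents are actually filterable by before relying on it.

```javascript
// Hypothetical lookup: which extra files each team may see.
// In practice this would come from your existing access control system.
const TEAM_FILES = new Map([
  ["TestTeam", ["TeamTestDoc.pdf"]],
]);

// Files every user may see, regardless of team.
const PUBLIC_FILES = ["GeneralTestDoc.pdf"];

// Builds a RetrievalFilter that only matches the permitted file names.
// ASSUMPTION: documents can be filtered on a "name" key.
function buildFileListFilter(metadata) {
  const team = metadata ? metadata.team : undefined;
  const teamFiles = TEAM_FILES.get(team) ?? [];
  return {
    in: { key: "name", value: [...PUBLIC_FILES, ...teamFiles] },
  };
}
```

Unknown or missing teams fall back to the public list only, so a malformed request never widens access.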

Approach 2: Using custom file metadata

In addition to passing metadata in your chat request, you can utilize the metadata fields sent when adding files. Continuing the example above, you could upload each file with a "team" attribute in the custom metadata, setting "team" to "any" for GeneralTestDoc.pdf and to "TestTeam" for TeamTestDoc.pdf. Here's what those upload API requests would look like:

// Abbreviated: only the fields relevant to metadata filtering are shown.

// GeneralTestDoc
fetch('http://localhost:3000/api/file', {
  method: "POST",
  body: JSON.stringify({
    name: "GeneralTestDoc.pdf",
    metadata: {
      team: "any"
    }
  })
});

// TeamTestDoc
fetch('http://localhost:3000/api/file', {
  method: "POST",
  body: JSON.stringify({
    name: "TeamTestDoc.pdf",
    metadata: {
      team: "TestTeam"
    }
  })
});

Then, in your chat request, include John Doe's email. In your access control endpoint, you can use John Doe's email to look up which teams he belongs to, and use that existing access control to inform your response. Chat request:

fetch('http://localhost:3000/api/chat', {
  method: "POST",
  headers: {
    'X-API-KEY': '',
    'X-Organization-Id': ''
  },
  body: JSON.stringify({
    agentId: '',
    message: {
      content: "chat prompt from John Doe"
    },
    impersonatedUser: {
      email: "john-doe@email.com"
    },
    metadata: { userEmail: "john-doe@email.com" }
  })
});

In response, your endpoint could evaluate John Doe's team and respond with filters allowing access only to files where team = "any" (if John Doe is not a member of TestTeam and doesn't have access to TeamTestDoc.pdf), or to files where team = "any" or team = "TestTeam" (if John Doe is a member of TestTeam and does have access to TeamTestDoc.pdf).
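The filter choice described above could be sketched as follows. The `USER_TEAMS` lookup and the `buildTeamFilter` name are hypothetical stand-ins for a query against your own access control system; the `"team"` key matches the custom metadata used in the upload example, and the `equals` and `in` operators come from the AWS RetrievalFilter structure.

```javascript
// Hypothetical membership data; replace with a lookup in your own system.
const USER_TEAMS = new Map([
  ["john-doe@email.com", ["TestTeam"]],
]);

// Builds a RetrievalFilter over the custom "team" file metadata.
// Files tagged "any" are always allowed; team files only for members.
function buildTeamFilter(metadata) {
  const email = metadata ? metadata.userEmail : undefined;
  const teams = USER_TEAMS.get(email) ?? [];
  if (teams.length === 0) {
    // No team membership: only files open to everyone.
    return { equals: { key: "team", value: "any" } };
  }
  // Member of one or more teams: open files plus those teams' files.
  return { in: { key: "team", value: ["any", ...teams] } };
}
```

Because the filter is expressed over file metadata rather than file names, newly uploaded files are covered as soon as they carry the right "team" value.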

Requirements for your access control endpoint

Since the access control endpoint you configure will reference your existing access controls, you are responsible for setting up this endpoint and ensuring that it is robust, accessible, and correctly implemented. Your endpoint must satisfy the following requirements:

It must be accessible to Caddie. If you want to secure this endpoint using a Bearer token, access key, or other authentication method, contact Tiber Software support for help with configuration and integration of your endpoint.
It must accept POST requests from Caddie with the following JSON schema. Caddie includes the organizationId and userId by default, and also includes your metadata under the "metadata" object:

{
  "organizationId": "", // organization that the chat agent belongs to
  "userId": "", // chat user, will be impersonated user's id if impersonating user in chat request
  "metadata": {} // exact same metadata fields provided in chat API request
}

It must return a valid AWS RetrievalFilter object in response. For documentation on the structure of this JSON object, see:
- AWS Documentation for RetrievalFilter
- AWS Documentation for Retrieve commands
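To illustrate the request and response requirements together, here is a minimal, framework-agnostic sketch of the handling logic. The `handleAclRequest` name, the `buildFilter` callback, and the choice to answer malformed input with a 400 status are all assumptions for this example; the source does not describe how Caddie treats error responses.

```javascript
// Validates the POST body Caddie sends and produces the HTTP response.
// `buildFilter` is your own function that turns the payload into a
// RetrievalFilter object.
// ASSUMPTION: returning 400 on bad input is a choice made for this sketch.
function handleAclRequest(rawBody, buildFilter) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { status: 400, body: JSON.stringify({ error: "Body is not valid JSON" }) };
  }

  const { organizationId, userId, metadata } = payload ?? {};
  if (typeof organizationId !== "string" || typeof userId !== "string") {
    return { status: 400, body: JSON.stringify({ error: "Missing organizationId or userId" }) };
  }

  // Metadata is passed through exactly as provided in the chat request.
  const filter = buildFilter({ organizationId, userId, metadata: metadata ?? {} });
  return { status: 200, body: JSON.stringify(filter) };
}
```

The function returns a plain `{ status, body }` pair so it can be wired into whichever HTTP server or serverless runtime hosts your endpoint.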

Configuring your access control endpoint

Endpoint config JSON

Provide Caddie the URL of your endpoint and any authentication configuration parameters by modifying the "customMetadataFilterConfig" attribute of your organization. It is a JSON object that should be set as follows:

{
  "url": "path/to/your/access/control/endpoint"
}

This can be configured through the Admin Dashboard by navigating to Settings > Advanced and editing the Custom Metadata Filtering Config.

To set the config using an API call, provide the stringified JSON to the /organization/update endpoint. You must have app-level admin permissions to perform this operation.

fetch('http://localhost:3000/api/auth/organization/update', {
  method: "POST",
  headers: {
    'X-API-KEY': ''
  },
  body: JSON.stringify({
    organizationId: 'id-of-organization-you-want-to-configure',
    data: {
      // Set this value to a JSON string, not an object
      customMetadataFilterConfig: '{"url": "path/to/your/access/control/endpoint"}'
    }
  })
});
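Since the config value must itself be a JSON string inside a JSON body, it is easy to get the quoting wrong when writing it by hand. A small helper like the one below (the `buildFilterConfigUpdate` name is hypothetical) serializes the inner object separately and could be passed as the `body` of the request above.

```javascript
// Builds the request body for the organization update call.
// The config is serialized on its own because the field expects
// a JSON string rather than a nested object.
function buildFilterConfigUpdate(organizationId, endpointUrl) {
  return JSON.stringify({
    organizationId,
    data: {
      customMetadataFilterConfig: JSON.stringify({ url: endpointUrl }),
    },
  });
}
```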

Update agent to enable Custom Metadata Filtering

By default, Agents use the folder system in the Knowledgebase for access control. You must enable Custom Metadata Filtering to use it instead.

To enable it in the Admin Dashboard, navigate to Agents and select your agent. Open the Knowledgebase tab, edit the Knowledge Base Settings, and toggle Custom Metadata Filtering on.

You can also update the Agent using an API call, setting the field isCustomMetadataFilteringEnabled to true:

fetch('http://localhost:3000/api/agent/id-of-agent', {
  method: "PATCH",
  headers: {
    'X-API-KEY': '',
    'X-Organization-Id': ''
  },
  body: JSON.stringify({
    isCustomMetadataFilteringEnabled: true
  })
});
