Creating Collections

Collections are the top-level containers for your knowledge base documents. Learn how to create, configure, and manage collections for optimal RAG performance.

Create & Configure

Set up collections with names, descriptions, and processing preferences.

Multiple Formats

Upload PDF, DOCX, TXT, MD, HTML, and CSV documents with automatic parsing.

Flexible Upload

Add documents through the dashboard, the API, URL fetching, or raw text.

Creating a Collection

A collection acts as a logical grouping of related documents. You might create separate collections for product documentation, internal policies, FAQ data, or API references. Each collection has its own processing configuration and can be linked to multiple agents.

1. Name and Description

Choose a descriptive name that reflects the collection's content, such as "Product Documentation v3" or "Customer Support FAQ". Add a description to help team members understand the collection's purpose and scope.

2. Processing Configuration

Set the chunk size, chunk overlap, and embedding model used for document processing. These settings determine how your documents are split and indexed. You can adjust them later, but changing the chunk size or embedding model requires reindexing existing documents.

3. Agent Linking

After creation, link the collection to one or more agents. An agent can reference multiple collections, and a collection can be shared across multiple agents. This enables flexible knowledge sharing across your AI deployments.
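
To make the chunk size and chunk overlap settings from step 2 concrete, here is a minimal sketch of fixed-size chunking with overlap. It works on characters, which is an assumption: the source does not say which units or algorithm 8bit-ai uses, and `chunk_text` is a hypothetical helper, not part of any 8bit-ai API.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration; not 8bit-ai's actual algorithm.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, so text that straddles a boundary appears in both chunks. A larger overlap keeps more shared context at the cost of more chunks to index.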

Supported File Formats

8bit-ai supports six document formats with automatic text extraction and parsing. Each format is processed differently to preserve structure and meaning.

Format | Extension | Processing Notes | Max Size
PDF | .pdf | Text extraction preserves headings, tables, and basic formatting | 50 MB
Word | .docx | Full text, heading styles, and list structures preserved | 50 MB
Plain Text | .txt | Fast processing, no structure preservation needed | 10 MB
Markdown | .md | Headings, code blocks, and lists used for chunk boundary detection | 25 MB
HTML | .html, .htm | Strips tags, preserves semantic structure from headings | 25 MB
CSV | .csv | Rows converted to structured text with column headers | 25 MB
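
The CSV entry says rows are converted to structured text with column headers. One plausible reading of that is sketched below with Python's standard `csv` module. The `header: value` layout and the function name `csv_rows_to_text` are assumptions for illustration; the source does not show the exact text 8bit-ai produces.

```python
import csv
import io


def csv_rows_to_text(csv_content: str) -> list[str]:
    """Render each CSV row as 'header: value' pairs joined by '; '.

    Hypothetical helper; the real output format is not documented here.
    """
    reader = csv.DictReader(io.StringIO(csv_content))
    lines = []
    for row in reader:
        pairs = [f"{header}: {value}" for header, value in row.items()]
        lines.append("; ".join(pairs))
    return lines


sample = "question,answer\nHow do I reset my password?,Use the account settings page.\n"
for line in csv_rows_to_text(sample):
    print(line)
```

Repeating the column header next to every value lets each row stand on its own as a retrievable chunk, without needing the header row for context.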

Format Recommendations

For best results, use Markdown or well-structured PDFs with clear heading hierarchies. These formats allow the chunking algorithm to respect document structure, resulting in more coherent and context-rich chunks for retrieval.
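
As a rough illustration of why clear heading hierarchies help, the sketch below splits a Markdown document at each heading so that every chunk covers one section. This is an assumed, simplified strategy, not 8bit-ai's chunking algorithm, and `split_markdown_by_heading` is a hypothetical name.

```python
import re


def split_markdown_by_heading(markdown: str) -> list[str]:
    """Split Markdown into sections, starting a new one at each ATX heading."""
    sections = []
    current = []
    for line in markdown.splitlines():
        # An ATX heading is one to six '#' characters followed by a space.
        if re.match(r"#{1,6} ", line) and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    return [section for section in sections if section]


doc = "# Setup\nInstall the tool.\n\n## Options\nUse --verbose.\n"
for section in split_markdown_by_heading(doc):
    print(repr(section))
```

The sketch does not track fenced code blocks, so a line such as `# comment` inside a code sample would be treated as a heading. A production splitter would need to handle that case.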

Upload Methods

8bit-ai provides three ways to add documents to your collections, supporting different workflows and automation scenarios.

File Upload

Upload files directly through the dashboard or API. Supports drag-and-drop and multi-file uploads. Files are stored securely and processed automatically.
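
Before uploading in bulk, it can help to check files locally against the limits in the Supported File Formats table. The sketch below is a client-side pre-check only. The function name `check_upload` is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption, since the source does not define the unit.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: the documented limits use binary megabytes

# Size limits per extension, taken from the Supported File Formats table.
MAX_SIZE_BYTES = {
    ".pdf": 50 * MB,
    ".docx": 50 * MB,
    ".txt": 10 * MB,
    ".md": 25 * MB,
    ".html": 25 * MB,
    ".htm": 25 * MB,
    ".csv": 25 * MB,
}


def check_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (accepted, reason) for a file name and its size in bytes."""
    ext = Path(filename).suffix.lower()
    limit = MAX_SIZE_BYTES.get(ext)
    if limit is None:
        return False, f"unsupported format: {ext or 'no extension'}"
    if size_bytes > limit:
        return False, f"{ext} files are limited to {limit // MB} MB"
    return True, "ok"


print(check_upload("guide.PDF", 10 * MB))
print(check_upload("notes.txt", 11 * MB))
print(check_upload("archive.zip", 1))
```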

URL Fetching

Provide a URL and the system will fetch, parse, and index the content. Useful for documentation sites, blog posts, and public web pages.
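
The parse step for fetched pages can be pictured with Python's standard `html.parser` module. The sketch below strips tags, skips script and style content, and marks headings so their structure survives. It covers parsing only, not fetching, and the class and function names are hypothetical. How 8bit-ai actually parses pages is not described in the source.

```python
from html.parser import HTMLParser


class HeadingAwareTextExtractor(HTMLParser):
    """Collect visible text, marking headings and skipping script/style."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.lines = []
        self._heading_level = 0  # 0 means "not inside a heading"
        self._skip_depth = 0     # > 0 while inside script or style

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.HEADINGS:
            self._heading_level = int(tag[1])

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.HEADINGS:
            self._heading_level = 0

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._heading_level:
            # Keep the heading level as a Markdown-style prefix.
            text = "#" * self._heading_level + " " + text
        self.lines.append(text)


def html_to_text(html: str) -> str:
    parser = HeadingAwareTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


page = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h2>Install</h2><p>Run the installer.</p></body></html>"
)
print(html_to_text(page))
```

This simplified extractor emits one line per text node, so inline tags such as `<b>` split a sentence across lines. Real pipelines usually group text by block-level element instead.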

Raw Text

Paste or send raw text content directly. Ideal for quick additions, notes, or content that doesn't exist as a file.

Managing Collections

Once created, collections can be updated, archived, or deleted. You can also monitor their processing status and document counts.

Collection Settings

Update the name, description, or processing configuration of a collection at any time. Changing chunk size or embedding model will require reindexing existing documents.
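
The rule above can be expressed as a small check that compares old and new settings. The field names, the `CollectionSettings` class, and the example values are all hypothetical and are not taken from the 8bit-ai API. Only chunk size and embedding model are listed as reindex triggers because those are the two the text names. The source does not say whether a chunk overlap change also requires reindexing.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CollectionSettings:
    """Hypothetical settings record; field names are illustrative."""
    name: str
    description: str
    chunk_size: int
    chunk_overlap: int
    embedding_model: str


# Fields whose change requires reindexing, per the text above.
REINDEX_FIELDS = ("chunk_size", "embedding_model")


def needs_reindex(old: CollectionSettings, new: CollectionSettings) -> bool:
    """Return True if any reindex-triggering field differs."""
    return any(getattr(old, f) != getattr(new, f) for f in REINDEX_FIELDS)


current = CollectionSettings(
    name="Product Documentation v3",
    description="Public product docs",
    chunk_size=512,                             # placeholder value
    chunk_overlap=64,                           # placeholder value
    embedding_model="example-embedding-model",  # placeholder name
)

renamed = replace(current, name="Product Documentation v4")
rechunked = replace(current, chunk_size=1024)

print(needs_reindex(current, renamed))    # False
print(needs_reindex(current, rechunked))  # True
```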

Data Loss Warning

Deleting a collection permanently removes all documents, vector embeddings, and processing data. This action cannot be undone. Consider archiving instead of deleting if you might need the collection in the future.