Core Concepts

Chunking

Large text inputs are automatically split into chunks sized for the embedding model before they are embedded.

Why chunking matters

Embedding models have token limits
Smaller chunks improve retrieval precision
Queries against long documents return only the relevant sections

How it works

Text is split on semantic boundaries (sentences, paragraphs)
Chunks are sized for the embedding model (~512 tokens)
Overlap preserves context across chunk boundaries
Each chunk is embedded and stored with a reference to the parent memory
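
The steps above can be sketched roughly as follows. This is an illustration only, not Databaset's actual implementation: the names `Chunk`, `split_sentences`, and `chunk_text` are hypothetical, whitespace-separated words stand in for model tokens, and overlap is approximated by repeating the last sentence of the previous chunk. The embedding step itself is left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    parent_id: str  # reference back to the parent memory (hypothetical field)
    index: int      # position of the chunk within the parent
    text: str


def split_sentences(text: str) -> list[str]:
    # Naive boundary detection: split after ., ! or ? followed by whitespace,
    # and on blank lines (paragraph breaks).
    parts = re.split(r"(?<=[.!?])\s+|\n\s*\n", text.strip())
    return [p.strip() for p in parts if p.strip()]


def chunk_text(
    text: str,
    parent_id: str,
    max_tokens: int = 512,
    overlap_sentences: int = 1,
) -> list[Chunk]:
    # Assumption: word count is used as a rough proxy for token count.
    chunks: list[Chunk] = []
    current: list[str] = []
    count = 0
    for sentence in split_sentences(text):
        n = len(sentence.split())
        # Close the current chunk when the next sentence would overflow it.
        if current and count + n > max_tokens:
            chunks.append(Chunk(parent_id, len(chunks), " ".join(current)))
            # Carry the trailing sentence(s) over so context spans the boundary.
            current = current[-overlap_sentences:] if overlap_sentences > 0 else []
            count = sum(len(s.split()) for s in current)
        current.append(sentence)
        count += n
    if current:
        chunks.append(Chunk(parent_id, len(chunks), " ".join(current)))
    return chunks


if __name__ == "__main__":
    sample = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
    for chunk in chunk_text(sample, parent_id="mem-1", max_tokens=6):
        print(chunk.index, chunk.parent_id, chunk.text)
```

In this sketch the size limit is approximate: a single sentence longer than `max_tokens` still becomes its own chunk, and carried-over sentences can push a chunk slightly past the limit. A real system would count tokens with the embedding model's own tokenizer.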

You don't configure this

Chunking is fully automatic. Send text of any length and Databaset handles the rest.
