Splitters detect logical boundaries so you can process long documents in parallel.
Create a splitter
- In the Studio, go to Processors → New → Document Splitting.
- Choose a strategy:
  - Layout-based for PDFs with clear page breaks.
  - Semantic (embedding-based) for text-heavy documents.
- Upload at least three examples so the model understands typical structure.
Rules & tags
- Add page range rules to force splits (e.g., every 10 pages for statements).
- Define section tags like introduction, appendix, or invoice to enrich outputs.
- Combine with workflows to trigger different extractors per section.
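
To make the two ideas above concrete, here is a minimal Python sketch of what a fixed-interval page range rule and tag-based routing amount to. It is illustrative only: `page_ranges`, `route_sections`, and both extractor functions are hypothetical names, not part of the Studio or the API, and the real splitter applies these rules for you.

```python
from typing import Callable, Dict, List, Tuple

PageRange = Tuple[int, int]  # inclusive, 1-based (start, end)


def page_ranges(total_pages: int, every: int) -> List[PageRange]:
    """Split a document into consecutive ranges of at most `every` pages."""
    if total_pages < 0 or every < 1:
        raise ValueError("total_pages must be >= 0 and every must be >= 1")
    return [
        (start, min(start + every - 1, total_pages))
        for start in range(1, total_pages + 1, every)
    ]


def route_sections(
    sections: List[Tuple[str, PageRange]],
    extractors: Dict[str, Callable[[PageRange], str]],
    default: Callable[[PageRange], str],
) -> List[str]:
    """Run the extractor registered for each section tag, else `default`."""
    return [extractors.get(tag, default)(pages) for tag, pages in sections]


# Hypothetical extractors; real ones would call your downstream processors.
def invoice_extractor(pages: PageRange) -> str:
    return f"invoice fields from pages {pages[0]}-{pages[1]}"


def generic_extractor(pages: PageRange) -> str:
    return f"plain text from pages {pages[0]}-{pages[1]}"


# A 25-page statement split every 10 pages.
ranges = page_ranges(total_pages=25, every=10)
print(ranges)  # [(1, 10), (11, 20), (21, 25)]

# Tagged sections are dispatched to the matching extractor.
results = route_sections(
    [("invoice", (1, 2)), ("appendix", (3, 4))],
    extractors={"invoice": invoice_extractor},
    default=generic_extractor,
)
print(results)
```

The fallback extractor is a design choice in this sketch: sections with a tag that has no dedicated extractor are still processed rather than dropped.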
API deployment
curl -X POST https://api.algorythmos.fr/processors \
  -H "x-api-key: $ALG_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "splitter",
    "name": "policy-splitter",
    "config": {"strategy": "semantic", "min_confidence": 0.8}
  }'

Publish the splitter and wire it into your workflow so that each section is routed to its downstream processors automatically.
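
If you prefer to create the splitter from a script, the curl request above can be sketched in Python with the standard library. The endpoint, headers, and payload fields are taken from the curl example; the helper name `build_splitter_request` and the 0 to 1 range check on `min_confidence` are assumptions made for this sketch. The request is built but not sent, since the response format is not described here.

```python
import json
import os
import urllib.request

API_URL = "https://api.algorythmos.fr/processors"  # endpoint from the curl example


def build_splitter_request(name, strategy="semantic", min_confidence=0.8, api_key=None):
    """Build (but do not send) the POST request that creates a splitter."""
    # Assumption: min_confidence is a probability-like value between 0 and 1.
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    payload = {
        "type": "splitter",
        "name": name,
        "config": {"strategy": strategy, "min_confidence": min_confidence},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Falls back to the ALG_KEY environment variable used by the curl call.
            "x-api-key": api_key or os.environ.get("ALG_KEY", ""),
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_splitter_request("policy-splitter")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the payload before any call reaches the API.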