kreuzberg-dev/kreuzberg
Fast extraction of text, metadata, and code insights from many formats.
What is kreuzberg-dev/kreuzberg?
Kreuzberg is an open-source tool that extracts text, metadata, and code intelligence from a wide range of documents and source files. It processes content at native speeds across numerous languages and formats while remaining lightweight.
The system relies on tree-sitter for semantic analysis of code, returning structured results such as functions, classes, and imports. Plugins allow extension for specialized tasks like OCR backends or custom validators.
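To make the idea of structured code results concrete, here is a minimal sketch of what a summary of functions, classes, and imports might look like. It does not use Kreuzberg or tree-sitter: it swaps in Python's standard-library `ast` module, so it only handles Python source. The `CodeSummary` type and `summarize_python` function are hypothetical names invented for this illustration, not part of Kreuzberg's API.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class CodeSummary:
    """Hypothetical result shape; not Kreuzberg's actual return type."""
    functions: list[str] = field(default_factory=list)
    classes: list[str] = field(default_factory=list)
    imports: list[str] = field(default_factory=list)


def summarize_python(source: str) -> CodeSummary:
    """Collect function, class, and imported module names from Python source."""
    summary = CodeSummary()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            summary.functions.append(node.name)
        elif isinstance(node, ast.ClassDef):
            summary.classes.append(node.name)
        elif isinstance(node, ast.Import):
            summary.imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as "from . import x" have no module name.
            summary.imports.append(node.module or ".")
    return summary


SAMPLE = '''
import os
from pathlib import Path

class Loader:
    def load(self, path):
        return Path(path).read_text()

def main():
    print(os.getcwd())
'''

if __name__ == "__main__":
    print(summarize_python(SAMPLE))
```

A tree-sitter based tool applies the same idea across many grammars, which is what lets one result shape cover multiple languages.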
Developers and data teams use it when they need reliable parsing in production pipelines or research workflows that span multiple programming ecosystems.
Capabilities
What you can build with kreuzberg-dev/kreuzberg
Codebase Analysis
Scan large repositories to extract symbols and docstrings for documentation or refactoring tools.
Document Processing Pipelines
Convert mixed-format files into clean text and metadata for downstream search or indexing systems.
Multi-Language Tooling
Build cross-language utilities that need consistent extraction results from Rust, Python, JavaScript, and more.
Install kreuzberg-dev/kreuzberg
pip install kreuzberg
/plugin marketplace add kreuzberg-dev/plugins
/plugin install kreuzberg@kreuzberg
1. Choose the binding for your language from the project repository.
2. Install via the package manager shown for that binding, such as pip or cargo.
3. Import the library in your code and point it at a target file or directory.
4. Call the extraction function and inspect the returned metadata and code intelligence.
5. Extend behavior by registering custom plugins if additional processing is required.
kreuzberg-dev/kreuzberg: pros & cons
Pros
- Broad coverage of 96 formats and 306 languages in one library
- Native performance without GPU hardware
- Consistent API across many language bindings
- Plugin system for adding OCR or validation logic
Cons
- Elastic license may restrict certain commercial uses
- Initial setup varies by language binding chosen
- Full feature set requires installing language-specific dependencies
Frequently asked questions
Kreuzberg does not need GPU hardware; it runs at native speeds on standard CPU hardware.