Aleph - Tool for indexing large amounts of both documents (PDF, Word, HTML) and structured (CSV, XLS, SQL) data for easy browsing and search. It is built with investigative reporting as a primary use case. (Demo, Source Code) MIT Docker/K8S
Apache Solr - Enterprise search platform featuring full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, and rich document (e.g., Word, PDF) handling. (Source Code) Apache-2.0 Java/Docker/K8S
Jina - Cloud-native neural search framework for any kind of data. Apache-2.0 Python/Docker
Manticore Search - Full-text search and data analytics, with fast response time for small, medium and big data (alternative to Elasticsearch). GPL-3.0 Docker/deb/C++/K8S
OpenSearch - Distributed and RESTful search engine. (Source Code) Apache-2.0 Java/Docker/K8S/deb
SearXNG ⚠ - Internet metasearch engine which aggregates results from various search services and databases (fork of Searx). (Source Code) AGPL-3.0 Python/Docker
sist2 - Lightning-fast file system indexer and search tool. GPL-3.0 C/Docker
Sosse - Selenium-based search engine and crawler with offline archiving. (Source Code) AGPL-3.0 Python/Docker
Typesense - Blazing-fast, typo-tolerant open-source search engine optimized for developer happiness and ease of use. (Source Code) GPL-3.0 C++/Docker/K8S/deb
Websurfx ⚠ - Metasearch engine that aggregates results from other search engines without ads while keeping privacy and security in mind. It is extremely fast and provides a high level of customization (alternative to SearX). AGPL-3.0 Rust/Docker
Whoogle ⚠ - A self-hosted, ad-free, privacy-respecting metasearch engine. MIT Python