Search Engine

Open source search engines have long been the poor relation of the open source community. Over the last year, however, thanks largely to the dynamism of the Apache Lucene project, many industrial-strength solutions have emerged. Beyond the traditional multi-format indexers, there are now solutions for advanced features such as clustering.

OpenPipeline

OpenPipeline is new open source software for crawling, parsing, analyzing and routing documents. It ties together otherwise incomplete solutions for enterprise search and document processing. OpenPipeline provides a common architecture for connectors to data sources, file filters, text analyzers and modules to distribute documents across a network. It includes a job scheduler and a full UI with a point-and-click interface.

Google Search Appliance (GSA)

The Google Search Appliance delivers highly relevant, fast, easy-to-use search results to users across all corporate content, including file servers, web servers, document and content management systems, and enterprise applications.

Velocity Search Platform

Velocity Search Platform is the enterprise search product of Vivisimo. It offers mapped security, entity extraction and many of the features of a modern search engine.

Apache Lucene Java

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform ones.

Xapian

Xapian is a highly adaptable toolkit which allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also supports a rich set of boolean query operators.

Nutch

Nutch is open source web-search software. It builds on Lucene, adding web-specific components such as a crawler, a link-graph database, and parsers for HTML and other document formats.

ht://Dig

The ht://Dig system is a complete open source search engine for a domain or intranet. It is not meant to replace powerful internet-wide search systems like Lycos, Infoseek, Google and AltaVista. Instead, it is meant to cover the search needs of a single company, campus, or even a particular subsection of a web site.
As opposed to some WAIS-based or web-server based search engines, ht://Dig can easily span several web servers. The type of these different web servers doesn't matter as long as they understand common protocols like HTTP.

Egothor

Egothor is an open source, high-performance, full-featured text search engine written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform ones. It can be configured as a standalone engine, a metasearcher, or a peer-to-peer hub, and it can also be used as a library by an application that needs full-text search.

Apache Solr

Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface.
