Nutch

Nutch is open-source web-search software. It builds on Lucene, adding web-specific components such as a crawler, a link-graph database, and parsers for HTML and other document formats.

Supporting Organization: Apache Software Foundation