| Plugins | |
|---|---|
| org.apache.nutch.analysis.lang | Text document language identifier. |
| org.apache.nutch.indexer.basic | A basic indexing plugin. |
| org.apache.nutch.indexer.more | An extended indexing plugin that indexes more fields than the basic one. |
| org.apache.nutch.parse.html | An HTML document parsing plugin. |
| org.apache.nutch.parse.js | |
| org.apache.nutch.parse.msword | A Word document parsing plugin. |
| org.apache.nutch.parse.msword.chp | |
| org.apache.nutch.parse.pdf | A PDF document parsing plugin. |
| org.apache.nutch.parse.text | A plain text parsing plugin. |
| org.apache.nutch.protocol.file | Protocol plugin which supports retrieving local file resources. |
| org.apache.nutch.protocol.ftp | Protocol plugin which supports retrieving documents via the FTP protocol. |
| org.apache.nutch.protocol.http | Protocol plugin which supports retrieving documents via the HTTP protocol. |
| org.apache.nutch.protocol.httpclient | Protocol plugin which supports retrieving documents via the HTTP protocol. |
| org.creativecommons.nutch | Sample plugins that parse and index Creative Commons metadata. |
Nutch is an open-source search engine.