Attachment full-text indexing
The Domino® server and Notes® standard client use Apache Tika 1.18 open-source conversion filters to extract text for full-text searches of attachments. Tika replaces the KeyView conversion filter used in previous releases.
With the Tika filters, you can:
- Search a wide range of formats, including container files such as .zip and .tar files.
- Search text files that contain UTF-8 encoded characters.
- Customize which attachment types can be full-text indexed and the maximum attachment size allowed for full-text indexing.
Tika runs as a Java™ process when you start the Notes® standard client or Domino®. The process runs tika-server.jar, which starts an HTTP task that listens for text extraction requests on port 9998 by default. If you upgrade to Notes® or Domino® 10, full-text indexes that previously used KeyView filters to extract text are rebuilt using the Tika filters.
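To illustrate what a text extraction request over HTTP looks like, the following minimal Python sketch sends a file to a Tika server and reads back plain text. It uses the `/tika` endpoint that a standalone Apache Tika server exposes for PUT requests. The `localhost` host name and the helper names `build_extract_request` and `extract_text` are assumptions made for this illustration, and whether the Tika process started by Notes® or Domino® accepts requests from other local programs is not confirmed here.

```python
import urllib.request

DEFAULT_TIKA_PORT = 9998  # default port named in this topic


def build_extract_request(data: bytes, port: int = DEFAULT_TIKA_PORT,
                          host: str = "localhost") -> urllib.request.Request:
    """Build a PUT request for the Tika server /tika endpoint.

    The host name is an assumption; this helper is hypothetical.
    """
    return urllib.request.Request(
        url=f"http://{host}:{port}/tika",
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},  # ask for plain extracted text
    )


def extract_text(path: str, port: int = DEFAULT_TIKA_PORT) -> str:
    """Send a file to a running Tika server and return the extracted text."""
    with open(path, "rb") as f:
        request = build_extract_request(f.read(), port)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it makes it possible to inspect the URL and method without a running server.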
For the list of file formats supported by Tika 1.18, see the Apache Tika web site.
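To use a port other than the default, specify the TIKA_PORT setting in the notes.ini file, for example:
TIKA_PORT=9997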
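As a rough sketch of how a tool might determine which port to use, the following Python function scans key=value text for a TIKA_PORT line and falls back to the default of 9998. The function name `resolve_tika_port`, the assumption that the setting is stored in notes.ini-style key=value text, and the case-insensitive key match are all illustrative choices, not documented product behavior.

```python
def resolve_tika_port(ini_text: str, default: int = 9998) -> int:
    """Return the TIKA_PORT value found in key=value text, else the default.

    Hypothetical helper: key matching is case-insensitive by assumption.
    """
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().upper() == "TIKA_PORT":
            try:
                return int(value.strip())
            except ValueError:
                return default  # non-numeric value: fall back to the default
    return default


# Example usage with an in-memory settings snippet
port = resolve_tika_port("TIKA_PORT=9997\n")
```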
The Notes® basic client does not use Tika filters for attachment searches in local databases. This limitation does not apply to the Notes® standard client or to searches of server-based databases. Notes® basic client users can choose to index attachments, but only ASCII text attachments are searched.