CmsExtractorOpenOffice (OpenCms Core API, version 7.5.3)

org.opencms.search.extractors
Class CmsExtractorOpenOffice

java.lang.Object
  org.opencms.search.extractors.A_CmsTextExtractor
      org.opencms.search.extractors.CmsExtractorOpenOffice

All Implemented Interfaces:: I_CmsTextExtractor

public final class CmsExtractorOpenOffice
extends A_CmsTextExtractor

Extracts the text from OpenOffice documents (.ods, .odf).

Since:: 7.0.4
Version:: $Revision: 1.7 $
Author:: Dirk Oelkers

Field Summary

Fields inherited from class org.opencms.search.extractors.A_CmsTextExtractor
`m_inputBuffer`

Method Summary
`I_CmsExtractionResult`	`extractText(java.io.InputStream in, java.lang.String encoding)` Extracts the text and meta information from the document on the input stream, using the specified content encoding.
`static I_CmsTextExtractor`	`getExtractor()` Returns an instance of this text extractor.

Methods inherited from class org.opencms.search.extractors.A_CmsTextExtractor
`combineContentItem, extractText, extractText, extractText, getStreamCopy, removeControlChars`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Method Detail

getExtractor

public static I_CmsTextExtractor getExtractor()

Returns an instance of this text extractor.

Returns:: an instance of this text extractor
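The static accessor pattern can be illustrated with a small sketch. It is written to compile without OpenCms on the classpath, so `TextExtractor` and `OdfExtractorStandIn` are hypothetical stand-ins for `I_CmsTextExtractor` and `CmsExtractorOpenOffice`, and the stand-in returns a plain `String` rather than an `I_CmsExtractionResult`. The description above only says "an instance"; the sketch assumes a single shared instance, which may not match the real class.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

final class GetExtractorSketch {

    /** Hypothetical stand-in for I_CmsTextExtractor, reduced to one method. */
    interface TextExtractor {
        String extractText(InputStream in, String encoding) throws Exception;
    }

    /** Hypothetical stand-in for CmsExtractorOpenOffice; it parses nothing. */
    static final class OdfExtractorStandIn implements TextExtractor {

        // Assumption: one shared, stateless instance is handed to every caller.
        private static final OdfExtractorStandIn INSTANCE = new OdfExtractorStandIn();

        private OdfExtractorStandIn() {
            // Hidden so callers obtain the extractor through getExtractor().
        }

        static TextExtractor getExtractor() {
            return INSTANCE;
        }

        @Override
        public String extractText(InputStream in, String encoding) throws Exception {
            // Placeholder body: count the bytes instead of extracting text.
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count + " bytes read";
        }
    }

    public static void main(String[] args) throws Exception {
        // Callers depend on the interface type, not on the concrete class.
        TextExtractor extractor = OdfExtractorStandIn.getExtractor();
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println(extractor.extractText(in, null));
    }
}
```

Returning the interface type keeps call sites independent of the concrete extractor, which is presumably why the documented method is declared to return `I_CmsTextExtractor`.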

extractText

public I_CmsExtractionResult extractText(java.io.InputStream in,
                                         java.lang.String encoding)
                                  throws java.lang.Exception

Description copied from interface: I_CmsTextExtractor

Extracts the text and meta information from the document on the input stream, using the specified content encoding.

The encoding is a hint for the text extractor: if the value given is null, the text extractor should try to determine the encoding itself.

Specified by:: extractText in interface I_CmsTextExtractor
Overrides:: extractText in class A_CmsTextExtractor

Parameters:: in - the input stream for the document to extract the text from; encoding - the encoding to use
Returns:: the extracted text and meta information
Throws:: java.lang.Exception - if the text extraction fails
See Also:: A_CmsTextExtractor.extractText(java.io.InputStream, java.lang.String)
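
The "encoding as a hint" contract of `extractText` can be sketched in isolation. The example below is not the OpenCms implementation and does not parse OpenDocument files; `EncodingHintSketch`, `resolveEncoding`, `readText`, and the UTF-8 fallback are all hypothetical names and choices made for illustration. A real extractor would more likely detect the encoding from the document itself when the hint is null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingHintSketch {

    /** Assumed fallback for a null hint; chosen for this sketch only. */
    static final Charset FALLBACK = StandardCharsets.UTF_8;

    /** Treats the encoding as a hint: null means the extractor decides. */
    static Charset resolveEncoding(String encoding) {
        if (encoding == null) {
            return FALLBACK;
        }
        // Throws an unchecked exception if the name is illegal or unsupported.
        return Charset.forName(encoding);
    }

    /** Reads the whole stream and decodes it with the resolved charset. */
    static String readText(InputStream in, String encoding) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return new String(buffer.toByteArray(), resolveEncoding(encoding));
    }

    public static void main(String[] args) throws IOException {
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        // With the hint, the ISO-8859-1 bytes decode to the original text.
        System.out.println(readText(new ByteArrayInputStream(latin1), "ISO-8859-1"));
    }
}
```

Passing an explicit hint matters whenever the bytes are not in the fallback charset: the same ISO-8859-1 bytes decoded with a null hint in this sketch would not round-trip to the original string.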