CmsExtractionResult (OpenCms Core API, version 7.5.3)

org.opencms.search.extractors
Class CmsExtractionResult

java.lang.Object
  org.opencms.search.extractors.CmsExtractionResult

All Implemented Interfaces:: java.io.Serializable, I_CmsExtractionResult

public class CmsExtractionResult
extends java.lang.Object
implements I_CmsExtractionResult, java.io.Serializable

The result of a document text extraction.

This data structure contains the extracted text as well as (optional) meta information extracted from the document.

Since:: 6.0.0
Version:: $Revision: 1.13 $
Author:: Alexander Kandzior
See Also:: Serialized Form

Field Summary

Fields inherited from interface org.opencms.search.extractors.I_CmsExtractionResult
`ITEM_AUTHOR, ITEM_CATEGORY, ITEM_COMMENTS, ITEM_COMPANY, ITEM_CONTENT, ITEM_CREATOR, ITEM_KEYWORDS, ITEM_MANAGER, ITEM_PRODUCER, ITEM_RAW, ITEM_SUBJECT, ITEM_TITLE`

Constructor Summary
`CmsExtractionResult(java.lang.String content)` Creates a new extraction result without meta information and without additional fields.
`CmsExtractionResult(java.lang.String content, java.util.Map<java.lang.String,java.lang.String> contentItems)` Creates a new extraction result.

Method Summary
`static CmsExtractionResult`	`fromBytes(byte[] bytes)` Creates an extraction result from a serialized byte array.
`byte[]`	`getBytes()` Returns this extraction result serialized as a byte array.
`java.lang.String`	`getContent()` Returns the extracted content combined as a String.
`java.util.Map<java.lang.String,java.lang.String>`	`getContentItems()` Returns the extracted content as individual items.
`void`	`release()` Releases the information stored in this extraction result, to free up the memory used.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

CmsExtractionResult

public CmsExtractionResult(java.lang.String content)

Creates a new extraction result without meta information and without additional fields.

Parameters:: content - the extracted content

CmsExtractionResult

public CmsExtractionResult(java.lang.String content,
                           java.util.Map<java.lang.String,java.lang.String> contentItems)

Creates a new extraction result.

Parameters:: content - the extracted content; contentItems - the individual extracted content items

Method Detail

fromBytes

public static final CmsExtractionResult fromBytes(byte[] bytes)

Creates an extraction result from a serialized byte array.

Parameters:: bytes - the serialized version of the extraction result
Returns:: extraction result created from the serialized byte array

getBytes

public byte[] getBytes()

Description copied from interface: I_CmsExtractionResult

Returns this extraction result serialized as a byte array.

Specified by:: getBytes in interface I_CmsExtractionResult

Returns:: this extraction result serialized as a byte array
See Also:: I_CmsExtractionResult.getBytes()
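
The pairing of getBytes() and fromBytes(byte[]) can be illustrated with a small stand-alone sketch. This page does not state how OpenCms encodes the byte array, so the example below assumes standard Java object serialization, which is plausible because the class implements java.io.Serializable. The class SerializedResultSketch and its fields are hypothetical stand-ins, not part of the OpenCms API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an extraction result; not the OpenCms class. */
public class SerializedResultSketch implements Serializable {

    private static final long serialVersionUID = 1L;

    private final String content;
    private final HashMap<String, String> contentItems;

    /** Both arguments are expected to be non-null in this sketch. */
    public SerializedResultSketch(String content, Map<String, String> contentItems) {
        this.content = content;
        // Copy into a HashMap so the stored map is known to be serializable.
        this.contentItems = new HashMap<String, String>(contentItems);
    }

    public String getContent() {
        return content;
    }

    public Map<String, String> getContentItems() {
        return contentItems;
    }

    /** Serializes this object with plain Java serialization (an assumption). */
    public byte[] getBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
        return buffer.toByteArray();
    }

    /** Restores an object previously written by getBytes(). */
    public static SerializedResultSketch fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (SerializedResultSketch) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> items = new HashMap<String, String>();
        items.put("Title", "Hello");
        SerializedResultSketch original = new SerializedResultSketch("Hello world", items);

        // Round trip: object -> bytes -> object.
        SerializedResultSketch copy = fromBytes(original.getBytes());
        System.out.println(copy.getContent());
        System.out.println(copy.getContentItems());
    }
}
```

With the real class, the equivalent round trip would presumably be CmsExtractionResult.fromBytes(result.getBytes()), for example when a result is cached and restored later.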

getContent

public java.lang.String getContent()

Description copied from interface: I_CmsExtractionResult

Returns the extracted content combined as a String.

Specified by:: getContent in interface I_CmsExtractionResult

Returns:: the extracted content combined as a String
See Also:: I_CmsExtractionResult.getContent()

getContentItems

public java.util.Map<java.lang.String,java.lang.String> getContentItems()

Description copied from interface: I_CmsExtractionResult

Returns the extracted content as individual items.

The result Map contains all content items extracted by the extractor. The key is always a String, and contains the name of the item. The value is also a String and contains the extracted text.

The detailed form will depend on the resource type indexed:

For an xmlpage, the key will be the element name, and the value will be the text of the element.
For an xmlcontent, the key will be the xpath of the XML node, and the value will be the text of that XML node.
In case the document contains meta information (for example PDF or MS Office documents), the meta information is stored with the name of the meta field as key and the content as value.
For all other resource types, there will be only one key, I_CmsExtractionResult.ITEM_CONTENT, which will contain the value of the complete content.

Specified by:: getContentItems in interface I_CmsExtractionResult

Returns:: the extracted content as individual items
See Also:: I_CmsExtractionResult.getContentItems()
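
The shapes of the content item Map described above can be sketched without OpenCms on the classpath. The class ContentItemsSketch, the sample keys and values, and the string assigned to ITEM_CONTENT are all placeholders chosen for illustration; the actual value of I_CmsExtractionResult.ITEM_CONTENT is not given on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of content item maps; not part of OpenCms. */
public class ContentItemsSketch {

    // Placeholder key: the real value of I_CmsExtractionResult.ITEM_CONTENT
    // is not stated on this page.
    static final String ITEM_CONTENT = "content";

    /** Items as they might look for an xmlcontent resource: xpath -> node text. */
    static Map<String, String> xmlContentItems() {
        Map<String, String> items = new LinkedHashMap<String, String>();
        items.put("Title[1]", "Release notes");
        items.put("Paragraph[1]/Text[1]", "Version 7.5.3 is available.");
        return items;
    }

    /** Items for any other resource type: one entry holding the complete content. */
    static Map<String, String> plainItems(String content) {
        Map<String, String> items = new LinkedHashMap<String, String>();
        items.put(ITEM_CONTENT, content);
        return items;
    }

    /** Joins all item values into a single String, one value per line. */
    static String combine(Map<String, String> items) {
        return String.join("\n", items.values());
    }

    public static void main(String[] args) {
        // Walk the individual items, as a caller of getContentItems() might.
        for (Map.Entry<String, String> entry : xmlContentItems().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
        // A plain resource carries only the single complete-content entry.
        System.out.println(combine(plainItems("Plain text body")));
    }
}
```

A consumer that needs one field, such as a title for a search result, would look up the corresponding key in the Map, while a consumer that needs the full text would use getContent() instead.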

release

public void release()

Description copied from interface: I_CmsExtractionResult

Releases the information stored in this extraction result, to free up the memory used.

Specified by:: release in interface I_CmsExtractionResult

See Also:: I_CmsExtractionResult.release()
