org.opencms.util
Class CmsHtmlConverter

java.lang.Object
  extended by org.opencms.util.CmsHtmlConverter

public class CmsHtmlConverter
extends java.lang.Object

HTML cleaner and pretty printer.

Used to clean up HTML code (e.g. remove Word tags) and optionally create XHTML from HTML.

Since:
6.0.0
Version:
$Revision: 1.39 $
Author:
Andreas Zahner

Field Summary
static java.lang.String PARAM_DISABLED
          Parameter value for disabled mode.
static java.lang.String PARAM_ENABLED
          Parameter value for enabled mode.
static java.lang.String PARAM_REPLACE_PARAGRAPHS
          Parameter value for replace paragraph mode.
static java.lang.String PARAM_WORD
          Parameter value for WORD mode.
static java.lang.String PARAM_XHTML
          Parameter value for XHTML mode.
static char SEPARATOR_MODES
          The separator used for the configured modes String.
 
Constructor Summary
CmsHtmlConverter()
          Constructor, creates a new CmsHtmlConverter.
CmsHtmlConverter(java.lang.String encoding, java.lang.String mode)
          Constructor, creates a new CmsHtmlConverter.
 
Method Summary
 byte[] convertToByte(byte[] htmlInput)
          Converts the given HTML code according to the settings of this converter.
 byte[] convertToByte(java.lang.String htmlInput)
          Converts the given HTML code according to the settings of this converter.
 byte[] convertToByteSilent(byte[] htmlInput)
          Converts the given HTML code according to the settings of this converter.
 byte[] convertToByteSilent(java.lang.String htmlInput)
          Converts the given HTML code according to the settings of this converter.
 java.lang.String convertToString(byte[] htmlInput)
          Converts the given HTML code according to the settings of this converter.
 java.lang.String convertToString(java.lang.String htmlInput)
          Converts the given HTML code according to the settings of this converter.
 java.lang.String convertToStringSilent(byte[] htmlInput)
          Converts the given HTML code according to the settings of this converter.
 java.lang.String convertToStringSilent(java.lang.String htmlInput)
          Converts the given HTML code according to the settings of this converter.
static java.lang.String getConversionSettings(CmsObject cms, CmsResource resource)
          Reads the content conversion property of a given resource and returns its value.
 java.lang.String getEncoding()
          Returns the encoding used for the HTML code conversion.
 java.lang.String getMode()
          Returns the conversion mode to use.
static boolean isConversionEnabled(java.lang.String conversionMode)
          Tests if the content conversion is enabled.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PARAM_DISABLED

public static final java.lang.String PARAM_DISABLED
Parameter value for disabled mode.


PARAM_ENABLED

public static final java.lang.String PARAM_ENABLED
Parameter value for enabled mode.


PARAM_REPLACE_PARAGRAPHS

public static final java.lang.String PARAM_REPLACE_PARAGRAPHS
Parameter value for replace paragraph mode.

See Also:
Constant Field Values

PARAM_WORD

public static final java.lang.String PARAM_WORD
Parameter value for WORD mode.

See Also:
Constant Field Values

PARAM_XHTML

public static final java.lang.String PARAM_XHTML
Parameter value for XHTML mode.

See Also:
Constant Field Values

SEPARATOR_MODES

public static final char SEPARATOR_MODES
The separator used for the configured modes String.

See Also:
Constant Field Values
Constructor Detail

CmsHtmlConverter

public CmsHtmlConverter()
Constructor, creates a new CmsHtmlConverter.

The encoding used by default is CmsEncoder.ENCODING_UTF_8.


CmsHtmlConverter

public CmsHtmlConverter(java.lang.String encoding,
                        java.lang.String mode)
Constructor, creates a new CmsHtmlConverter.

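Possible values for the default conversion mode are the PARAM_* constants of this class, for example PARAM_DISABLED, PARAM_ENABLED, PARAM_XHTML and PARAM_WORD.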

Values can be combined with the ; separator (see SEPARATOR_MODES), so it is possible, for example, to convert to XHTML and clean up Word markup at the same time.

Parameters:
encoding - the encoding used for the HTML code conversion
mode - the conversion mode to use
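
How mode values combine with the ; separator can be sketched in plain Java. This is a standalone illustration, not OpenCms code: the class ModeStringSketch, its methods, and the literal values "xhtml" and "cleanup" are hypothetical stand-ins for the PARAM_* constants, whose actual values are not listed on this page.

```java
import java.util.Arrays;
import java.util.List;

// Standalone sketch: NOT part of OpenCms. The mode values passed to these
// methods are placeholders for the PARAM_* constants (real values not shown here).
class ModeStringSketch {

    // Mirrors the documented ';' separator for the configured modes String.
    static final char SEPARATOR = ';';

    // Joins individual mode values into one configured modes String.
    static String combine(String... modes) {
        return String.join(String.valueOf(SEPARATOR), modes);
    }

    // Returns true if the configured modes String contains exactly this mode value.
    static boolean hasMode(String configured, String mode) {
        if (configured == null) {
            return false;
        }
        List<String> parts = Arrays.asList(configured.split(String.valueOf(SEPARATOR)));
        return parts.contains(mode);
    }
}
```

With the real class, a string built this way would presumably be passed as the mode argument of the two-argument constructor, using the PARAM_* constants rather than literals.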
Method Detail

getConversionSettings

public static java.lang.String getConversionSettings(CmsObject cms,
                                                     CmsResource resource)
Reads the content conversion property of a given resource and returns its value.

A default value (disabled) is returned if the property could not be read.

Parameters:
cms - the CmsObject
resource - the resource in the VFS
Returns:
the content conversion property value
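
The fallback described above (return a disabled default when the property cannot be read) might look like the following sketch. It is standalone and hypothetical: PropertyFallbackSketch, readOrDefault, and the "false" placeholder are assumptions for illustration and do not reflect how OpenCms reads properties. Treating a null value as "could not be read" is also an assumption.

```java
import java.util.concurrent.Callable;

// Standalone sketch: NOT OpenCms code. All names here are hypothetical.
class PropertyFallbackSketch {

    // Placeholder for the "disabled" default; the real constant value is not shown here.
    static final String DEFAULT_DISABLED = "false";

    // Runs the reader and falls back to the default if reading fails or yields null.
    static String readOrDefault(Callable<String> reader) {
        try {
            String value = reader.call();
            return (value != null) ? value : DEFAULT_DISABLED;
        } catch (Exception e) {
            // Any read failure results in the disabled default.
            return DEFAULT_DISABLED;
        }
    }
}
```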

isConversionEnabled

public static boolean isConversionEnabled(java.lang.String conversionMode)
Tests if the content conversion is enabled.

Parameters:
conversionMode - the content conversion mode string
Returns:
true if the content conversion is enabled, false otherwise

convertToByte

public byte[] convertToByte(byte[] htmlInput)
                     throws java.io.UnsupportedEncodingException
Converts the given HTML code according to the settings of this converter.

Parameters:
htmlInput - HTML input stored in an array of bytes
Returns:
array of bytes containing the converted HTML
Throws:
java.io.UnsupportedEncodingException - if the encoding set for the conversion is not supported

convertToByte

public byte[] convertToByte(java.lang.String htmlInput)
                     throws java.io.UnsupportedEncodingException
Converts the given HTML code according to the settings of this converter.

Parameters:
htmlInput - HTML input stored in a string
Returns:
array of bytes containing the converted HTML
Throws:
java.io.UnsupportedEncodingException - if the encoding set for the conversion is not supported
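
The exception declared here is the same one the JDK throws when a named encoding is unknown. The sketch below uses only standard JDK calls and does not call CmsHtmlConverter. The class EncodingSketch and its method names are hypothetical, and it only illustrates the String/byte[] round trip that an encoding-aware converter depends on.

```java
import java.io.UnsupportedEncodingException;

// Standalone sketch using only the JDK; it does not call CmsHtmlConverter.
class EncodingSketch {

    // Encodes the text with the named encoding.
    static byte[] toBytes(String html, String encoding) throws UnsupportedEncodingException {
        return html.getBytes(encoding);
    }

    // Decodes the bytes back to a String with the named encoding.
    static String toText(byte[] html, String encoding) throws UnsupportedEncodingException {
        return new String(html, encoding);
    }

    // Returns true if the JDK rejects the encoding name with UnsupportedEncodingException.
    static boolean isRejected(String encoding) {
        try {
            "x".getBytes(encoding);
            return false;
        } catch (UnsupportedEncodingException e) {
            return true;
        }
    }
}
```

Checking the encoding name up front, as isRejected does, is one way to avoid handling the checked exception at every conversion call.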

convertToByteSilent

public byte[] convertToByteSilent(byte[] htmlInput)
Converts the given HTML code according to the settings of this converter.

If any error occurs during the conversion process, the original input is returned unmodified.

Parameters:
htmlInput - HTML input stored in an array of bytes
Returns:
array of bytes containing the converted HTML
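
The "silent" contract (on any error, hand back the original input) can be illustrated with a small wrapper. This is a standalone sketch, not the OpenCms implementation: SilentFallbackSketch, ByteConversion, and convertSilently are hypothetical names standing in for whatever conversion step may fail.

```java
// Standalone sketch: NOT the OpenCms implementation. It only illustrates the
// documented contract "on any error, return the original input unmodified".
class SilentFallbackSketch {

    // Hypothetical stand-in for a conversion step that may fail.
    interface ByteConversion {
        byte[] apply(byte[] input) throws Exception;
    }

    // Applies the conversion; if it throws, the original input is returned as-is.
    static byte[] convertSilently(byte[] input, ByteConversion conversion) {
        try {
            return conversion.apply(input);
        } catch (Exception e) {
            // Swallow the error and fall back to the unmodified input.
            return input;
        }
    }
}
```

The trade-off of this style is that callers cannot tell a failed conversion from a successful one that left the input unchanged. The non-silent variants remain the better choice when failures must be reported.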

convertToByteSilent

public byte[] convertToByteSilent(java.lang.String htmlInput)
Converts the given HTML code according to the settings of this converter.

If any error occurs during the conversion process, the original input is returned unmodified.

Parameters:
htmlInput - HTML input stored in a string
Returns:
array of bytes containing the converted HTML

convertToString

public java.lang.String convertToString(byte[] htmlInput)
                                 throws java.io.UnsupportedEncodingException
Converts the given HTML code according to the settings of this converter.

Parameters:
htmlInput - HTML input stored in an array of bytes
Returns:
string containing the converted HTML
Throws:
java.io.UnsupportedEncodingException - if the encoding set for the conversion is not supported

convertToString

public java.lang.String convertToString(java.lang.String htmlInput)
                                 throws java.io.UnsupportedEncodingException
Converts the given HTML code according to the settings of this converter.

Parameters:
htmlInput - HTML input stored in a string
Returns:
string containing the converted HTML
Throws:
java.io.UnsupportedEncodingException - if the encoding set for the conversion is not supported

convertToStringSilent

public java.lang.String convertToStringSilent(byte[] htmlInput)
Converts the given HTML code according to the settings of this converter.

If any error occurs during the conversion process, the original input is returned unmodified.

Parameters:
htmlInput - HTML input stored in an array of bytes
Returns:
string containing the converted HTML

convertToStringSilent

public java.lang.String convertToStringSilent(java.lang.String htmlInput)
Converts the given HTML code according to the settings of this converter.

If any error occurs during the conversion process, the original input is returned unmodified.

Parameters:
htmlInput - HTML input stored in a string
Returns:
string containing the converted HTML

getEncoding

public java.lang.String getEncoding()
Returns the encoding used for the HTML code conversion.

Returns:
the encoding used for the HTML code conversion

getMode

public java.lang.String getMode()
Returns the conversion mode to use.

Returns:
the conversion mode to use