net.htmlparser.jericho
Class Config.CompatibilityMode

java.lang.Object
  extended by net.htmlparser.jericho.Config.CompatibilityMode
Enclosing class:
Config

public static final class Config.CompatibilityMode
extends Object

Represents a set of configuration parameters that relate to user agent compatibility issues.

The predefined compatibility modes IE, MOZILLA, OPERA and XHTML provide an easy means of ensuring the library interprets the markup in a way consistent with some of the most commonly used browsers, at least in relation to the behaviour described by the properties in this class.

The properties of any CompatibilityMode object can be modified individually, whether it is one of the predefined instances or a newly constructed instance. Note, however, that modifying the properties of a predefined instance has a global effect.

The currently active compatibility mode is stored in the static Config.CurrentCompatibilityMode property.


Field Summary
static int CODE_POINTS_ALL
          Indicates the recognition of all unicode code points.
static int CODE_POINTS_NONE
          Indicates the recognition of no unicode code points.
static Config.CompatibilityMode IE
          Microsoft Internet Explorer compatibility mode.
static Config.CompatibilityMode MOZILLA
          Mozilla / Firefox / Netscape compatibility mode.
static Config.CompatibilityMode OPERA
          Opera compatibility mode.
static Config.CompatibilityMode XHTML
          XHTML compatibility mode.
 
Constructor Summary
Config.CompatibilityMode(String name)
          Constructs a new CompatibilityMode with the given name.
 
Method Summary
 String getDebugInfo()
          Returns a string representation of this object useful for debugging purposes.
 String getName()
          Returns the name of this compatibility mode.
 int getUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue)
          Returns the maximum unicode code point of an unterminated character entity reference which is to be recognised in the specified context.
 int getUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue)
          Returns the maximum unicode code point of an unterminated decimal character reference which is to be recognised in the specified context.
 int getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue)
          Returns the maximum unicode code point of an unterminated hexadecimal character reference which is to be recognised in the specified context.
 boolean isFormFieldNameCaseInsensitive()
          Indicates whether form field names are treated as case insensitive.
 void setFormFieldNameCaseInsensitive(boolean value)
          Sets whether form field names are treated as case insensitive.
 void setUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue, int maxCodePoint)
          Sets the maximum unicode code point of an unterminated character entity reference which is to be recognised in the specified context.
 void setUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue, int maxCodePoint)
          Sets the maximum unicode code point of an unterminated decimal character reference which is to be recognised in the specified context.
 void setUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue, int maxCodePoint)
          Sets the maximum unicode code point of an unterminated hexadecimal character reference which is to be recognised in the specified context.
 String toString()
          Returns the name of this compatibility mode.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

CODE_POINTS_ALL

public static final int CODE_POINTS_ALL
Indicates the recognition of all unicode code points.

This value is used in properties which specify a maximum unicode code point to be recognised by the parser.

See Also:
getUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue), getUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue), getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue), Constant Field Values

CODE_POINTS_NONE

public static final int CODE_POINTS_NONE
Indicates the recognition of no unicode code points.

This value is used in properties which specify a maximum unicode code point to be recognised by the parser.

See Also:
getUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue), getUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue), getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue), Constant Field Values

IE

public static final Config.CompatibilityMode IE
Microsoft Internet Explorer compatibility mode.

Name = IE
FormFieldNameCaseInsensitive = true

Recognition of unterminated character references:       (inside attribute)  (outside attribute)
UnterminatedCharacterEntityReferenceMaxCodePoint      = U+00FF              U+00FF
UnterminatedDecimalCharacterReferenceMaxCodePoint     = All                 All
UnterminatedHexadecimalCharacterReferenceMaxCodePoint = All                 None


MOZILLA

public static final Config.CompatibilityMode MOZILLA
Mozilla / Firefox / Netscape compatibility mode.

Name = Mozilla
FormFieldNameCaseInsensitive = false

Recognition of unterminated character references:       (inside attribute)  (outside attribute)
UnterminatedCharacterEntityReferenceMaxCodePoint      = U+00FF              All
UnterminatedDecimalCharacterReferenceMaxCodePoint     = All                 All
UnterminatedHexadecimalCharacterReferenceMaxCodePoint = All                 All


OPERA

public static final Config.CompatibilityMode OPERA
Opera compatibility mode.

Name = Opera
FormFieldNameCaseInsensitive = true

Recognition of unterminated character references:       (inside attribute)  (outside attribute)
UnterminatedCharacterEntityReferenceMaxCodePoint      = U+003E              All
UnterminatedDecimalCharacterReferenceMaxCodePoint     = All                 All
UnterminatedHexadecimalCharacterReferenceMaxCodePoint = All                 All


XHTML

public static final Config.CompatibilityMode XHTML
XHTML compatibility mode.

Name = XHTML
FormFieldNameCaseInsensitive = false

Recognition of unterminated character references:       (inside attribute)  (outside attribute)
UnterminatedCharacterEntityReferenceMaxCodePoint      = None                None
UnterminatedDecimalCharacterReferenceMaxCodePoint     = None                None
UnterminatedHexadecimalCharacterReferenceMaxCodePoint = None                None

Constructor Detail

Config.CompatibilityMode

public Config.CompatibilityMode(String name)
Constructs a new CompatibilityMode with the given name.

All properties in the new instance are initially assigned their default values, which are the same as the strict rules of the XHTML compatibility mode.

Parameters:
name - the name of the new compatibility mode
Method Detail

getName

public String getName()
Returns the name of this compatibility mode.

Returns:
the name of this compatibility mode.

isFormFieldNameCaseInsensitive

public boolean isFormFieldNameCaseInsensitive()
Indicates whether form field names are treated as case insensitive.

Microsoft Internet Explorer treats field names as case insensitive, while Mozilla treats them as case sensitive.

The value of this property in the current compatibility mode affects all instances of the FormFields class. It should be set to the desired configuration before any instances of FormFields are created.

Returns:
true if form field names are treated as case insensitive, otherwise false.
See Also:
setFormFieldNameCaseInsensitive(boolean)
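
The practical difference between the two settings can be illustrated without the library. The following is a minimal, self-contained sketch and not the implementation of FormFields: the class FieldNameLookupSketch and its group method are hypothetical names, and the sketch simply assumes that case insensitive treatment means names differing only in case are collected under the same field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; this is not how FormFields is implemented.
final class FieldNameLookupSketch {
    // Groups submitted values by field name, treating the names as
    // case insensitive or case sensitive depending on the flag.
    static Map<String, List<String>> group(String[][] nameValuePairs, boolean caseInsensitive) {
        Map<String, List<String>> fields;
        if (caseInsensitive) {
            fields = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        } else {
            fields = new TreeMap<>();
        }
        for (String[] pair : nameValuePairs) {
            fields.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return fields;
    }

    public static void main(String[] args) {
        String[][] pairs = { {"Email", "a@example.com"}, {"email", "b@example.com"} };
        System.out.println(group(pairs, true).size());  // both names fall under one field
        System.out.println(group(pairs, false).size()); // the names stay separate fields
    }
}
```

In this model, a document containing controls named "Email" and "email" yields one field under the case insensitive setting and two fields under the case sensitive setting.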

setFormFieldNameCaseInsensitive

public void setFormFieldNameCaseInsensitive(boolean value)
Sets whether form field names are treated as case insensitive.

See isFormFieldNameCaseInsensitive() for the documentation of this property.

Parameters:
value - the new value of the property

getUnterminatedCharacterEntityReferenceMaxCodePoint

public int getUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue)
Returns the maximum unicode code point of an unterminated character entity reference which is to be recognised in the specified context.

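For example, if getUnterminatedCharacterEntityReferenceMaxCodePoint(true) has the value 0xFF (U+00FF) in the current compatibility mode, then an unterminated character entity reference inside an attribute value is recognised only if its code point is no greater than U+00FF.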

See the documentation of the Attribute.getValue() method for further discussion.

Parameters:
insideAttributeValue - the context within an HTML document - true if inside an attribute value or false if outside an attribute value.
Returns:
the maximum unicode code point of an unterminated character entity reference which is to be recognised in the specified context.
See Also:
setUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue, int maxCodePoint)
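
The threshold rule that this property describes can be sketched as follows. This is a minimal, self-contained model rather than the library's parser: UnterminatedEntitySketch, its small entity table and isRecognisedUnterminated are hypothetical names introduced for illustration, and the sketch assumes the maximum is inclusive and ignores any effect of the characters that follow the reference.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the rule described above; not the library's parser.
final class UnterminatedEntitySketch {
    // A few HTML character entity names and their unicode code points.
    private static final Map<String, Integer> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("gt", 0x3E);
        ENTITIES.put("nbsp", 0xA0);
        ENTITIES.put("eacute", 0xE9);
        ENTITIES.put("euro", 0x20AC);
    }

    // Returns true if the named entity, written without a terminating
    // semicolon, would be recognised under the given maximum code point.
    // Assumption: the maximum is inclusive.
    static boolean isRecognisedUnterminated(String name, int maxCodePoint) {
        Integer codePoint = ENTITIES.get(name);
        return codePoint != null && codePoint <= maxCodePoint;
    }

    public static void main(String[] args) {
        System.out.println(isRecognisedUnterminated("eacute", 0xFF)); // U+00E9 is within U+00FF
        System.out.println(isRecognisedUnterminated("euro", 0xFF));   // U+20AC exceeds U+00FF
    }
}
```

Under this model, with a maximum of 0xFF the text "&eacute" (U+00E9) is accepted while "&euro" (U+20AC) is not. With the U+003E value listed for OPERA inside attribute values, "&gt" is accepted but "&nbsp" (U+00A0) is not.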

setUnterminatedCharacterEntityReferenceMaxCodePoint

public void setUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue,
                                                                int maxCodePoint)
Sets the maximum unicode code point of an unterminated character entity reference which is to be recognised in the specified context.

See getUnterminatedCharacterEntityReferenceMaxCodePoint(boolean insideAttributeValue) for the documentation of this property.

Parameters:
insideAttributeValue - the context within an HTML document - true if inside an attribute value or false if outside an attribute value.
maxCodePoint - the maximum unicode code point.

getUnterminatedDecimalCharacterReferenceMaxCodePoint

public int getUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue)
Returns the maximum unicode code point of an unterminated decimal character reference which is to be recognised in the specified context.

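For example, if getUnterminatedDecimalCharacterReferenceMaxCodePoint(true) had the hypothetical value 0xFF (U+00FF) in the current compatibility mode, then an unterminated decimal character reference inside an attribute value would be recognised only if its code point were no greater than U+00FF.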

Parameters:
insideAttributeValue - the context within an HTML document - true if inside an attribute value or false if outside an attribute value.
Returns:
the maximum unicode code point of an unterminated decimal character reference which is to be recognised in the specified context.
See Also:
setUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue, int maxCodePoint)

setUnterminatedDecimalCharacterReferenceMaxCodePoint

public void setUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue,
                                                                 int maxCodePoint)
Sets the maximum unicode code point of an unterminated decimal character reference which is to be recognised in the specified context.

See getUnterminatedDecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue) for the documentation of this property.

Parameters:
insideAttributeValue - the context within an HTML document - true if inside an attribute value or false if outside an attribute value.
maxCodePoint - the maximum unicode code point.

getUnterminatedHexadecimalCharacterReferenceMaxCodePoint

public int getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue)
Returns the maximum unicode code point of an unterminated hexadecimal character reference which is to be recognised in the specified context.

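For example, if getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(true) had the hypothetical value 0xFF (U+00FF) in the current compatibility mode, then an unterminated hexadecimal character reference inside an attribute value would be recognised only if its code point were no greater than U+00FF.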

Parameters:
insideAttributeValue - the context within an HTML document - true if inside an attribute value or false if outside an attribute value.
Returns:
the maximum unicode code point of an unterminated hexadecimal character reference which is to be recognised in the specified context.
See Also:
setUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue, int maxCodePoint)
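
The same threshold idea applies to numeric character references, both decimal and hexadecimal, and can be sketched in a self-contained way. Everything named here is hypothetical: UnterminatedNumericReferenceSketch is not part of the library, and the ALL and NONE constants are stand-ins whose values are assumed for illustration, not taken from CODE_POINTS_ALL and CODE_POINTS_NONE. The sketch also assumes the maximum is inclusive.

```java
// Hypothetical model; the constants are assumed stand-ins, not the library's values.
final class UnterminatedNumericReferenceSketch {
    static final int ALL = Character.MAX_CODE_POINT; // assumed stand-in for "All"
    static final int NONE = -1;                      // assumed stand-in for "None"

    // Decodes a numeric character reference written without a terminating
    // semicolon, such as "&#233" or "&#xE9". Returns -1 if it is malformed.
    static int decode(String reference) {
        if (!reference.startsWith("&#") || reference.length() < 3) {
            return -1;
        }
        char marker = reference.charAt(2);
        boolean hex = marker == 'x' || marker == 'X';
        int radix = hex ? 16 : 10;
        String digits = reference.substring(hex ? 3 : 2);
        // Reject empty digit strings and leading signs, which parseInt would accept.
        if (digits.isEmpty() || Character.digit(digits.charAt(0), radix) < 0) {
            return -1;
        }
        try {
            return Integer.parseInt(digits, radix);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Returns true if the unterminated reference would be recognised
    // under the given maximum code point.
    static boolean isRecognised(String reference, int maxCodePoint) {
        int codePoint = decode(reference);
        return codePoint >= 0 && codePoint <= maxCodePoint;
    }

    public static void main(String[] args) {
        System.out.println(isRecognised("&#233", 0xFF)); // 233 is within U+00FF
        System.out.println(isRecognised("&#xE9", NONE)); // nothing is recognised under NONE
    }
}
```

In the IE mode, the value listed for hexadecimal references outside attribute values is None, which this sketch models by passing NONE so that no unterminated reference is accepted.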

setUnterminatedHexadecimalCharacterReferenceMaxCodePoint

public void setUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue,
                                                                     int maxCodePoint)
Sets the maximum unicode code point of an unterminated hexadecimal character reference which is to be recognised in the specified context.

See getUnterminatedHexadecimalCharacterReferenceMaxCodePoint(boolean insideAttributeValue) for the documentation of this property.

Parameters:
insideAttributeValue - the context within an HTML document - true if inside an attribute value or false if outside an attribute value.
maxCodePoint - the maximum unicode code point.

getDebugInfo

public String getDebugInfo()
Returns a string representation of this object useful for debugging purposes.

Returns:
a string representation of this object useful for debugging purposes.

toString

public String toString()
Returns the name of this compatibility mode.

Overrides:
toString in class Object
Returns:
the name of this compatibility mode.