nu.validator.htmlparser.impl
Class NormalizationChecker

java.lang.Object
  extended by nu.validator.htmlparser.impl.NormalizationChecker
All Implemented Interfaces:
CharacterHandler

public final class NormalizationChecker
extends Object
implements CharacterHandler

Version:
$Id: NormalizationChecker.java 155 2007-09-13 11:54:32Z hsivonen $
Author:
hsivonen

Field Summary
private  boolean alreadyComplainedAboutThisRun
          Indicates whether the current run has already caused an error.
private  boolean atStartOfRun
          Indicates whether the next call to characters() is the first call in a run.
private  char[] buf
          A buffer for holding sequences that overlap the SAX buffer boundary.
private  char[] bufHolder
          A holder for the original buffer (for the memory leak prevention mechanism).
private static com.ibm.icu.text.UnicodeSet COMPOSING_CHARACTERS
          A thread-safe set of composing characters as per Charmod Norm.
private  ErrorHandler errorHandler
           
private  Locator locator
           
private  int pos
          The current used length of the buffer, i.e. the index of the first slot that does not hold current data.
 
Constructor Summary
NormalizationChecker(Locator locator)
          Constructs a checker that uses the given locator when emitting errors.
 
Method Summary
private  void appendToBuf(char[] ch, int start, int end)
          Appends a slice of a UTF-16 code unit array to the internal buffer.
 void characters(char[] ch, int start, int length)
           
 void end()
           
 void err(String message)
          Emit an error.
private  void errAboutTextRun()
          Emits an error stating that the current text run or the source text is not in NFC.
private static boolean isComposingChar(int c)
          Returns true if the argument is a composing character and false otherwise.
private static boolean isComposingCharOrSurrogate(char c)
          Returns true if the argument is a composing BMP character or a surrogate and false otherwise.
 void setErrorHandler(ErrorHandler errorHandler)
           
 void start()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

errorHandler

private ErrorHandler errorHandler

locator

private Locator locator

COMPOSING_CHARACTERS

private static final com.ibm.icu.text.UnicodeSet COMPOSING_CHARACTERS
A thread-safe set of composing characters as per Charmod Norm.


buf

private char[] buf
A buffer for holding sequences that overlap the SAX buffer boundary.


bufHolder

private char[] bufHolder
A holder for the original buffer (for the memory leak prevention mechanism).


pos

private int pos
The current used length of the buffer, i.e. the index of the first slot that does not hold current data.


atStartOfRun

private boolean atStartOfRun
Indicates whether the next call to characters() is the first call in a run.


alreadyComplainedAboutThisRun

private boolean alreadyComplainedAboutThisRun
Indicates whether the current run has already caused an error.

Constructor Detail

NormalizationChecker

public NormalizationChecker(Locator locator)
Constructs a checker that uses the given locator when emitting errors.

Parameters:
locator - the locator used when emitting errors
Method Detail

err

public void err(String message)
         throws SAXException
Emits an error. The locator is used.

Parameters:
message - the error message
Throws:
SAXException - if something goes wrong

isComposingCharOrSurrogate

private static boolean isComposingCharOrSurrogate(char c)
Returns true if the argument is a composing BMP character or a surrogate and false otherwise.

Parameters:
c - a UTF-16 code unit
Returns:
true if the argument is a composing BMP character or a surrogate and false otherwise
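
The two predicates above can be illustrated with a minimal sketch. The documented class consults COMPOSING_CHARACTERS, an ICU UnicodeSet built as per Charmod Norm; the sketch below is an assumption that substitutes the general categories from java.lang.Character (Mn, Mc, Me) as a rough stand-in, so its results may differ from the real set. The class name ComposingSketch is hypothetical.

```java
public class ComposingSketch {

    // Approximation only: treats combining marks (Mn, Mc, Me) as composing
    // characters. The documented class uses an ICU UnicodeSet instead.
    public static boolean isComposingChar(int c) {
        int type = Character.getType(c);
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    // A lone UTF-16 code unit cannot identify a supplementary character,
    // so any surrogate is conservatively reported as true.
    public static boolean isComposingCharOrSurrogate(char c) {
        if (Character.isSurrogate(c)) {
            return true;
        }
        return isComposingChar(c);
    }
}
```

Reporting every surrogate as true keeps the per-code-unit check cheap; a caller that needs an exact answer can decode the full code point and call isComposingChar(int).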

isComposingChar

private static boolean isComposingChar(int c)
Returns true if the argument is a composing character and false otherwise.

Parameters:
c - a Unicode code point
Returns:
true if the argument is a composing character and false otherwise

start

public void start()
Specified by:
start in interface CharacterHandler
See Also:
CharacterHandler.start()

characters

public void characters(char[] ch,
                       int start,
                       int length)
                throws SAXException
Specified by:
characters in interface CharacterHandler
Throws:
SAXException
See Also:
CharacterHandler.characters(char[], int, int)

errAboutTextRun

private void errAboutTextRun()
                      throws SAXException
Emits an error stating that the current text run or the source text is not in NFC.

Throws:
SAXException - if the ErrorHandler throws

appendToBuf

private void appendToBuf(char[] ch,
                         int start,
                         int end)
Appends a slice of a UTF-16 code unit array to the internal buffer.

Parameters:
ch - the array from which to copy
start - the index of the first element that is copied
end - the index of the first element that is not copied
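
The half-open slice convention above (start inclusive, end exclusive) can be sketched as follows. This is not the library's implementation: the growth policy, the initial capacity, the class name BufferSketch and the helper contents() are all assumptions made for illustration. Only the buf and pos field names mirror the documented fields.

```java
import java.util.Arrays;

public class BufferSketch {
    private char[] buf = new char[4]; // initial capacity is an arbitrary choice
    private int pos = 0;              // index of the first slot without current data

    // Copies ch[start] .. ch[end - 1] to the end of the internal buffer.
    public void appendToBuf(char[] ch, int start, int end) {
        if (start == end) {
            return; // empty slice, nothing to copy
        }
        int needed = pos + (end - start);
        if (needed > buf.length) {
            // Assumed growth policy: at least double, or exactly what is needed.
            buf = Arrays.copyOf(buf, Math.max(needed, buf.length * 2));
        }
        System.arraycopy(ch, start, buf, pos, end - start);
        pos = needed;
    }

    // Hypothetical helper for inspecting the buffered code units.
    public String contents() {
        return new String(buf, 0, pos);
    }
}
```

Note that end is an index, not a length, which differs from the (start, length) convention of characters().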

end

public void end()
         throws SAXException
Specified by:
end in interface CharacterHandler
Throws:
SAXException
See Also:
CharacterHandler.end()
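
The start(), characters() and end() calls delimit one text run, and a run may arrive in several characters() calls. The sketch below shows why text spanning a call boundary has to be considered together. It is a deliberately simplified stand-in, not the documented algorithm: it buffers the whole run and checks it with java.text.Normalizer at end(), whereas the documented class buffers only sequences that overlap the SAX buffer boundary and uses ICU. The class name NfcRunSketch and the sawError() accessor are hypothetical.

```java
import java.text.Normalizer;

public class NfcRunSketch {
    private final StringBuilder run = new StringBuilder();
    private boolean sawError;

    // Begins a new run and forgets any previous state.
    public void start() {
        run.setLength(0);
        sawError = false;
    }

    // Collects one chunk of the run; chunks may split a combining sequence.
    public void characters(char[] ch, int start, int length) {
        run.append(ch, start, length);
    }

    // Checks the complete run once all chunks have been seen.
    public void end() {
        if (!Normalizer.isNormalized(run, Normalizer.Form.NFC)) {
            sawError = true;
        }
    }

    public boolean sawError() {
        return sawError;
    }
}
```

With this sketch, a run delivered as "e" followed by U+0301 in a second call is flagged, even though neither chunk contains a complete decomposed sequence on its own.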

setErrorHandler

public void setErrorHandler(ErrorHandler errorHandler)