HTMLPurifier 4.4.0
HTMLPurifier_Lexer_DirectLex Class Reference

Our in-house implementation of a parser.

Inheritance diagram for HTMLPurifier_Lexer_DirectLex:
HTMLPurifier_Lexer

Public Member Functions

 tokenizeHTML ($html, $config, $context)
 Lexes an HTML string into tokens.
 parseAttributeString ($string, $config, $context)
 Takes the inside of an HTML tag and makes an assoc array of attributes.

Public Attributes

 $tracksLineNumbers = true
 Whether or not this lexer implements line-number/column-number tracking.

Protected Member Functions

 scriptCallback ($matches)
 Callback function for script CDATA fudge.
 substrCount ($haystack, $needle, $offset, $length)
 PHP 5.0.x compatible substr_count that implements offset and length.

Protected Attributes

 $_whitespace = "\x20\x09\x0D\x0A"
 Whitespace characters for strspn() and strcspn().

Detailed Description

Our in-house implementation of a parser.

A pure PHP parser, DirectLex has absolutely no dependencies, making it a reasonably good default for PHP4. Written with efficiency in mind, it can be four times faster than HTMLPurifier_Lexer_PEARSax3, although it pales in comparison to HTMLPurifier_Lexer_DOMLex.

Todo:
Reread XML spec and document differences.

Definition at line 13 of file DirectLex.php.


Member Function Documentation

HTMLPurifier_Lexer_DirectLex::parseAttributeString ( $string,
$config,
$context
)

Takes the inside of an HTML tag and makes an assoc array of attributes.

Parameters:
$string  Inside of tag excluding name.
Returns:
Assoc array of attributes.

Definition at line 342 of file DirectLex.php and at line 15266 of file HTMLPurifier.standalone.php.

References $config and HTMLPurifier_Lexer::parseData().

Referenced by tokenizeHTML().
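The attribute parsing described above can be illustrated with a simplified, standalone sketch. This is not the library's implementation: the function name parse_attribute_string_sketch() and the regular expression are hypothetical, entity decoding (which the library delegates to HTMLPurifier_Lexer::parseData()) is omitted, and giving a bare attribute its own name as its value is an assumption made for the example.

```php
<?php
// Hypothetical simplified parser, NOT the library's implementation.
// Handles name="value", name='value', name=value, and bare names.
function parse_attribute_string_sketch($string)
{
    $attrs = array();
    // Group 1: name. Groups 2-4: double-quoted, single-quoted, unquoted value.
    $pattern = '/([^\s=\/"\'<>]+)(?:\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s"\'>]+)))?/';
    if (!preg_match_all($pattern, $string, $matches, PREG_SET_ORDER)) {
        return $attrs;
    }
    foreach ($matches as $m) {
        $name = strtolower($m[1]);
        // Assumption: a bare attribute takes its own name as its value.
        $value = $name;
        if (count($m) > 2) {
            // Trailing unmatched groups are dropped from $m, so the last
            // element is the value group that actually matched.
            $value = $m[count($m) - 1];
        }
        $attrs[$name] = $value;
    }
    return $attrs;
}

$attrs = parse_attribute_string_sketch('href="a.html" title=\'Hi there\' width=10 disabled');
print_r($attrs);
```

The input is the text between the tag name and the closing angle bracket, matching the "inside of tag excluding name" contract documented for $string.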

HTMLPurifier_Lexer_DirectLex::scriptCallback ( $matches ) [protected]

Callback function for script CDATA fudge.

Parameters:
$matches  In the form of array(opening tag, contents, closing tag).

Definition at line 27 of file DirectLex.php and at line 14951 of file HTMLPurifier.standalone.php.
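A minimal sketch of how such a callback might be used with preg_replace_callback(). The function name script_callback_sketch() and the pattern are assumptions for illustration, not taken from this page. Note that preg_replace_callback() places the full match at index 0, so the opening tag, contents, and closing tag sit at indexes 1 to 3.

```php
<?php
// Hypothetical standalone callback; the documented method is protected.
// $matches[1] = opening tag, $matches[2] = contents, $matches[3] = closing tag.
function script_callback_sketch($matches)
{
    // Escape the script body so a later lexing pass treats it as plain text.
    return $matches[1] . htmlspecialchars($matches[2], ENT_COMPAT, 'UTF-8') . $matches[3];
}

// Assumed pattern for illustration: opening tag, body, closing tag.
$pattern = '#(<script[^>]*>)(.*?)(</script>)#si';
$html = '<script type="text/javascript">if (a < b) { run(); }</script>';
$fudged = preg_replace_callback($pattern, 'script_callback_sketch', $html);
echo $fudged, "\n";
```

Escaping the body keeps a bare "<" inside the script from being mistaken for the start of a tag.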

HTMLPurifier_Lexer_DirectLex::substrCount ( $haystack,
$needle,
$offset,
$length
) [protected]

PHP 5.0.x compatible substr_count that implements offset and length.

Definition at line 323 of file DirectLex.php and at line 15247 of file HTMLPurifier.standalone.php.

Referenced by tokenizeHTML().
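One way to get offset and length behaviour without relying on the extra substr_count() arguments (available only from PHP 5.1) is to slice the string first. The helper below is a sketch under that assumption; its name substr_count_compat() is hypothetical and it is not the documented protected method.

```php
<?php
// Hypothetical standalone helper, not the library's protected method.
// Counts occurrences of $needle within the $length bytes of $haystack
// that start at $offset.
function substr_count_compat($haystack, $needle, $offset, $length)
{
    // Slice first, then count, so only the two-argument form is needed.
    $slice = (string) substr($haystack, $offset, $length);
    return substr_count($slice, $needle);
}

$html = "<p>\nline two\n</p>\n";
// Count newlines in the first four bytes, e.g. to advance a line counter.
$newlines = substr_count_compat($html, "\n", 0, 4);
echo $newlines, "\n";
```

On current PHP versions the result should agree with calling substr_count() directly with the same offset and length.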

HTMLPurifier_Lexer_DirectLex::tokenizeHTML ( $html,
$config,
$context
)

Lexes an HTML string into tokens.

Parameters:
$html  String HTML.
Returns:
HTMLPurifier_Token array representation of HTML.

Reimplemented from HTMLPurifier_Lexer.

Definition at line 31 of file DirectLex.php and at line 14955 of file HTMLPurifier.standalone.php.

References $config, $html, HTMLPurifier_Lexer::normalize(), parseAttributeString(), HTMLPurifier_Lexer::parseData(), and substrCount().


Member Data Documentation

HTMLPurifier_Lexer_DirectLex::$_whitespace = "\x20\x09\x0D\x0A" [protected]

Whitespace characters for strspn() and strcspn().

Definition at line 21 of file DirectLex.php.
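A short sketch of how a whitespace mask like this is typically used with strspn() and strcspn(). The variable names and the sample input are assumptions for illustration only.

```php
<?php
// Same four characters as the documented default:
// space, tab, carriage return, line feed.
$whitespace = "\x20\x09\x0D\x0A";

$inside = "  \t href=\"x\"\n alt";
// strspn(): length of the leading run made up only of whitespace.
$skip = strspn($inside, $whitespace);
// strcspn(): length of the run, starting at $skip, that contains no whitespace.
$wordLength = strcspn($inside, $whitespace, $skip);
$word = substr($inside, $skip, $wordLength);
echo $word, "\n";
```

Scanning with the two functions avoids a regular expression when stepping over whitespace and reading the next run of non-whitespace bytes.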

HTMLPurifier_Lexer_DirectLex::$tracksLineNumbers = true

Whether or not this lexer implements line-number/column-number tracking.

If it does, set to true.

Reimplemented from HTMLPurifier_Lexer.

Definition at line 16 of file DirectLex.php.
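As a rough illustration of what line and column tracking involves, the sketch below derives a position from a byte offset. The helper name line_and_column_sketch() is hypothetical, and the conventions used (1-based lines, 0-based columns, "\n" line endings) are assumptions rather than documented behaviour of this class.

```php
<?php
// Hypothetical helper: returns array(line, column) for a byte offset.
// Assumes "\n" line endings, 1-based lines, and 0-based columns.
function line_and_column_sketch($html, $offset)
{
    $before = (string) substr($html, 0, $offset);
    // Every newline before the offset starts a new line.
    $line = 1 + substr_count($before, "\n");
    // The column is the distance from the last newline before the offset.
    $lastNewline = strrpos($before, "\n");
    $column = ($lastNewline === false) ? $offset : $offset - $lastNewline - 1;
    return array($line, $column);
}

$html = "<p>\n  <b>bold</b>\n</p>";
list($line, $column) = line_and_column_sketch($html, strpos($html, '<b>'));
echo $line, ':', $column, "\n";
```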


The documentation for this class was generated from the following files: