|
Need short call to remove all HTML TAG because of memory problem April 07, 2011 11:15AM |
Registered: 2 years ago Posts: 3 |
Hi,
I want to remove all HTML tags from HTML pages.
I'd like to know if there is a better way than this call:
require_once('htmlpurifier/library/HTMLPurifier.auto.php');
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Allowed', ''); // Allow nothing (dotted key; the three-argument form is deprecated)
$purifier = new HTMLPurifier($config);
return $purifier->purify($html);
I get :
Fatal error: Allowed memory size of 52428800 bytes exhausted (tried to allocate 71 bytes) in /home/httpd/htdocs/lib/htmlpurifier-4.3.0/library/HTMLPurifier/Lexer/DOMLex.php on line 177
Call Stack:
89.4199 15980456 1. scanWords->extractText() /home/httpd/htdocs/test/scanWords.php:287
89.4343 16653936 2. HTMLPurifier->purify() /home/httpd/htdocs/test/scanWords.php:648
89.4351 16668952 3. HTMLPurifier_Lexer_DOMLex->tokenizeHTML() /home/httpd/htdocs/lib/htmlpurifier-4.3.0/library/HTMLPurifier.php:179
91.2438 18272472 4. HTMLPurifier_Lexer_DOMLex->tokenizeDOM() /home/httpd/htdocs/lib/htmlpurifier-4.3.0/library/HTMLPurifier/Lexer/DOMLex.php:70
91.7585 52386104 5. HTMLPurifier_Lexer_DOMLex->createEndNode() /home/httpd/htdocs/lib/htmlpurifier-4.3.0/library/HTMLPurifier/Lexer/DOMLex.php:105
The page tested was 580 KB in size. My admin team does not want to change the PHP memory limit.
So is there a lighter way to call Purifier and get the same result (text only)?
Any ideas are welcome.
Thanks in advance.
|
Re: Need short call to remove all HTML TAG because of memory problem April 07, 2011 11:29AM |
Admin Registered: 6 years ago Posts: 2,640 |
|
Re: Need short call to remove all HTML TAG because of memory problem April 07, 2011 11:41AM |
Registered: 2 years ago Posts: 3 |
Ambush Commander said:
striptags and then htmlentities.
No... I want scripts and other malformed tags to be removed correctly, the way Purifier does it perfectly. PHP's strip_tags function is so buggy! I can't use it...
What I really want to know is whether there is an option to skip the filters, for example, or to call only the core tag-cleaning routine in Purifier (to use less memory). This tool is so good... and works better than the PHP functions.
Please, you might know this...
|
Re: Need short call to remove all HTML TAG because of memory problem April 07, 2011 11:46AM |
Admin Registered: 6 years ago Posts: 2,640 |
You're running out of memory in the tokenization stage, so it's the internal representation of the HTML that's killing you. You might have some luck setting %Core.LexerImpl to DirectLex, or try using ini_set to bump the memory limit, but otherwise, you're out of luck.
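For reference, here is a minimal sketch of what those two suggestions could look like, reusing the include path and the allow-nothing setting from the first post. The sample $html string and the 128M value are placeholders, not recommendations, and whether DirectLex actually fits a 580 KB page into the existing limit is something you would have to test.

```php
<?php
// Include path as given in the first post; adjust it to your install.
require_once 'htmlpurifier/library/HTMLPurifier.auto.php';

// Optional, and only if the host permits it: raise the limit for this script only.
// '128M' is an arbitrary example value.
// ini_set('memory_limit', '128M');

$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Allowed', '');            // allow no elements, so only text remains
$config->set('Core.LexerImpl', 'DirectLex'); // use the string-based lexer instead of DOMLex

$purifier = new HTMLPurifier($config);

// Placeholder input; in the real script this would be the page source.
$html = '<p>Hello <b>world</b></p>';
$text = $purifier->purify($html);
```

DOMLex builds a full DOM tree before producing tokens, which is where the stack trace shows the memory running out; DirectLex tokenizes the string directly, so it may behave differently on large inputs, but that is not guaranteed.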
|
Re: Need short call to remove all HTML TAG because of memory problem April 07, 2011 01:58PM |
Registered: 2 years ago Posts: 3 |