|
Memory error March 02, 2011 01:15PM |
Registered: 4 years ago Posts: 62 |
We get a memory exhausted error using the standalone version.
It reports the memory was used up on this code block:
foreach ($elements as $i => $x) {
    $elements[$i] = true;
    if (empty($i)) unset($elements[$i]); // remove blank
}
On this line:
$elements[$i] = true;
Any ideas on things to check?
|
Re: Memory error March 02, 2011 05:00PM |
Admin Registered: 6 years ago Posts: 2,640 |
|
Re: Memory error March 02, 2011 06:59PM |
Registered: 4 years ago Posts: 62 |
|
Re: Memory error March 03, 2011 12:23PM |
Admin Registered: 6 years ago Posts: 2,640 |
|
Re: Memory error March 03, 2011 12:26PM |
Registered: 4 years ago Posts: 62 |
Hello,
Well we implemented it "application wide" for security purposes.
I read your documentation on speeding up HTML Purifier and I got a general sense that you do not recommend running "purify" on every output variable? I understand this may cause a slow down, but would it also cause this memory error?
I made sure that we only create one HTML purifier object and use it for all the purification.
|
Re: Memory error March 03, 2011 12:28PM |
Admin Registered: 6 years ago Posts: 2,640 |
Though HTML Purifier is kind of slow and uses lots of memory, its steady-state memory usage should not be more than 1M (I can, for example, run the test suite no problem with only 1M of memory). So if you are indeed seeing 32M being used up, something is off, or it's not HTML Purifier's fault (but we just happened to try to grab the last bit of memory :-)
|
Re: Memory error March 03, 2011 12:37PM |
Registered: 4 years ago Posts: 62 |
I did a simple test in the application.
With htmlpurifier+flashcompat+safeobject memory usage of a sample page in the application was 26.9M.
With htmlpurifier without those settings enabled, memory usage was 8.7M.
Is there any type of debug code I can enable or anything I can check?
Keep in mind this is with multiple calls to the purifier method. Here is our config:
$this->config = HTMLPurifier_Config::createDefault();
$this->config->set('Core', 'Encoding', CHARSET);
$this->config->set('HTML', 'Doctype', 'HTML 4.01 Transitional');
$this->config->set('Cache', 'DefinitionImpl', null);
$this->config->set('HTML', 'TidyLevel', 'none');
//$this->config->set('Output','FlashCompat',true);
//$this->config->set('HTML.SafeObject', true);
$this->config->set('HTML', 'AllowedElements', null);
$this->config->set('HTML', 'AllowedAttributes', null);
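One thing worth checking in the config above: as far as I know, setting Cache.DefinitionImpl to null turns the definition cache off, so every new config object has to rebuild its HTML definition from scratch. Below is a rough sketch of the same setup with the serializer cache left on. It uses the dotted directive names that the commented-out HTML.SafeObject line already uses. The 'UTF-8' value and the cache path are placeholders I made up, not values from this thread.

```php
<?php
// Sketch only: assumes HTML Purifier is already loaded,
// e.g. via HTMLPurifier.standalone.php.
$config = HTMLPurifier_Config::createDefault();
$config->set('Core.Encoding', 'UTF-8');                 // placeholder for CHARSET
$config->set('HTML.Doctype', 'HTML 4.01 Transitional');
$config->set('HTML.TidyLevel', 'none');

// Cache.DefinitionImpl is deliberately NOT set to null here, so the
// default serializer cache stays active.
// Hypothetical path: the directory must exist and be writable by PHP.
$config->set('Cache.SerializerPath', '/path/to/writable/cache');

$config->set('HTML.SafeObject', true);
$config->set('Output.FlashCompat', true);

// Build the purifier once and reuse it for every purify() call.
$purifier = new HTMLPurifier($config);
```

If the cache directory is not writable, the definitions may be rebuilt on every request, which would show up as the constructor lines in the profiler being hit many times.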
|
Re: Memory error March 03, 2011 12:39PM |
Admin Registered: 6 years ago Posts: 2,640 |
|
Re: Memory error March 03, 2011 03:19PM |
Registered: 5 years ago Posts: 204 |
Well we implemented it "application wide" for security purposes.
What exactly do you mean by this, and how are you using it?
An example of the HTML you are trying to purify would help narrow it down.
I read your post as saying that you purify the whole page source before it is output to the browser.
|
Re: Memory error March 03, 2011 03:28PM |
Registered: 4 years ago Posts: 62 |
|
Re: Memory error March 03, 2011 03:49PM |
Registered: 5 years ago Posts: 204 |
ok :)
An example of the HTML that causes the memory jump would be needed to see if we can replicate the issue.
Personally, having used both the standalone and the full package, I have found the standalone version to be less of a resource hog on the server.
Our CMS uses it for HTML content, and I haven't noticed this increase when we use SafeObject and FlashCompat on our system. We did have to introduce a minimum memory spec in our newer versions: 16mb was enough, but we kept getting errors in some instances with 8mb. With more tweaking and code restructuring in our latest CMS, which removed a lot of SQL queries etc., we have now got the minimum down to about 10mb.
But I've never seen anyone have an issue like that with a limit above 16mb.
incidentally though,
$this->config->set('HTML', 'AllowedElements', null);
$this->config->set('HTML', 'AllowedAttributes', null);
Are you actually allowing any attributes/elements, or are they null all the time? I've never tried using null in the AllowedElements directive like that.
|
Re: Memory error March 03, 2011 03:54PM |
Registered: 4 years ago Posts: 62 |
|
Re: Memory error March 03, 2011 04:32PM |
Registered: 4 years ago Posts: 62 |
Our profiler is showing these two lines:
$elements[$i] = true;
if (empty($i)) unset($elements[$i]); // remove blank
Inside the constructor of this class: class HTMLPurifier_ChildDef_Required extends HTMLPurifier_ChildDef
Are getting hit 28041 times. This seems like an awful lot?
As far as timings go these are the second largest time consumer in our application other than the inclusion of the HTMLPurifier.standalone.php file.
The third largest is:
if ($required = (strpos($def_i, '*') !== false)) {
In:
public function expandIdentifiers(&$attr, $attr_types) {
In class HTMLPurifier_AttrCollections.
Does this help at all?
|
Re: Memory error March 03, 2011 05:51PM |
Admin Registered: 6 years ago Posts: 2,640 |
It's been a while since I've last profiled HTML Purifier, so it could very well be an inefficiency. However, constructing HTML definitions is pretty resource intensive work and we cache the results, so 28041 doesn't actually seem that large.
However, it seems to me that you are instantiating a lot of HTMLDefinitions. How many different configurations are you using, and is your caching working?
|
Re: Memory error March 03, 2011 05:58PM |
Registered: 5 years ago Posts: 204 |
|
Re: Memory error March 03, 2011 06:24PM |
Registered: 4 years ago Posts: 62 |
It's been a while since I've last profiled HTML Purifier, so it could very well be an inefficiency. However, constructing HTML definitions is pretty resource intensive work and we cache the results, so 28041 doesn't actually seem that large.
However, it seems to me that you are instantiating a lot of HTMLDefinitions. How many different configurations are you using, and is your caching working?
We technically have 2 configurations. One is no tags/attributes allowed (null). The other is a fairly small subset of HTML tags/attributes. The set we profiled with is smaller than the set you allow here on the forums.
For caching when exactly should we be clearing the cache? Any time our allowed tags/attributes change? It doesn't cache the filtered HTML does it?
I fixed some caching issues and this dramatically dropped those two lines of code.
Now the line of code eating the most time is:
return unserialize(file_get_contents($file));
Which looks to be from the serializer because I fixed caching. I assume not much can be done about that.
However this line is next in line for speed:
list($ns, $key) = explode('.', $name, 2);
In HTMLPurifier.standalone.php
I notice in the comments you say:
/**
* Retrieves all directives, organized by namespace
* @warning This is a pretty inefficient function, avoid if you can
*/
How can this be avoided? Checking the code I don't see how it can be avoided?
|
Re: Memory error March 03, 2011 06:26PM |
Registered: 4 years ago Posts: 62 |
|
Re: Memory error March 03, 2011 06:28PM |
Admin Registered: 6 years ago Posts: 2,640 |
|
Re: Memory error March 03, 2011 06:34PM |
Admin Registered: 6 years ago Posts: 2,640 |
By default, your cache is stored in library/HTMLPurifier/DefinitionCache/Serializer/HTML. Let me know how many files are there; it will give me a sense for how many different HTML definitions you are using.
Not really sure about memory profiler; I recall using xdebug with some success in the past.
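If counting the files by hand is awkward, something like the following could do it. This is only a sketch: countCacheFiles is a made-up helper name, not part of HTML Purifier, and the path you pass in is whatever your cache directory actually is.

```php
<?php
/**
 * Counts the regular files in each immediate subdirectory of $cacheRoot.
 * Returns an array like ['HTML' => 4, 'URI' => 1].
 * Hypothetical helper, not part of HTML Purifier.
 */
function countCacheFiles(string $cacheRoot): array
{
    $counts = [];
    // glob() may return false on error, so fall back to an empty list.
    foreach (glob($cacheRoot . '/*', GLOB_ONLYDIR) ?: [] as $dir) {
        $files = array_filter(glob($dir . '/*') ?: [], 'is_file');
        $counts[basename($dir)] = count($files);
    }
    return $counts;
}

// Usage (path is an example, adjust to your install):
// print_r(countCacheFiles('library/HTMLPurifier/DefinitionCache/Serializer'));
```

A count that keeps growing between requests would suggest that more distinct configurations are being created than intended.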
|
Re: Memory error March 04, 2011 12:27PM |
Registered: 4 years ago Posts: 62 |
Hi,
The cache has two folders: HTML and URI.
URI has one file. HTML has 4 files.
The timing issue above was just for your reference, mainly because your comment on that function said "This is a pretty inefficient function, avoid if you can". That made me think there was something I could do to avoid it, but it appears to be essential for the operation of htmlpurifier.
As far as the main issue goes (memory), it seems that every single call I make to the purify method increases the memory a little bit. It is as if something is getting stored in a class variable which never gets reset or cleared for the next call to purify. Any ideas? I guess I could always destroy the object after every call to purify and create a new one, but this seems inefficient as well.
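One way to confirm the "grows a little on every call" suspicion is to measure it directly with PHP's built-in memory_get_usage(). The sketch below is generic: memoryGrowth is a made-up helper name, and the purify() call in the usage comment assumes a $purifier and $html that already exist in your application.

```php
<?php
/**
 * Runs $fn once as a warm-up, then $iterations more times, and returns
 * how many bytes memory_get_usage() grew across those iterations.
 * Hypothetical helper for diagnosis only.
 */
function memoryGrowth(callable $fn, int $iterations): int
{
    // Warm-up call so one-time setup (loading definitions, caches)
    // is not counted as per-call growth.
    $fn(0);
    $before = memory_get_usage();
    for ($i = 1; $i <= $iterations; $i++) {
        $fn($i);
    }
    return memory_get_usage() - $before;
}

// Usage sketch (assumes $purifier and $html are defined elsewhere):
// $bytes = memoryGrowth(function () use ($purifier, $html) {
//     $purifier->purify($html);
// }, 100);
// echo "Growth over 100 calls: {$bytes} bytes\n";
```

Running it once with SafeObject on and once with it off should show whether the growth is per call or a one-time cost of building the definition.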
|
Re: Memory error March 04, 2011 01:29PM |
Registered: 4 years ago Posts: 62 |
|
Re: Memory error March 18, 2011 01:12AM |
Registered: 4 years ago Posts: 62 |
Hi,
Going back to this... we no longer have issues with the timings, just the huge memory consumption.
It is narrowed down to this line:
$this->config->set('HTML','SafeObject', true);
If we comment out this line, memory consumption is around 9.1mb.
If we turn on:
$this->config->set('HTML','SafeObject', true);
We hit the 32mb limit we have in place for testing.
Any ideas on what to check?
|
Re: Memory error March 18, 2011 02:10AM |
Registered: 4 years ago Posts: 62 |
I ended up saving each configuration object in a class variable which is an array.
I did this after reading: http://htmlpurifier.org/phorum/read.php?3,4718
Where you mentioned: "It is also likely that if you are using multiple configurations, they are all being loaded into memory and not being freed in interest of keeping the purifier “hot”; you could try reconducting the test with another random HTML directive to see if this is the case."
We still save each configuration in the class variable, so it still eats up memory, but for some reason once we did this the memory stopped getting gobbled up. I don't really have an explanation for it, as our code before did not do much differently.
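For anyone reading later, the "array of configs in a class variable" idea could look roughly like this. It is a sketch, not our actual code: ConfigRegistry and the factory callable are made-up names, and the factory is where you would call HTMLPurifier_Config::createDefault() and set your directives.

```php
<?php
/**
 * Builds each named configuration at most once and hands back the same
 * object on later requests. Hypothetical class, not part of HTML Purifier.
 */
final class ConfigRegistry
{
    /** @var array<string, object> configs keyed by name */
    private $configs = [];

    /** @var callable builds a config object for a given name */
    private $factory;

    public function __construct(callable $factory)
    {
        $this->factory = $factory;
    }

    public function get(string $name)
    {
        if (!isset($this->configs[$name])) {
            // First request for this name: build and remember it.
            $this->configs[$name] = ($this->factory)($name);
        }
        return $this->configs[$name];
    }
}

// Usage sketch (assumes HTML Purifier is loaded; 'strict' is an example name):
// $registry = new ConfigRegistry(function ($name) {
//     $config = HTMLPurifier_Config::createDefault();
//     // ... set directives for $name here ...
//     return $config;
// });
// $clean = $purifier->purify($html, $registry->get('strict'));
```

The point of the design is that a config is never rebuilt for a name it has already seen, so the number of live config objects stays equal to the number of distinct configurations.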
To confirm, is this the best way to get the config object?
$config = HTMLPurifier_Config::createDefault();
|
Re: Memory error March 18, 2011 07:57AM |
Registered: 5 years ago Posts: 204 |
|
Re: Memory error March 18, 2011 12:02PM |
Registered: 4 years ago Posts: 62 |
|
Re: Memory error March 18, 2011 05:25PM |
Admin Registered: 6 years ago Posts: 2,640 |