Data not cleaned August 12, 2011 11:57AM |
Registered: 1 year ago Posts: 13 |
Hey,
I tried the HTML Purifier demo and it was exactly what I needed. I copied the text displayed in my web browser (a bunch of data mixed in with HTML tags, all of it on the web page) and pasted it into the demo. It came out clean, just as I needed. I then downloaded the lite version and followed all the instructions. I got no errors from the libraries, but when I echoed the result to the browser, the data was displayed but remained uncleaned. Am I doing something wrong?
This is the code:
include('/Users/teddy/Desktop/htmlpurifier-4-1.3.0-lite/library/HTMLPurifier.auto.php');
$clean_html = $purifier->purify ( $man); //$man contains the data that needs to be cleaned
echo $clean_html;
|
Re: Data not cleaned August 13, 2011 11:58PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 08:28PM |
Registered: 1 year ago Posts: 13 |
include('simple_html_dom.php');
include('/Users/teddy/Desktop/htmlpurifier-4-1.3.0-lite/library/HTMLPurifier.auto.php');
$html = file_get_html('http://www.lottolore.com/lotto649.html');
foreach($html->find('table[cellpadding=2]') as $e)
{
for ($i=0; $i < sizeof($e->innertext); $i++)
{
$test[$i]= $e->innertext; $a = htmlentities($e->innertext);
$file= file_get_contents($a);
$bob=preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>', $a);
}
}
echo $bob;
//Thank you
|
Re: Data not cleaned August 15, 2011 08:29PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 08:59PM |
Registered: 1 year ago Posts: 13 |
Sorry, I didn't include what I had already posted the first time around. This is the code from when everything worked (it still works with HTML Purifier, the output just isn't cleaned).
Here is the whole thing:
include('simple_html_dom.php');
include('/Users/teddy/Desktop/htmlpurifier-4-1.3.0-lite/library/HTMLPurifier.auto.php');
$html = file_get_html('http://www.lottolore.com/lotto649.html');
foreach($html->find('table[cellpadding=2]') as $e)
{
for ($i=0; $i < sizeof($e->innertext); $i++)
{
$test[$i]= $e->innertext; $a = htmlentities($e->innertext);
$file= file_get_contents($a);
$man=preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>', $a);
}
}
echo $man;
$clean_html = $purifier->purify ( $man); //$man contains the data that needs to be cleaned
echo $clean_html;
//Thank you
|
Re: Data not cleaned August 15, 2011 09:03PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 09:07PM |
Registered: 1 year ago Posts: 13 |
Ugh, sorry, my mistake. I am modifying the code on the fly to try to make it work.
Here is the whole REAL thing:
include('simple_html_dom.php');
include('/Users/teddy/Desktop/htmlpurifier-4-1.3.0-lite/library/HTMLPurifier.auto.php');
$html = file_get_html('http://www.lottolore.com/lotto649.html');
foreach($html->find('table[cellpadding=2]') as $e)
{
for ($i=0; $i < sizeof($e->innertext); $i++)
{
$test[$i]= $e->innertext; $a = htmlentities($e->innertext);
$file= file_get_contents($a);
$man=preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>', $a);
}
}
$clean_html = $purifier->purify ( $man); //$man contains the data that needs to be cleaned
echo $clean_html; // $clean_html should contain clean data but it does not
//Thank you
|
Re: Data not cleaned August 15, 2011 09:08PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 09:22PM |
Registered: 1 year ago Posts: 13 |
|
Re: Data not cleaned August 15, 2011 09:30PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 09:36PM |
Registered: 1 year ago Posts: 13 |
<tr align="center">
  <td colspan="5"><font size="4"><a name="past"><b>Past <font color="#FF0000">Lotto 6/49</font> Winning Numbers</a></b></font></td>
</tr>
<tr align="center">
  <td><a href="lotto649.html"><b>Latest</b></a></td>
  <td><a href="l6490811.html"><b>Aug 11</b></a></td>
  <td><a href="l6490711.html"><b>Jul 11</b></a></td>
  <td><a href="l6490611.html"><b>Jun 11</b></a></td>
  <td><a href="l6490511.html"><b>May 11</b></a></td>
</tr>
<tr align="center">
  <td><a href="l6490411.html"><b>Apr 11</b></a></td>
  <td><a href="l6490311.html"><b>Mar 11</b></a></td>
  <td><a href="l6490211.html"><b>Feb 11</b></a></td>
  <td><a href="l6490111.html"><b>Jan 11</b></a></td>
  <td><a href="l6491210.html"><b>Dec 10</b></a></td>
</tr>
<tr align="center">
  <td><a href="l6491110.html"><b>Nov 10</b></a></td>
  <td><a href="l6491010.html"><b>Oct 10</b></a></td>
  <td><a href="l6490910.html"><b>Sep 10</b></a></td>
  <td><a href="l6490810.html"><b>Aug 10</b></a></td>
  <td><a href="l6490710.html"><b>Jul 10</b></a></td>
</tr>
|
Re: Data not cleaned August 15, 2011 09:41PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 09:45PM |
Registered: 1 year ago Posts: 13 |
If you look at my prior post with the real complete code, you will see it is defined there.
Example from my code:
//some code before
$test[$i]= $e->innertext; $a = htmlentities($e->innertext);
$file= file_get_contents($a);
$man=preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>', $a);
}
}
$clean_html = $purifier->purify ( $man); //here the purifier object is defined
echo $clean_html; // $clean_html should contain clean data but it does not
|
Re: Data not cleaned August 15, 2011 09:46PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 09:50PM |
Registered: 1 year ago Posts: 13 |
I have not written that line. I thought it was declared with
$clean_html = $purifier->purify ( $man);
What needs to go after $purifier = ?
I have posted my code exactly as it is in the .php file; I have not changed it for any reason. I assume my problem lies in the $purifier declaration you mentioned.
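
For anyone following along: calling purify() only uses the $purifier object, it does not create it, so the object has to be constructed first. The following is a minimal sketch of that step, not the admin's exact answer. It assumes the library path quoted in this thread; make_purifier() is a hypothetical helper written for this example (it is not part of HTML Purifier), and the $dirty sample string is made up. The helper also checks that the file exists, because include only raises a warning when the path is wrong.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): loads the library from
// the given path and returns a ready-to-use purifier, or null if the
// autoload file cannot be found at that path.
function make_purifier($autoloadPath)
{
    if (!is_file($autoloadPath)) {
        return null; // wrong path: nothing to load
    }
    require_once $autoloadPath;

    // Default configuration, then the object that purify() is called on.
    $config = HTMLPurifier_Config::createDefault();
    return new HTMLPurifier($config);
}

// Assumption: this is the path used in the posts above; adjust as needed.
$path = '/Users/teddy/Desktop/htmlpurifier-4-1.3.0-lite/library/HTMLPurifier.auto.php';
$purifier = make_purifier($path);

if ($purifier === null) {
    echo "HTMLPurifier.auto.php not found, check the include path.\n";
} else {
    // Made-up input for illustration only.
    $dirty = '<b onclick="alert(1)">Latest</b>';
    echo $purifier->purify($dirty), "\n";
}
```

Creating the purifier once, outside any loop, and reusing it for every purify() call avoids rebuilding the configuration on each iteration.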
|
Re: Data not cleaned August 15, 2011 10:09PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 10:15PM |
Registered: 1 year ago Posts: 13 |
No. For some reason I removed it and did not put it back after I was done meddling with the code trying to get it to work.
I added it back and now NOTHING comes up in the web browser when I echo $clean_html. I even enabled error reporting at the very top and still nothing comes up. I also tried putting the declaration in different places in the code (though I think a global declaration would have been good enough).
Any other suggestions?
|
Re: Data not cleaned August 15, 2011 10:16PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 10:38PM |
Registered: 1 year ago Posts: 13 |
<?
error_reporting(E_ALL);ini_set('display_errors', 1);
include('simple_html_dom.php');
include('/Users/teddy/Desktop/htmlpurifier-4-1.3.0-lite/library/HTMLPurifier.auto.php');
$html = file_get_html('http://www.lottolore.com/lotto649.html');
foreach($html->find('table[cellpadding=2]') as $e)
{
for ($i=0; $i < sizeof($e->innertext); $i++)
{
$test[$i]= $e->innertext;
$a = htmlentities($e->innertext);
$file= file_get_contents($a);
$jim=preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>', $a);
$purifier= new HTMLPurifier();
$clean_html = $purifier->purify ($jim)
echo $clean_html;
}
}
?>
Thanks a bunch.
|
Re: Data not cleaned August 15, 2011 10:40PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 10:53PM |
Registered: 1 year ago Posts: 13 |
Yes, I did not catch that, thanks. I fixed it, but it is still not working. Give this a go if you can.
<?
error_reporting(E_ALL);ini_set('display_errors', 1);
include('simple_html_dom.php');
include('/Users/teddy/Desktop/htmlpurifier-4-1.3.0-lite/library/HTMLPurifier.auto.php');
$html = file_get_html('http://www.lottolore.com/lotto649.html');
foreach($html->find('table[cellpadding=2]') as $e)
{
for ($i=0; $i < sizeof($e->innertext); $i++)
{
$test[$i]= $e->innertext;
$a = htmlentities($e->innertext);
$jim=preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>', $a);
$purifier= new HTMLPurifier();
$clean_html = $purifier->purify ($jim);
echo $clean_html;
}
}
?>
|
Re: Data not cleaned August 15, 2011 11:04PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 15, 2011 11:11PM |
Registered: 1 year ago Posts: 13 |
I first need to retrieve the code from another website. The code comes back jumbled up, with a bunch of HTML tags mixed into what I need. I then do some more cleaning within the PHP to get rid of data I don't require. HTML Purifier was supposed to just clean out the HTML code that I have fetched so it becomes readable. Do you have any suggestions regarding what I just said? Do you think the problem lies here?
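
One thing worth checking in the pipeline described here is the htmlentities() call that runs before the regex and before purify(). The sketch below uses only built-in PHP functions and one cell copied from the sample output posted earlier in the thread. It shows that once htmlentities() has run, no literal "<" is left, so the tag-stripping regex has nothing to match, and any later step (HTML Purifier included, as far as I can tell) receives plain text rather than markup. The last line uses strip_tags(), a built-in that is a different and much cruder technique than HTML Purifier, only to illustrate what "readable" text could look like.

```php
<?php
// One cell taken from the sample output posted in this thread.
$raw = '<td><a href="l6490811.html"><b>Aug 11</b></a></td>';

// The regex from the posted code: keeps tag names, drops attributes.
$pattern = "/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i";

// What the posted code does: escape first, then run the regex.
$escaped = htmlentities($raw);
$fromEscaped = preg_replace($pattern, '<$1$2>', $escaped);

// Same regex on the unescaped markup.
$fromRaw = preg_replace($pattern, '<$1$2>', $raw);

echo $escaped, "\n";
// &lt;td&gt;&lt;a href=&quot;l6490811.html&quot;&gt;&lt;b&gt;Aug 11&lt;/b&gt;&lt;/a&gt;&lt;/td&gt;

echo $fromEscaped === $escaped ? "regex changed nothing\n" : "regex changed something\n";
// regex changed nothing

echo $fromRaw, "\n";
// <td><a><b>Aug 11</b></a></td>

echo strip_tags($raw), "\n";
// Aug 11
```

If the goal is only the text between the tags, working on the unescaped string is the part that matters. Note that strip_tags() is not a sanitizer for untrusted HTML; that remains the job of a tool like HTML Purifier.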
|
Re: Data not cleaned August 15, 2011 11:13PM |
Admin Registered: 6 years ago Posts: 2,632 |
|
Re: Data not cleaned August 16, 2011 07:50AM |
Registered: 1 year ago Posts: 13 |
|
Re: Data not cleaned August 16, 2011 08:35AM |
Admin Registered: 6 years ago Posts: 2,632 |