Reputation: 2732
I need to pull content from the database and show it on the page, but some of this content contains a whole HTML page, with CSS, head, etc.
What would be the best way to avoid pulling in all the HTML tags, scripts, and CSS? Would an iframe help here?
The most annoying thing is that I'm getting strange characters on the page: �. As I found out, this is due to mismatched encodings.
The site uses UTF-8, and if the content is in a different encoding, these characters appear and I cannot replace them. The only thing that removed them was changing my site's encoding, but that is not a real solution.
If someone could tell me how to remove them, that would be great.
Solution: with your help I checked the encoding, but couldn't change it. I ran SET NAMES through mysql_query to switch the connection to UTF-8, and stripped the unneeded tags. Now it seems OK. Thanks to all of you.
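For anyone landing here later, a minimal sketch of the tag-stripping step, under some assumptions: the helper name extract_body_fragment and the list of allowed tags are made up for illustration and are not the exact code used on the site. It drops script and style elements with their contents, keeps only what is inside body when a body element exists, and then removes every tag outside a small allowlist.

```php
<?php
// Hypothetical helper (name and allowlist are assumptions, not the original code).
function extract_body_fragment(string $html): string
{
    // Drop <script> and <style> elements together with their contents.
    $clean = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html);
    if ($clean !== null) {
        $html = $clean;
    }

    // If the content is a full page, keep only what is inside <body>.
    if (preg_match('#<body\b[^>]*>(.*?)</body>#is', $html, $m)) {
        $html = $m[1];
    }

    // Remove every remaining tag except a small allowlist (adjust as needed).
    return trim(strip_tags($html, '<p><br><a><b><i><strong><em><ul><ol><li>'));
}

$page = '<html><head><style>p{color:red}</style></head>'
      . '<body><script>alert(1)</script><p>Hello <span>world</span></p></body></html>';
echo extract_body_fragment($page), "\n";
```

Regular expressions are a rough tool for HTML, so treat this as a starting point; content without a body element keeps the text of head elements such as title.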
Upvotes: 1
Views: 94
Reputation: 2539
The � characters might not in fact be due to the page encoding; the problem might be the content that is stored in the database. Check for curly double quotes like “ that are supposed to be plain ", especially if the data in the table was copied and pasted.
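A small sketch of that check, assuming the stored text is already valid UTF-8 and that you do want plain ASCII quotes in the output. The function name normalize_quotes is hypothetical.

```php
<?php
// Hypothetical helper: replace typographic quotes with their ASCII equivalents.
// Assumes $text is valid UTF-8; the "\u{...}" escapes need PHP 7 or later.
function normalize_quotes(string $text): string
{
    $map = [
        "\u{201C}" => '"', // left double quotation mark
        "\u{201D}" => '"', // right double quotation mark
        "\u{2018}" => "'", // left single quotation mark
        "\u{2019}" => "'", // right single quotation mark
    ];
    return strtr($text, $map);
}

echo normalize_quotes("\u{201C}quoted\u{201D} and it\u{2019}s"), "\n";
```

If the pasted quotes were stored as single bytes from a Windows code page instead, this replacement will not match them, and the text needs an encoding conversion first.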
Upvotes: 1
Reputation: 2332
I think you have no option apart from an ugly iframe. As for encoding, you should check the database encoding and the connection encoding, and convert as needed. Use iconv for full control over the conversion, for example:
$html = iconv("UTF-8", "ISO-8859-15//TRANSLIT//IGNORE", $html);
In this case, you're going to lose some characters not mapped in ISO-8859-15. Consider moving your whole site to UTF-8 encoding.
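If the site stays on UTF-8, the conversion runs the other way: content that is not valid UTF-8 gets converted into UTF-8 before output. A hedged sketch follows; the helper name to_utf8 is made up, and the CP1252 source charset is only a guess that you should replace with whatever encoding your data really uses.

```php
<?php
// Hypothetical helper: return $text as UTF-8, converting only when needed.
// $assumedSource is a guess (an assumption), not something detected from the data.
function to_utf8(string $text, string $assumedSource = 'CP1252'): string
{
    // preg_match() with the /u modifier fails on invalid UTF-8,
    // so an empty pattern works as a cheap validity check.
    if (preg_match('//u', $text) === 1) {
        return $text; // already valid UTF-8, leave untouched
    }

    // iconv() returns false when the input has bytes undefined in the source charset.
    $converted = iconv($assumedSource, 'UTF-8', $text);
    return $converted === false ? $text : $converted;
}

// Bytes 0x93 and 0x94 are curly quotes in CP1252, 0xE9 is e-acute.
echo to_utf8("\x93caf\xE9\x94"), "\n";
```

This only fixes the content side. The connection charset still has to match as well, for example through SET NAMES or, with mysqli, mysqli_set_charset().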
Upvotes: 2