toni rmc

Reputation: 878

PHP HTML Decode

I'm forced to work with HTML that looks like this:

<font color=3D=22=236EAED2=22 style=3D=22font-siz=
e:13px;font-family:arial;=22><B>Some Info Here</B></font></=
A></td></tr><tr><td><font color=3D=22=23FFFFFF=22 style=3D=22font-size:11=
px;font-family:arial;=22>192 Wellington Parade =7C Melbourne =7C  VIC =7C=
Australia 3002</font></td></tr><tr><td><font color=3D=22=23FFFFFF=22 st=
yle=3D=22font-size:11px;font-family:arial;=22>T: 61-<a href=3D=22=23=22 s=
tyle=3D=22color:=23FFFFFF; text-decoration:none=22>

It looks like " gets converted to =22, and so on. There are also other codes such as =7C and =3D, a trailing = at the end of every line, and so on.

&nbsp; is =26nbsp;

Is there any function or technique for restoring this to proper HTML?

Thank you.

Upvotes: 2

Views: 858

Answers (3)

Gumbo

Reputation: 655239

That’s the quoted-printable encoding, which can be decoded with quoted_printable_decode.

Like so:

$input = "the quoted-printable string that you need to decode";
$output = quoted_printable_decode($input);
echo $output;
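As a concrete illustration, here is a minimal sketch that runs quoted_printable_decode over a short fragment written in the style of the question's input. The sample string is made up for the example; it contains =XX escapes and one trailing = soft line break that splits a word across two lines.

```php
<?php
// Made-up fragment in the style of the question's HTML: "=XX" hex escapes
// plus a trailing "=" (soft line break) that splits "font-size" in two.
$input = "<font color=3D=22=23FFFFFF=22 style=3D=22font-siz=\r\n"
       . "e:13px;=22>A =7C B =26nbsp;</font>";

// Decodes the escapes and joins the lines split by the soft line break.
$output = quoted_printable_decode($input);

echo $output, PHP_EOL;
// <font color="#FFFFFF" style="font-size:13px;">A | B &nbsp;</font>
```

Note that =26nbsp; comes back as &nbsp;, which is already valid HTML, so no further entity handling is needed for display.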

Upvotes: 3

Pratik Joshi

Reputation: 11693

Use quoted_printable_decode("your string to decode"), or imap_qprint("your string to decode") if the IMAP extension is available.


Description: quoted_printable_decode converts a quoted-printable string to an 8-bit string.

This function returns an 8-bit binary string corresponding to the decoded quoted-printable string (according to RFC 2045, section 6.7, not RFC 2821, section 4.5.2, so additional periods are not stripped from the beginning of a line).
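A small sketch of how the two functions mentioned above might be used side by side. It assumes nothing beyond core PHP for quoted_printable_decode; the imap_qprint call is guarded with function_exists because that function only exists when the IMAP extension is loaded. The sample string is taken from the style of the question's input.

```php
<?php
// Sample in the style of the question's input.
$encoded = "T: 61-<a href=3D=22=23=22>";

// Always available: part of PHP's standard string functions.
$decoded = quoted_printable_decode($encoded);
echo $decoded, PHP_EOL; // T: 61-<a href="#">

// Only defined when the IMAP extension is loaded, so guard the call.
if (function_exists('imap_qprint')) {
    $viaImap = imap_qprint($encoded);
    echo $viaImap, PHP_EOL;
}
```

Since quoted_printable_decode needs no extension, it is probably the simpler choice unless the IMAP functions are already in use elsewhere.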


Upvotes: 4

vcanales

Reputation: 1828

It looks like they're simply replacing % with =. You say that =22 gets translated into ", and %22 is the URL encoding of ".

Look here for a chart, and you can probably figure out a way to replace the characters with PHP.

Also, quoted_printable_decode and htmlspecialchars could be of some use.
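For comparison, here is a rough sketch of the "swap = for % and URL-decode" idea next to quoted_printable_decode. The sample string is made up; it includes a trailing = line continuation like the ones in the question. The =XX escapes decode the same way under both approaches, but the line continuation is only handled by the dedicated decoder.

```php
<?php
// Made-up sample: a trailing "=" line continuation and one "=7C" escape.
$input = "font-siz=\r\ne:11px =7C VIC";

// Idea from this answer: treat "=" as "%" and URL-decode the result.
// The "=7C" escape decodes, but the "=" before the line break is left
// behind as a stray "%" and the line break stays in place.
$viaUrl = rawurldecode(str_replace('=', '%', $input));

// Dedicated decoder: also removes the "=" + line break pair.
$viaQp = quoted_printable_decode($input);

echo $viaQp, PHP_EOL;          // font-size:11px | VIC
var_dump($viaUrl === $viaQp);  // bool(false)
```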

Upvotes: 1
