Reputation: 31
I've exported my bookmarks from Firefox to an HTML file, but it's huge and complicated, so I need to remove some Firefox-specific data from it to make it lighter and plainer.
I can do basic replacements in Notepad++, but I guess I need some operators for this, and I have no idea how to make them work.
For example, here is the line from the file containing a link to Logodesignlove:
<A HREF="http://www.logodesignlove.com/" ADD_DATE="1256428672" LAST_MODIFIED="1256428672" ICON_URI="http://www.logodesignlove.com/favicon.ico" ICON="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAABDUlEQVQ4jWNgGF7gy9a9iS88Yw4803F49a6oYfHn589FGRgYGD4vWZv70iX80HMrv9MfF6zMw6r5Q/ukjkcMUv+R8TNzn+sv/eNPoou/753ZhKL5x8OHSo/Y5P+gK8SFH3Io//j+7Jk8wum79scQqxmGv2zcFQM34Ouhk96kGvBp5cZUuAGfnz8Xfcil8otoA5hl//+8cU8PJRxeJxZtJtaAlz5xJxkYGBhRDPh1/77BQ26V7wQDkFPp+9crN02xRuWnxavL8RrAIvv/8+otWXgT0/vu6ZMfMclgtZmgZrhLlm9MfSKi/Rmm+bm517VvF69ZEKUZBr68fCn+oWNK68cpC+qePXvGRZJmUgAAVs4XULOHB/oAAAAASUVORK5CYII=">Logo Design Love</A>
I need to remove all the attributes I don't care about, like LAST_MODIFIED="1256428672", ICON_URI="some URL", ICON="bunch of characters", etc. And of course I need to remove them from every link in the list.
So I was thinking of something like "find every LAST_MODIFIED="any number" and replace it with nothing", but it doesn't work.
Example of how it should look:
<A HREF="http://www.logodesignlove.com/">Logo Design Love</A>
So far I've removed the LAST_MODIFIED and ADD_DATE attributes thanks to Aleksandr; LAST_MODIFIED="\d+" worked just fine. But ICON and ICON_URI are still there. I've tried ICON="\w+", but it doesn't work. I guess it has something to do with the slashes.
Upvotes: 0
Views: 553
Reputation: 28409
Why look for what you don't want when it's easier to keep hold of what you do want and drop the junk? Search for
(<A HREF=".*?").*?(>.*?>)
and replace with
$1$2
Code edited to suit Notepad++, now that I know it doesn't need the special characters escaped. Thanks, Aleksandr.
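To check the pattern outside Notepad++, here is a minimal Python sketch of the same keep-what-you-want replacement. The sample line is a shortened stand-in for the real bookmark (the ICON value is cut down for readability, which is an assumption of this example), and Python's `re` module writes the replacement as `\1\2` rather than `$1$2`.

```python
import re

# Shortened stand-in for one bookmark line; the real ICON value is far longer.
line = (
    '<A HREF="http://www.logodesignlove.com/" ADD_DATE="1256428672" '
    'LAST_MODIFIED="1256428672" '
    'ICON_URI="http://www.logodesignlove.com/favicon.ico" '
    'ICON="data:image/png;base64,iVBORw0KGgo=">Logo Design Love</A>'
)

# Group 1 keeps the opening tag up to the end of the HREF value,
# group 2 keeps everything from the tag's closing ">" through "</A>".
pattern = re.compile(r'(<A HREF=".*?").*?(>.*?>)')

# Python uses \1\2 where Notepad++ uses $1$2.
cleaned = pattern.sub(r'\1\2', line)
print(cleaned)
```

This relies on HREF being the first attribute and on no `>` appearing inside an attribute value, which holds for the sample line but may not hold for every bookmark.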
Upvotes: 1
Reputation: 23505
Read up on regular expressions (the Java regex tutorial is a good start: http://docs.oracle.com/javase/tutorial/essential/regex/), and try one of the online regex tools to help you write and test your pattern, such as http://gskinner.com/RegExr/
E.g., remove "LAST_MODIF..." with the regex LAST_MODIFIED="\d+"
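The same idea can be stretched to the ICON attributes. `\w` only matches letters, digits and underscores, so it stops at the first `:`, `/`, `;`, `+` or `=` in a URL or base64 value; a negated class such as `[^"]*` ("anything up to the next quote") avoids that. A rough Python sketch, assuming attribute values never contain a double quote and using a shortened sample line:

```python
import re

# Shortened stand-in for one bookmark line; the real ICON value is far longer.
line = (
    '<A HREF="http://www.logodesignlove.com/" ADD_DATE="1256428672" '
    'LAST_MODIFIED="1256428672" '
    'ICON_URI="http://www.logodesignlove.com/favicon.ico" '
    'ICON="data:image/png;base64,iVBORw0KGgo=">Logo Design Love</A>'
)

# Match the leading whitespace, one of the unwanted attribute names,
# and a quoted value made of any characters except a double quote.
unwanted = re.compile(r'\s+(?:ADD_DATE|LAST_MODIFIED|ICON_URI|ICON)="[^"]*"')

cleaned = unwanted.sub('', line)
print(cleaned)
```

The pattern `\s+(?:ADD_DATE|LAST_MODIFIED|ICON_URI|ICON)="[^"]*"` should also work in Notepad++'s regular expression mode with an empty replacement, though it is worth trying on a copy of the file first.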
Otherwise, you may want an XML-specific tool, or even to write an XSLT stylesheet. However, I don't know much about that.
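As one possible parser-based route, here is a small sketch using Python's standard `html.parser` module rather than XSL (a swapped-in technique, since the bookmarks export is not guaranteed to be well-formed XML). The class name `LinkCollector` and the function `simplify` are made up for this example, and the sketch keeps only the links, so folder headings and other structure in the file would be dropped.

```python
from html import escape
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, title) pairs from <A> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only gather text while inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def simplify(html_text):
    """Return plain <A HREF="...">title</A> lines for every link found."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return [
        f'<A HREF="{escape(href)}">{escape(title)}</A>'
        for href, title in parser.links
    ]


sample = (
    '<DT><A HREF="http://www.logodesignlove.com/" ADD_DATE="1256428672" '
    'ICON="data:image/png;base64,iVBORw0KGgo=">Logo Design Love</A>'
)
print("\n".join(simplify(sample)))
```

Unlike the regex approaches, this does not depend on attribute order or on what characters the values contain.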
Upvotes: 0