Reputation: 1759
I have imported all my WordPress content and now I want to replace all images with a placeholder image. The most obvious way, I think, is to search and replace all the image URLs. I tried to do it manually, but the file is big enough to make me rethink this.
This is an example of a WordPress exported XML file: https://wpcom-themes.svn.automattic.com/demo/theme-unit-test-data.xml
I would like to replace all image URLs with a placehold.it URL (http://placehold.it/).
I am using the Sublime Text editor. Is there a regex to find all the image URLs in an XML file? I am really not very good with regex.
Thanks in advance!
Upvotes: 2
Views: 724
Reputation: 12709
regex:
(\<img\s+.*?src\s*=\s*)(?|"(.*?)"|\'(.*?)\')(.*?\/?\>)
replacement:
$1"http://placehold.it/"$3
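The (?|...) branch-reset group lets the double-quoted and single-quoted alternatives share group 2, so the rest of the tag is always $3. If your editor supports regex search and replace, use the pattern above; otherwise, in PHP: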
$string = preg_replace( '/(\<img\s+.*?src\s*=\s*)(?|"(.*?)"|\'(.*?)\')(.*?\/?\>)/is', '$1"http://placehold.it/"$3', $string );
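For anyone who would rather script this than run it in an editor, here is a rough Python sketch of the same idea. It is an adaptation, not a direct port: Python's re module has no branch-reset group, so the opening quote is captured and matched again with a backreference, which shifts the group numbers. The names IMG_SRC, PLACEHOLDER and replace_img_src are made up for this example.

```python
import re

# Assumption: Python's re module lacks the (?|...) branch-reset group, so the
# opening quote is captured as group 2 and matched again with \2.
# Group 1 = everything up to the src value, group 4 = the rest of the tag.
IMG_SRC = re.compile(
    r"""(<img\s+[^>]*?src\s*=\s*)(["'])(.*?)\2([^>]*/?>)""",
    re.IGNORECASE | re.DOTALL,
)

# Placeholder URL taken from the question.
PLACEHOLDER = "http://placehold.it/"


def replace_img_src(xml_text: str) -> str:
    """Swap the src value of every <img> tag for the placeholder URL."""
    return IMG_SRC.sub(
        lambda m: f'{m.group(1)}"{PLACEHOLDER}"{m.group(4)}',
        xml_text,
    )
```

The function takes the export file's contents as a string and returns the rewritten string; text without any img tags comes back unchanged.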
Upvotes: 1
Reputation: 9530
A simple regex to replace all image src attributes with some placeholder text would be:
Search for:
<img (.*?)src=".*?"
Replace with:
<img $1src="http://example.com"
If you want to use the placehold.it URL sized to each image's own width and height, search for:
<img (.*?)src=".*?"(.*?)width="(\d+)" height="(\d+)"
Replace with:
<img $1src="http://placehold.it/$3x$4"$2width="$3" height="$4"
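If you want to try the dimension-based search and replace outside the editor, a minimal Python sketch might look like the following. The search pattern is copied from above; the replacement uses Python's \g<N> syntax instead of $N. The names IMG_WITH_SIZE and to_placeholder are hypothetical, and the sketch assumes width comes directly before height inside the tag, as the pattern requires.

```python
import re

# Same search pattern as above:
# group 1 = attributes before src, group 2 = attributes between src and width,
# group 3 = width in pixels, group 4 = height in pixels.
IMG_WITH_SIZE = re.compile(
    r'<img (.*?)src=".*?"(.*?)width="(\d+)" height="(\d+)"'
)


def to_placeholder(xml_text: str) -> str:
    """Point each sized <img> at a placehold.it URL of the same dimensions."""
    return IMG_WITH_SIZE.sub(
        r'<img \g<1>src="http://placehold.it/\g<3>x\g<4>"'
        r'\g<2>width="\g<3>" height="\g<4>"',
        xml_text,
    )
```

An img tag that has no width and height attributes is not matched on its own, so it keeps its original src.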
Explanation:
.*? means 0 or more characters
\d+ means 1 or more digits
( and ) capture the contents of the parentheses and save it to $1, $2, $3, etc.
<img (.*?)src captures any characters between <img and src and saves them in $1, so if there is a class attribute, an ID, or anything like that, it will be saved as $1. .*? can also match nothing, so $1 can also be blank.
width="(\d+)" captures the digits that give the image width, and saves them to $3 (since it's the third set of parentheses in that regular expression).
Upvotes: 3