Reputation: 219
I have a table with a bunch of text and URL's.
I need to have any string (URL) that has the format http://anything.anything.anything
and replace it with http://www.google.com
so any and all URL's get replaced with the latter.
I have tried online but I could not find anything. This should be a mysql command that I can run at certain times to replace any and all URL's.
Upvotes: 0
Views: 277
Reputation:
There are things that you can do. Realize that this is far from optimal - I can heard the DBAs out there screaming at the prospect of doing an rlike
within a join condition.
All of this is in sqlfiddle: http://sqlfiddle.com/#!2/a3ea9/1/0
Consider the tables and data of:
create table data (
url varchar(255)
);
create table repls (
regex varchar(255),
repl varchar(255)
);
insert into data values
("http://example.com"),
("http://google.com"),
("http://stackoverflow.com"),
("http://space.stackexchange.com");
insert into repls values
('.*ex.*', 'http://www.yahoo.com'),
('.*google.*', 'http://www.bing.com'),
('.*stack.*', 'http://programmers.stackexchange.com');
You've got a table of urls, and a table of regexes and the replacements.
Then, with a 'simple' query in a select statement...
select
D.url,
R.repl
from
data D
join repls R on (
D.url rlike R.regex
)
We can get all of the urls and what they are replaced with. However, there's a gotcha that I embedded in there.
With this data and query it returns 5 rows:
url repl http://www.example.com http://www.yahoo.com http://google.com http://www.bing.com http://stackoverflow.com http://programmers.stackexchange.com http://space.stackexchange.com http://www.yahoo.com http://space.stackexchange.com http://programmers.stackexchange.com
The first thing to note here is that this is a single replacement. It doesn't do a replacement until nothing matches anymore. This is a good thing in this case because we'd have an infinite loop with http://programmers.stackexchange.com
triggering on the .*stack.*
rule (it would also trigger on the .*ex.*
rule too, but that's not an infinite loop because the replacement there doesn't contain ex
).
Second thing of note is that something that matches two rows will be in there twice. You would need additional logic to handle precedence of rules.
Lastly, this is a wholesale replacement. There are no fancy pattern matching groups so that the additional path information on some site gets moved into the replaced string's text.
Its probably possible to do all these things, but it gets quite a bit more involved.
For doing this work, it is probably best to pull the data
down and process it in a language that can handle the changes.
On the other hand, if you wanted instead to do a wholesale replacement of the data with an update statement (rather than pulling it down and reissuing the associated update statements), or if the system the data is moving to is more cumbersome and error prone than the above example, this might be an approach to investigate.
Upvotes: 1
Reputation: 2888
You want to do a Regular Expression replace, which is not a MySQL native function although there have been various questions about this before.
How to do a regular expression replace in MySQL?
If you're not limited to using just MySQL, a relatively clean approach would be to write a bash script that pulls a list of all text fields with URL's in your table along with sufficient key information, do the RegEx in bash (perhaps via Perl or some other script language, if necessary), and then update back to MySQL.
Upvotes: 2