Tyler Rinker

Reputation: 110024

Replace all text URLs with HTML links

I have a text file containing URLs that I'd like to replace with <a> tags that open in a new tab. I'm converting the .txt file to a .md file and want clickable links.

Below I show (1) an MWE, (2) the desired output, and (3) my initial attempt at a function (I assume this will/could use gsub and sprintf):

MWE:

x <- c("content here: http://stackoverflow.com/", 
    "still more", 
    "http://www.talkstats.com/ but also http://www.r-bloggers.com/", 
    "http://htmlpreview.github.io/?https://github.com/h5bp/html5-boilerplate/blob/master/404.html"
)

Desired output:

> x
[1] "content here: <a href="http://stackoverflow.com/" target="_blank">http://stackoverflow.com/</a>"                                                     
[2] "still more"                                                                                  
[3] "<a href="http://www.talkstats.com/" target="_blank">http://www.talkstats.com/</a> but also <a href="http://www.r-bloggers.com/" target="_blank">http://www.r-bloggers.com/</a>"                               
[4] "<a href="http://htmlpreview.github.io/?https://github.com/h5bp/html5-boilerplate/blob/master/404.html" target="_blank">http://htmlpreview.github.io/?https://github.com/h5bp/html5-boilerplate/blob/master/404.html</a>"

Initial attempt to solve:

repl <- function(x) sprintf("<a href=\"%s\" target=\"_blank\">%s</a>", x, x)
gsub("http.", repl(), x)

There are two corner cases with using "http.\\s" as the regex: the URL may not be followed by a space (as in x[3], where the second URL ends the string), and a URL may itself contain a second http, which I'd want to parse only once (as seen in x[4]).

PLEASE NOTE THAT R's REGEX IS SPECIFIC TO R;
ANSWERS FROM OTHER LANGUAGES ARE NOT LIKELY TO WORK

Upvotes: 2

Views: 106

Answers (1)

janos

Reputation: 124784

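This works with your sample x, using your repl function (repl('\\1') is evaluated before gsub runs, so it just builds the replacement string around the \\1 backreference):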

gsub("(http://[^ ]*)", repl('\\1'), x)

or without your repl method:

gsub("(http://[^ ]*)", '<a href="\\1" target="_blank">\\1</a>', x)
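Note that the pattern above only matches URLs starting with http:// and treats only a literal space as the end of a URL. If the file might also contain https:// links, or URLs followed by tabs, a slightly more general variant could look like the sketch below. The name linkify is a hypothetical helper, not part of the original answer, and the sketch assumes a URL runs up to the next whitespace character.

```r
# Sample input from the question
x <- c("content here: http://stackoverflow.com/",
    "still more",
    "http://www.talkstats.com/ but also http://www.r-bloggers.com/",
    "http://htmlpreview.github.io/?https://github.com/h5bp/html5-boilerplate/blob/master/404.html"
)

# Hypothetical helper (an assumption, not from the original answer):
# wraps each http:// or https:// URL in an <a> tag that opens in a new tab.
linkify <- function(text) {
    # https?          matches "http" or "https"
    # [^[:space:]]+   consumes everything up to the next whitespace character,
    #                 so the URL nested inside x[4] is part of one single match
    gsub("(https?://[^[:space:]]+)",
         "<a href=\"\\1\" target=\"_blank\">\\1</a>",
         text)
}

out <- linkify(x)
cat(out, sep = "\n")
```

One limitation of stopping only at whitespace: punctuation directly after a URL (for example a trailing period or closing parenthesis) would end up inside the link, so prose-heavy files may need a stricter pattern.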

Upvotes: 5
