Reputation: 3555
I have some text in which I want to replace relative file names with actual links.
The text looks like this:
Some text here
[...]
- CRAN Task View: [Bayesian](Bayesian.html)
- CRAN Task View: [Cluster](Cluster.html)
- CRAN Task View: [Databases](Databases.html)
- CRAN Task View: [Environmetrics](Environmetrics.html)
[...]
End of text here
But as you can see, the links are relative file names rather than full URLs. E.g., Bayesian.html should be http://cran.rstudio.com/web/views/Bayesian.html
The final result should be
Some text here
[...]
- CRAN Task View: [Bayesian](http://cran.rstudio.com/web/views/Bayesian.html)
- CRAN Task View: [Cluster](http://cran.rstudio.com/web/views/Cluster.html)
- CRAN Task View: [Databases](http://cran.rstudio.com/web/views/Databases.html)
- CRAN Task View: [Environmetrics](http://cran.rstudio.com/web/views/Environmetrics.html)
[...]
End of text here
So far, I was able to "subset" my text file using the following command:
grep "CRAN Task View: \[" $FILE
But when I try to pipe to this:
sed -e 's|\\([a-zA-Z]*\\)\\.html|http://cran.rstudio.com/web/views/\\1.html|'
It doesn't work. How would it be possible to sed inline from the grep command?
I'm on macOS Mojave.
Upvotes: 1
Views: 164
Reputation: 785481
This sed should work for you:
sed -E '/CRAN Task View:/s~\(([^)]+)\)~(http://cran.rstudio.com/web/views/\1)~' file
Some text here
[...]
- CRAN Task View: [Bayesian](http://cran.rstudio.com/web/views/Bayesian.html)
- CRAN Task View: [Cluster](http://cran.rstudio.com/web/views/Cluster.html)
- CRAN Task View: [Databases](http://cran.rstudio.com/web/views/Databases.html)
- CRAN Task View: [Environmetrics](http://cran.rstudio.com/web/views/Environmetrics.html)
[...]
End of text here
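If the goal is to rewrite the file itself rather than print the result, a minimal sketch could look like the following. The scratch file `views.md` is a hypothetical stand-in for `$FILE` from the question, and the `-i.bak` form (suffix attached, no space) is used on the assumption that it behaves the same in both the BSD sed shipped with macOS and GNU sed.

```shell
# Hypothetical scratch file standing in for $FILE from the question.
workdir=$(mktemp -d)
file="$workdir/views.md"
printf '%s\n' \
  'Some text here' \
  '- CRAN Task View: [Bayesian](Bayesian.html)' \
  '- CRAN Task View: [Cluster](Cluster.html)' \
  'End of text here' > "$file"

# -i.bak edits the file in place and keeps the original as views.md.bak.
sed -E -i.bak '/CRAN Task View:/s~\(([^)]+)\)~(http://cran.rstudio.com/web/views/\1)~' "$file"

cat "$file"
```

The backup file can be deleted once the result has been checked.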
RegEx Details:
/CRAN Task View:/ : Only if the line matches the text "CRAN Task View:"
s~ : Substitute
\( : Match a (
([^)]+) : Match 1+ non-) characters in capture group #1
\) : Match a )
(http://cran.rstudio.com/web/views/\1) : The replacement, which creates a link using back-reference #1
Upvotes: 4
Reputation: 27245
sed -e 's|\\([a-zA-Z]*\\)\\.html|http://cran.rstudio.com/web/views/\\1.html|'
It doesn't work.
This is a quoting issue. Inside single quotes '...', backslashes \ need no escaping. Bash parses '\\(' as \\( and sends it to sed, which interprets it as the literal string \(. Therefore, you are replacing the literal string "\(someLetters\)\.html", which never occurs in your file.
You probably meant sed 's|\([a-zA-Z]*\)\.html|http://cran.rstudio.com/web/views/\1.html|'.
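The difference can be checked on a single line. This is a small sketch that runs the doubled-backslash command from the question and the single-backslash version side by side; the variable names `line`, `broken`, and `fixed` are made up for the illustration.

```shell
line='- CRAN Task View: [Bayesian](Bayesian.html)'

# Doubled backslashes: sed searches for a literal "\(" in the input,
# which is not there, so the line passes through unchanged.
broken=$(printf '%s\n' "$line" |
  sed -e 's|\\([a-zA-Z]*\\)\\.html|http://cran.rstudio.com/web/views/\\1.html|')

# Single backslashes: \( \) form a capture group and \. is a literal dot.
fixed=$(printf '%s\n' "$line" |
  sed -e 's|\([a-zA-Z]*\)\.html|http://cran.rstudio.com/web/views/\1.html|')

printf '%s\n' "$broken" "$fixed"
```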
By the way: sed can also do the grep part for you. Also, with -E you need fewer backslashes. But since you append the .html again, you don't need the group \(....\) in the first place.
sed -E -n '/CRAN Task View: \[/s|[a-zA-Z]*\.html|http://cran.rstudio.com/web/views/&|p'
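Note that -n together with the trailing p prints only the lines where the substitution happened, which mirrors the grep pipeline. To get the whole text as in the desired result, one option is to drop both. A small sketch comparing the two, using a shortened sample input (the variable names are made up for the illustration):

```shell
input='Some text here
- CRAN Task View: [Bayesian](Bayesian.html)
End of text here'

# With -n and the p flag: only the substituted lines are printed.
only_matches=$(printf '%s\n' "$input" |
  sed -E -n '/CRAN Task View: \[/s|[a-zA-Z]*\.html|http://cran.rstudio.com/web/views/&|p')

# Without -n and p: every line is printed, matching lines are rewritten.
whole_text=$(printf '%s\n' "$input" |
  sed -E '/CRAN Task View: \[/s|[a-zA-Z]*\.html|http://cran.rstudio.com/web/views/&|')

printf '%s\n' "$only_matches" '---' "$whole_text"
```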
Upvotes: 1