Reputation: 7879
Is there a way to take an HTML file, such as https://cran.r-project.org/web/packages/tidytext/vignettes/tidytext.html, and convert it to an executable R Markdown file (.Rmd)?
Upvotes: 5
Views: 15844
Reputation: 1206
You can get a 98% result by:
To get the last 2%, you will want to ensure R code chunks are recognised:
<!-- -->
with ```{r}
and then close each code chunk with ```
And ensure data is available as required by the code. Good luck!
*To switch into visual mode for a markdown document, use the button with the compass icon at the top-right of the editor toolbar - described here: https://blog.rstudio.com/2020/09/30/rstudio-v1-4-preview-visual-markdown-editing/
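The chunk-marker step in this answer can also be scripted. What follows is only a minimal sketch, assuming that every fenced block in the file is R code and that each fence is a bare line of three backticks. The file names pasted.md and pasted.Rmd are placeholders, and the sample input is made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: one code block delimited by bare fences.
fence='```'
printf '%s\n' 'Some text.' '' "$fence" 'x <- 1 + 1' "$fence" > pasted.md

# Turn each opening fence into an R chunk header; leave closing fences alone.
# Assumption: every fenced block in the file is R code.
awk -v fence="$fence" '
  $0 == fence {
    if (in_chunk) { print fence; in_chunk = 0 }
    else          { print fence "{r}"; in_chunk = 1 }
    next
  }
  { print }
' pasted.md > pasted.Rmd

cat pasted.Rmd
```

Blocks that are not R code (shell snippets, printed output) would need to be changed back by hand afterwards.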
Upvotes: 6
Reputation: 506
:~$ ## convert .html to .md :
:~$ pandoc Assessment-Week2B.html -o Assessment-Week2B.md
:~$
:~$ ## rename .md to .Rmd (same case as the file opened below)
:~$ mv Assessment-Week2B.md Assessment-Week2B.Rmd
:~$
:~$ ## edit via RStudio
:~$ rstudio Assessment-Week2B.Rmd
I tried making the remaining edits in the terminal, but making them in RStudio is easier.
Upvotes: 1
Reputation: 71
Here is the solution I use:
pandoc ./test.html -o test.md
mv test.md test.rmd
# chunks r marker: replace ' {\.sourceCode \.r}' by '{r}'
sed -i 's/ {\.sourceCode \.r/{r/' test.rmd
# delete lines beginning with ':::'
sed -i '/^:::/d' test.rmd
# delete lines beginning '![](data:image' (html plot)
sed -i '/^\!\[\](data:image/d' test.rmd
# delete paragraph separator lines
sed -i '/^=====/d' test.rmd
sed -i '/^-----/d' test.rmd
# replace paragraph marks
#'[1]{.header-section-number}' by '#'
sed -i 's/\[[0-9]\+\]{\.header-section-number}/#/' test.rmd
#'[1.1]{.header-section-number}' by '##'
sed -i 's/\[[0-9]\+\.[0-9]\+\]{\.header-section-number}/##/' test.rmd
#'[1.1.1]{.header-section-number}' by '###'
sed -i 's/\[[0-9]\+\.[0-9]\+\.[0-9]\+\]{\.header-section-number}/###/' test.rmd
echo "$(echo -e "\n" | cat - test.rmd)" > test.rmd
echo "$(echo '---' | cat - test.rmd)" > test.rmd
echo "$(echo 'title: '\"'test'\" | cat - test.rmd)" > test.rmd
echo "$(echo '---' | cat - test.rmd)" > test.rmd
Of course, you can put these lines in a .sh script to simplify the task.
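As a rough sketch of such a script, the clean-up can be wrapped in functions that write to standard output instead of editing in place. This avoids the GNU-specific sed -i and the repeated echo/cat prepending. The function names clean_md and with_header, the file names, and the sample input are hypothetical; the patterns are taken from the commands above and assume pandoc produced those markers. The pandoc call itself is left as a comment, so the sketch runs without it.

```shell
#!/bin/sh
# clean_md FILE: apply the clean-up patterns and print the result.
clean_md() {
  sed -e 's/ {\.sourceCode \.r/{r/' \
      -e '/^:::/d' \
      -e '/^!\[](data:image/d' \
      -e '/^=====/d' \
      -e '/^-----/d' \
      "$1"
}

# with_header TITLE FILE: print a YAML header, a blank line, then the cleaned file.
with_header() {
  printf '%s\n' '---' "title: \"$1\"" '---' ''
  clean_md "$2"
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up sample; in real use this file would come from:
#   pandoc ./test.html -o test.md
fence='```'
printf '%s\n' \
  '::: {.section}' \
  'Intro text.' \
  "$fence {.sourceCode .r}" \
  'summary(cars)' \
  "$fence" \
  '![](data:image/png;base64,AAAA)' \
  ':::' > test.md

with_header "test" test.md > test.Rmd
cat test.Rmd
```

Writing to a new file also keeps the pandoc output untouched, so the patterns can be adjusted and rerun if the first attempt removes too much.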
Upvotes: 7
Reputation: 368241
In short, no.
The pandoc binary is almost pure awesomeness, and I use it e.g. to convert the HTML output from an Rd file back into markdown (to be included in other markdown documents).
But that uses pandoc for what it knows: converting from markdown to HTML etc. pandoc itself knows nothing about R. So apart from the metaphysical difficulty of getting the code back from the output it created, you have a tool mismatch.
So in sum: you probably want the original source code, as you cannot recreate the Rmd from the HTML output it produces.
Upvotes: 5
Reputation: 269586
If a markdown file (.md) is sufficient, then download and install pandoc if you don't already have it. Then run this from the command line, or use system("pandoc ...") or shell("pandoc ...") from within R.
pandoc https://cran.r-project.org/web/packages/tidytext/vignettes/tidytext.html -o out.md
For a particular file, it would be possible to post-process the source code and output sections, but that would take some additional effort, possibly substantial.
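One small piece of that post-processing can be sketched: dropping the printed results so that only prose and source code remain. knitr prefixes printed output with ## by default, and the sketch assumes pandoc writes those results as indented lines beginning with ##. That layout, the file name out_code_only.md, and the sample content are assumptions to check against the real out.md.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up excerpt standing in for the out.md produced by pandoc.
fence='```'
printf '%s\n' \
  "$fence r" \
  '1 + 1' \
  "$fence" \
  '' \
  '    ## [1] 2' \
  '' \
  'More prose.' > out.md

# Drop lines that look like knitr output: four spaces of indent, then "## ".
# Markdown headings start in column 0, so they are left alone.
grep -v '^    ## ' out.md > out_code_only.md

cat out_code_only.md
```

Plots, tables, and chunk options such as echo=FALSE leave no such marker, so they would still need manual attention.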
Upvotes: 5