Tyler Rinker

Reputation: 109874

Search for packages by a particular author

Sometimes I get accustomed to a particular R package's design and want to search CRAN for all packages by that author (let's use Hadley Wickham for instance). How can I do such a search (I'd like to use R but this doesn't have to be the mode of search)?

Upvotes: 18

Views: 2820

Answers (4)

Ben Bolker

Reputation: 226247

Adapted from "available.packages by publication date":

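## restrict to the first 100 packages (in alphabetical order)
pkgs <- unname(available.packages()[, "Package"])[1:100]
## URL of each package's DESCRIPTION file on the configured repository
desc_urls <- paste0(options("repos")$repos, "/web/packages/", pkgs,
    "/DESCRIPTION")
## one request per package: read each DESCRIPTION file as a DCF matrix
desc <- lapply(desc_urls, function(x) read.dcf(url(x)))
authors <- sapply(desc, function(x) x[, "Author"])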

Since I'm a narcissist (and Hadley Wickham has no packages in the first 100 [this was true in 2012 but cannot possibly be true now, in 2018!]):

pkgs[grep("Bolker",authors)]
# [1] "ape"

The main problem with this solution is that doing it for real (rather than just for the first 100 packages) means hitting CRAN 3000+ times for the package information ...
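
For reference, each DESCRIPTION file is in DCF format, and read.dcf returns it as a one-row character matrix whose column names are the field names. Here is a minimal offline sketch of that step, using a made-up DESCRIPTION (the package and the people in it are hypothetical):

```r
## A made-up DESCRIPTION file, one field per line (hypothetical package).
desc_text <- c(
    "Package: toypkg",
    "Version: 0.1.0",
    "Author: Jane Doe [aut, cre], John Roe [ctb]",
    "Maintainer: Jane Doe <jane@example.org>"
)

## read.dcf accepts a connection, so a text connection stands in for url(x).
con <- textConnection(desc_text)
desc <- read.dcf(con)
close(con)

## The Author field is a single string, which is why grep() on it works.
unname(desc[, "Author"])
grepl("Roe", desc[, "Author"])
```

Because the whole author list is one string per package, the search is a plain pattern match and will also pick up contributors, not just the lead author.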

Edit: a better solution, based on Jeroen Ooms's answer to the same question:

recent.packages.rds <- function(){
    mytemp <- tempfile()
    download.file(paste0(options("repos")$repos,"/web/packages/packages.rds"),
                  mytemp)
    mydata <- as.data.frame(readRDS(mytemp), row.names=NA)
    mydata$Published <- as.Date(mydata[["Published"]])
    mydata
}

mydata <- recent.packages.rds()
unname(as.character(mydata$Package[grep("Wickham",mydata$Author)]))
# [1] "classifly"    "clusterfly"   "devtools"     "evaluate"     "fda"         
# [6] "geozoo"       "ggmap"        "ggplot2"      "helpr"        "hints"       
# [11] "HistData"     "hof"          "itertools"    "lubridate"    "meifly"      
# [16] "memoise"      "munsell"      "mutatr"       "normwhn.test" "plotrix"     
# [21] "plumbr"       "plyr"         "productplots" "profr"        "Rd2roxygen"  
# [26] "reshape"      "reshape2"     "rggobi"       "roxygen"      "roxygen2"    
# [31] "scales"       "sinartra"     "stringr"      "testthat"     "tourr"       
# [36] "tourrGui"  
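
On reasonably recent versions of R, tools::CRAN_package_db() should return comparable metadata (including Package and Author columns) as a data frame in a single download, so the filtering step can be wrapped in a small helper. The sketch below is only an illustration: packages_by_author is a hypothetical name, and the toy data frame is made up so that the helper can be tried offline.

```r
## Hypothetical helper: names of packages whose Author field matches `pattern`.
## `db` can be any data frame with "Package" and "Author" columns.
packages_by_author <- function(db, pattern, ignore.case = TRUE) {
    ## grepl() returns FALSE for missing Author entries, so NAs are skipped
    hits <- grepl(pattern, db$Author, ignore.case = ignore.case)
    sort(unique(as.character(db$Package[hits])))
}

## Made-up stand-in for the real metadata (hypothetical packages and people).
toy <- data.frame(
    Package = c("pkgA", "pkgB", "pkgC"),
    Author  = c("Jane Doe [aut, cre]", "John Roe, Jane Doe", NA),
    stringsAsFactors = FALSE
)
packages_by_author(toy, "jane doe")

## Against CRAN itself (needs a network connection, so not run here):
## db <- tools::CRAN_package_db()
## packages_by_author(db, "Wickham")
```

The same helper should also work on the mydata object built above, since it only relies on the Package and Author columns.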

Upvotes: 11

Waldir Leoncio

Reputation: 11341

Bolker's solution above is quite quick and still works, but since 2018 there's a package called pkgsearch that outputs more complete information. Here's a demo, continuing the trend of shameless self-promotion:

r$> pkgsearch::advanced_search(Author = "Waldir", size = 100)                                                                                                                               
- "advanced search" --------------------------------------------------------------------- 11 packages in 0.001 seconds -
  #     package           version by                     @ title                                                                          
  1 100 matlab2r          1.0.0   Waldir Leoncio        1M Translation Layer from MATLAB to R                                             
  2 100 simExam           1.0.0   Waldir Leoncio        3y Generate Simulated Data for IRT-Enabled Exams                                  
  3  83 citation          0.6.2   Jan Philipp Dietrich  1M Software Citation Tools                                                        
  4  83 LOGAN             1.0.0   Denise Reis Costa     3y Log File Analysis in International Large-Scale Assessments                     
  5  82 TruncExpFam       1.0.0   Waldir Leoncio        7d Truncated Exponential Family                                                   
  6  61 contingencytables 1.0.0   Waldir Leoncio        1M Statistical Analysis of Contingency Tables                                     
  7  60 DIscBIO           1.2.0   Waldir Leoncio       10M A User-Friendly Pipeline for Biomarker Discovery in Single-Cell Transcriptomics
  8  51 BayesSUR          2.0.1   Zhi Zhao              3M Bayesian Seemingly Unrelated Regression                                        
  9  44 lsasim            2.1.2   Waldir Leoncio        4M Functions to Facilitate the Simulation of Large Scale Assessment Data          
 10  39 BayesMallows      1.1.0   Oystein Sorensen      3M Bayesian Preference Learning with the Mallows Rank Model                       
 11  11 xaringan          0.22    Yihui Xie             8M Presentation Ninja   

Note that I had to increase size from its default of 10; otherwise I wouldn't have gotten all the packages.

For comparison, here is the output of the approach from the aforementioned answer:

r$> unname(as.character(mydata$Package[grep("Waldir",mydata$Author)]))                        
 [1] "BayesMallows"      "BayesSUR"          "citation"          "contingencytables" "DIscBIO"           "LOGAN"             "lsasim"            "matlab2r"          "simExam"          
[10] "TruncExpFam"       "xaringan"
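
Both approaches match on the Author field, so they also return packages where the person is a co-author rather than the maintainer (the "by" column in the pkgsearch printout appears to be the maintainer). Here is a rough sketch of narrowing the result down to maintained packages. It rebuilds the relevant columns from the printout above so that it runs offline, and the column names package and maintainer_name are my assumption about what the pkgsearch result object contains.

```r
## "package" and "by" columns copied from the printed search result above.
## The column names are assumed, not checked against the pkgsearch object.
res <- data.frame(
    package = c("matlab2r", "simExam", "citation", "LOGAN", "TruncExpFam",
                "contingencytables", "DIscBIO", "BayesSUR", "lsasim",
                "BayesMallows", "xaringan"),
    maintainer_name = c("Waldir Leoncio", "Waldir Leoncio",
                        "Jan Philipp Dietrich", "Denise Reis Costa",
                        "Waldir Leoncio", "Waldir Leoncio", "Waldir Leoncio",
                        "Zhi Zhao", "Waldir Leoncio", "Oystein Sorensen",
                        "Yihui Xie"),
    stringsAsFactors = FALSE
)

## Keep only the rows where the search term also matches the maintainer.
maintained <- res$package[grepl("Waldir", res$maintainer_name)]
maintained
```

With a live result, the same subsetting could be applied directly to the object returned by advanced_search, provided it really exposes those two columns.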

Upvotes: 2

IRTFM

Reputation: 263352

Not exactly by author but perhaps access by maintainer would also be useful?

http://cran.r-project.org/web/checks/check_summary_by_maintainer.html#summary_by_maintainer

EDIT by Tyler Rinker

DWin's suggestion can be brought to fruition with these lines of code:

search.lib <- function(term, column = 1){
    ## needs the XML package (readHTMLTable) and zoo (na.locf)
    require(XML)
    URL <- "http://cran.r-project.org/web/checks/check_summary_by_maintainer.html#summary_by_maintainer"
    dat <- readHTMLTable(doc = URL, which = 1, header = TRUE, as.is = FALSE)
    names(dat) <- trimws(names(dat))
    ## blank Maintainer cells inherit the value from the row above
    dat$Maintainer[dat$Maintainer == ""] <- NA
    dat$Maintainer <- zoo::na.locf(dat$Maintainer)
    ## approximate (fuzzy) match of term against the chosen column
    if (is.numeric(column)) {
        dat[agrep(term, dat[, column]), 1:3]
    } else {
        dat[agrep(term, dat[, agrep(column, colnames(dat))]), 1:3]
    }
}

search.lib("hadley")
search.lib("bolker")
search.lib("brewer", 2)
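
The fill-down step is the least obvious part of the function: rows with an empty Maintainer cell are assigned the last non-empty value above them. Here is a minimal base-R sketch of that idea on made-up data, for anyone who would rather not depend on zoo. fill_down is a hypothetical helper, not part of any package.

```r
## Hypothetical helper: carry the last non-empty value forward (base R only).
fill_down <- function(x) {
    x[x == ""] <- NA
    ## idx counts the non-missing values seen so far, so it points at the
    ## most recent one; a leading gap has idx 0 and stays NA
    idx <- cumsum(!is.na(x))
    c(NA, x[!is.na(x)])[idx + 1]
}

## Made-up column in which each maintainer is named once, followed by blanks.
maintainer <- c("Jane Doe", "", "", "John Roe", "")
fill_down(maintainer)
```

Unlike zoo::na.locf with its default settings, this version keeps leading gaps as NA instead of dropping them, so the result always has the same length as the input.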

Upvotes: 14

Dason

Reputation: 61933

Crantastic can search by author. You can do quite a bit more with crantastic but the functionality you're looking for is already provided there.

Upvotes: 14
