stevec

Reputation: 52268

Get a list of all the names of the objects in the datasets R package?

How can I get a list of the exact names of the objects in the datasets package?

I found many of them here:

data_package = data(package="datasets")
datasets <- as.data.frame(data_package[[3]])$Item
datasets

#   [1] "AirPassengers"          "BJsales"                "BJsales.lead (BJsales)" "BOD"                    "CO2"                    "ChickWeight"           
#   [7] "DNase"                  "EuStockMarkets"         "Formaldehyde"           "HairEyeColor"           "Harman23.cor"           "Harman74.cor"          
#  [13] "Indometh"               "InsectSprays"           "JohnsonJohnson"         "LakeHuron"              "LifeCycleSavings"       "Loblolly"              
#  [19] "Nile"                   "Orange"                 "OrchardSprays"          "PlantGrowth"            "Puromycin"              "Seatbelts"             
#  [25] "Theoph"                 "Titanic"                "ToothGrowth"            "UCBAdmissions"          "UKDriverDeaths"         "UKgas"                 
#  [31] "USAccDeaths"            "USArrests"              "USJudgeRatings"         "USPersonalExpenditure"  "UScitiesD"              "VADeaths"              
#  [37] "WWWusage"               "WorldPhones"            "ability.cov"            "airmiles"               "airquality"             "anscombe"              
#  [43] "attenu"                 "attitude"               "austres"                "beaver1 (beavers)"      "beaver2 (beavers)"      "cars"                  
#  [49] "chickwts"               "co2"                    "crimtab"                "discoveries"            "esoph"                  "euro"                  
#  [55] "euro.cross (euro)"      "eurodist"               "faithful"               "fdeaths (UKLungDeaths)" "freeny"                 "freeny.x (freeny)"     
#  [61] "freeny.y (freeny)"      "infert"                 "iris"                   "iris3"                  "islands"                "ldeaths (UKLungDeaths)"
#  [67] "lh"                     "longley"                "lynx"                   "mdeaths (UKLungDeaths)" "morley"                 "mtcars"                
#  [73] "nhtemp"                 "nottem"                 "npk"                    "occupationalStatus"     "precip"                 "presidents"            
#  [79] "pressure"               "quakes"                 "randu"                  "rivers"                 "rock"                   "sleep"                 
#  [85] "stack.loss (stackloss)" "stack.x (stackloss)"    "stackloss"              "state.abb (state)"      "state.area (state)"     "state.center (state)"  
#  [91] "state.division (state)" "state.name (state)"     "state.region (state)"   "state.x77 (state)"      "sunspot.month"          "sunspot.year"          
#  [97] "sunspots"               "swiss"                  "treering"               "trees"                  "uspop"                  "volcano"               
# [103] "warpbreaks"             "women" 

So something like this would iterate through each one:

for(i in 1:length(datasets)) {
  print(get(datasets[i]))
  cat("\n\n")
}

It works for the first two datasets (AirPassengers and BJsales), but it fails on "BJsales.lead (BJsales)", because that string is not an object name; the object itself is BJsales.lead (i.e. datasets::BJsales.lead).

I guess I could use a string split or similar to discard everything from the space onwards, but is there a neater way of obtaining a list of all the objects in the datasets package?

Notes

ls(getNamespace("datasets"), all.names=TRUE)
# [1] ".__NAMESPACE__."      ".__S3MethodsTable__." ".packageName" 

Upvotes: 2

Views: 210

Answers (1)

MrFlick

Reputation: 206232

There is a note on the ?data help page that states:

Where the datasets have a different name from the argument that should be used to retrieve them the index will have an entry like beaver1 (beavers) which tells us that dataset beaver1 can be retrieved by the call data(beavers).

So the actual object name is the part before the parenthesized suffix at the end. Because each entry is returned as a plain string, you unfortunately have to strip that suffix yourself, which you can do with gsub:

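# The "Item" column holds the index entries, e.g. "beaver1 (beavers)"
datanames <- data(package="datasets")$results[,"Item"]
# Drop the " (topic)" suffix so that only the object name remains
objnames <- gsub("\\s+\\(.*\\)","", datanames)

# Look up each object by its cleaned name and print it
for(ds in objnames) {
  print(get(ds))
  cat("\n\n")
}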
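
As a quick check of the approach above, here is a minimal sketch that applies the same kind of substitution to three sample index entries and compares the result with the objects the attached package exposes. The names items, cleaned and pkg_objects are only illustrative, the trailing $ anchor is a small variation on the pattern above, and the sketch assumes that datasets is attached to the search path (the default in a standard R session).

```r
# Sample index entries in the format returned by data(package = "datasets")
items <- c("AirPassengers", "BJsales.lead (BJsales)", "beaver1 (beavers)")

# Strip a trailing " (topic)" suffix; the $ anchor restricts the match to the end
cleaned <- sub("\\s+\\(.*\\)$", "", items)
print(cleaned)

# Assumption: "package:datasets" is on the search path, so its objects can be
# listed directly with ls()
pkg_objects <- ls("package:datasets")

# Check that every cleaned name corresponds to an object in the package
print(all(cleaned %in% pkg_objects))
```

If the check holds on your installation, ls("package:datasets") may itself be a way to get the object names without any string cleaning, although it relies on the package being attached.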

Upvotes: 4
