Reputation: 48392
I'm going through a DataCamp class on dplyr. They had me load the 'hflights' data and then asked me to create a new column named 'Carrier', replacing each airline code with its actual name. The solution looks as follows:
hflights <- tbl_df(hflights)
names <- c("AA" = "American", "AS" = "Alaska", "B6" = "JetBlue", "CO" = "Continental",
"DL" = "Delta", "OO" = "SkyWest", "UA" = "United", "US" = "US_Airways",
"WN" = "Southwest", "EV" = "Atlantic_Southeast", "F9" = "Frontier",
"FL" = "AirTran", "MQ" = "American_Eagle", "XE" = "ExpressJet", "YV" = "Mesa")
hflights["Carrier"] <- names[hflights$UniqueCarrier]
I figured out how to do this, and it works, but it's not really clear to me exactly what R is doing here. I understand I'm adding a new column to the hflights data frame, but I'm not clear on how (or why) R is substituting carrier names for carrier codes.
Upvotes: 1
Views: 318
Reputation: 38500
This is a look up table, where the names of a named vector are used to retrieve the values within that vector. A couple of examples follow.
As a reminder, it is possible to subset a named vector either by index or by name:
names[1:2]
AA AS
"American" "Alaska"
names[c("AA", "AS")]
AA AS
"American" "Alaska"
A nice feature is that these references can be repeated to produce an extended vector:
names[rep(1:2, 2)]
AA AS AA AS
"American" "Alaska" "American" "Alaska"
names[rep(c("AA", "AS"), 2)]
AA AS AA AS
"American" "Alaska" "American" "Alaska"
Using this method, it is possible to use a vector containing either indices of the look up table or names of the look up table to produce a vector of the same length, but with the desired values.
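Applied to the question, the same idea can be sketched on a tiny stand-in for the real data. The `flights` data frame and the `carrier_names` vector below are hypothetical names made up for illustration (the question calls its vector `names`), and the sketch assumes the code column is a character vector rather than a factor:

```r
# Look up table: names are carrier codes, values are carrier names
carrier_names <- c("AA" = "American", "AS" = "Alaska", "B6" = "JetBlue")

# Hypothetical miniature stand-in for the hflights data
flights <- data.frame(UniqueCarrier = c("AS", "AA", "AA", "B6"),
                      stringsAsFactors = FALSE)

# Index the look up table by the code column: one value comes back per row.
# unname() is optional; it only drops the codes carried along as names.
flights["Carrier"] <- unname(carrier_names[flights$UniqueCarrier])

print(flights)
```

Each row's code selects the matching element of the look up table, so the result has one entry per row and lines up with the data frame.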
Upvotes: 3
Reputation: 668
names is a named vector of type character (R's string type). This is similar to a Python dictionary, where each string key indexes a value. In this case, you index by the carrier code and the value is the full name.
In R, when you index a vector, you can do so with another vector. In this case you are indexing the "dictionary" with the vector of abbreviation codes, and it returns a vector of the same length as the index, containing the matching values.
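One place the dictionary analogy differs is worth a small sketch: a code that is missing from the named vector does not raise an error the way a missing key does in Python, it gives NA in that position. The `lookup` and `codes` names below are hypothetical, and "ZZ" is a made-up code that is deliberately absent:

```r
# Named character vector acting as the "dictionary"
lookup <- c("AA" = "American", "AS" = "Alaska")

# "ZZ" is a made-up code that is not in the look up table
codes <- c("AA", "ZZ", "AS")

result <- lookup[codes]

# The result always has the same length as the index vector
length(result) == length(codes)

# Codes with no match come back as NA instead of raising an error
is.na(result)
```

So if a data set contains a code you did not list, the new column silently gets NA for those rows, which is worth checking for with `anyNA()`.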
Upvotes: 2