Phil

Reputation: 642

Create Multi-dimensional Data Mapping in R

If I want to represent a set of values in R that are keyed on 3 different dimensions, is there a simple/succinct way of generating this?

Say, for example, I have the following keys (each dimension must support a different number of keys). In total, the example below references 360 values (3*30*4):

rating <- c('AA','AAB','C')
timeInYears <- 1:30
monthsUntilStart <- c(1,3,6,12)

So I want to be able to access, for example, the value with a rating of AA, 7 years from now, starting in 12 months, using something like:

value <- data[rating=='AA',timeInYears==7,monthsUntilStart==12]

To start with I'd like to be able to provide sample generated values for every combination of keys.

In reality they will be read in from a database, but to get started it would be good to provide a dummy structure from a set of dummy values, that can simply be sequentially repeated over the structure.

So say we have

values <- c(2.30,2.32,1.98,2.18,2.29,2.22)

So each (x,y,z) key maps to one of these values.

Any hints or tips on how best to approach this would be much appreciated!

Thanks!

Phil.

Upvotes: 1

Views: 129

Answers (1)

hatmatrix

Reputation: 44962

You can use an array in R for this task.

First, we will create a data frame that includes every combination of keys. As requested, the six sample values are recycled to fill all 360 rows:

rating <- c('AA','AAB','C')
timeInYears <- 1:30
monthsUntilStart <- c(1,3,6,12)

data <- expand.grid(rating=rating, timeInYears=timeInYears, monthsUntilStart=monthsUntilStart)
data$value <- c(2.30,2.32,1.98,2.18,2.29,2.22) # recycled: 360 rows is a multiple of the 6 values
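The assignment above relies on the number of rows being an exact multiple of the number of sample values. If the dummy values might not divide the row count evenly, a minimal sketch along the following lines makes the recycling explicit with rep_len(). The names dummy and values are illustrative only and are not part of the answer's code.

```r
rating <- c("AA", "AAB", "C")
timeInYears <- 1:30
monthsUntilStart <- c(1, 3, 6, 12)
values <- c(2.30, 2.32, 1.98, 2.18, 2.29, 2.22)

# One row per combination of keys; the first argument varies fastest.
dummy <- expand.grid(rating = rating,
                     timeInYears = timeInYears,
                     monthsUntilStart = monthsUntilStart)

# rep_len() repeats `values` until it has exactly nrow(dummy) elements,
# cutting the last cycle short if the lengths do not divide evenly.
dummy$value <- rep_len(values, nrow(dummy))

nrow(dummy)  # 360
```

With the six values used here the result is the same as the plain assignment; the explicit form only matters when the lengths do not line up.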

Next, we convert to an array:

dataarray <- unclass(by(data[["value"]], data[c("rating", "timeInYears", "monthsUntilStart")], identity))

Note that the numeric keys (timeInYears and monthsUntilStart) are converted to character strings in the dimnames:

> dimnames(dataarray)
$rating
[1] "AA"  "AAB" "C"  

$timeInYears
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15"
[16] "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30"

$monthsUntilStart
[1] "1"  "3"  "6"  "12"

You can access your desired element by its dimension names, passed as character strings (here it returns the dummy value that was recycled into that cell).

> dataarray["AA", "7", "12"]
[1] 2.3
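As an aside, when the dummy values are simply repeated in order, an equivalent array can probably be built without the intermediate data frame by calling array() with explicit dimnames. The sketch below is one possible way to do that, plus a small wrapper so the numeric keys can be passed as numbers. The names keys, direct and lookup are hypothetical helpers, not part of the answer above.

```r
rating <- c("AA", "AAB", "C")
timeInYears <- 1:30
monthsUntilStart <- c(1, 3, 6, 12)
values <- c(2.30, 2.32, 1.98, 2.18, 2.29, 2.22)

# Dimnames must be character, so the numeric keys are converted up front.
keys <- list(rating = rating,
             timeInYears = as.character(timeInYears),
             monthsUntilStart = as.character(monthsUntilStart))
dims <- unname(lengths(keys))  # 3, 30, 4

# array() fills with the first dimension varying fastest,
# which is the same order expand.grid() uses for its rows.
direct <- array(rep_len(values, prod(dims)), dim = dims, dimnames = keys)

# Hypothetical helper: accepts numeric keys and converts them to names.
lookup <- function(arr, rating, years, months) {
  arr[rating, as.character(years), as.character(months)]
}

lookup(direct, "AA", 7, 12)  # 2.3
```

Converting the keys inside the helper avoids accidentally indexing by position: direct["AA", 7, 12] would ask for the 12th element of a dimension that only has 4.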

Edit

You can also just use the data frame itself, if you wish.

> subset(data, rating=='AA' & timeInYears==7 & monthsUntilStart==12)

    rating timeInYears monthsUntilStart value
289     AA           7               12   2.3
> subset(data, rating=='AA' & timeInYears==7 & monthsUntilStart==12, value)

    value
289   2.3
> subset(data, rating=='AA' & timeInYears==7 & monthsUntilStart==12)$value
[1] 2.3
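The R documentation describes subset() as a convenience function for interactive use and suggests standard subsetting for programming. If the lookup is going to live inside a function, a sketch like the following uses plain logical indexing instead. The names dummy and get_value are illustrative assumptions, not part of the answer above.

```r
rating <- c("AA", "AAB", "C")
timeInYears <- 1:30
monthsUntilStart <- c(1, 3, 6, 12)
values <- c(2.30, 2.32, 1.98, 2.18, 2.29, 2.22)

dummy <- expand.grid(rating = rating,
                     timeInYears = timeInYears,
                     monthsUntilStart = monthsUntilStart)
dummy$value <- rep_len(values, nrow(dummy))

# Hypothetical helper: returns the value(s) whose three keys all match.
get_value <- function(df, r, years, months) {
  hit <- df$rating == r &
    df$timeInYears == years &
    df$monthsUntilStart == months
  df$value[hit]
}

get_value(dummy, "AA", 7, 12)  # 2.3
```

If a key combination is missing, this returns a zero-length vector rather than raising an error, so callers may want to check the length of the result.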

Upvotes: 3
