user9292

Reputation: 1145

Plot a large actual vs. expected dataset in R

My dataset contains three columns: ID (N= 1000), expected score, and actual score. The score can be 1, 2, 3, or 4.

Here is a simulated dataset:

ID <- seq(from = 1, to = 1000, by=1)
actual <- round(runif(1000, min=1, max=4))
expected <- round(runif(1000, min=1, max=4))
mydata <- data.frame(ID, actual, expected)

We can easily create a contingency table using

table(mydata$actual, mydata$expected)

I need to plot this data for each ID, so imagine the plot as a 1000 x 1000 matrix.

If Actual=Expected, the color of these cells will be white
If Actual < Expected, the color of these cells will be red
If Actual > Expected, the color of these cells will be blue

Upvotes: 1

Views: 685

Answers (1)

Alexey Ferapontov

Reputation: 5169

Each ID has exactly one pair of actual and expected values, so plotting by ID gives a single line of points rather than a matrix. I assume you don't want to plot the actual and expected values themselves, right?

ID <- seq(from = 1, to = 1000, by=1)
actual <- round(runif(1000, min=1, max=4))
expected <- round(runif(1000, min=1, max=4))
mydata <- data.frame(ID, actual, expected)
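t = table(mydata$actual, mydata$expected)
# One color per ID: white if equal, red if actual < expected, blue if actual > expected
col1 = ifelse(actual == expected, "white", ifelse(actual < expected, "red", "blue"))
# Filled circles with an outline, so the white points stay visible on a white background
plot(ID, pch = 21, bg = col1)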


But if you want a 4x4 matrix with colors and boxes that represent frequencies, you can do that:

plot(t,col=col1) 
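
Note that `col1` holds one color per ID, not one per table cell, so the colors in the plot above do not follow the white/red/blue rule. A rough sketch of a 4 x 4 grid that does, assuming `image()` plus `text()` is an acceptable replacement for the mosaic plot; the names `tab` and `cell_sign` are mine, not from the question:

```r
set.seed(12345)
actual   <- round(runif(1000, min = 1, max = 4))
expected <- round(runif(1000, min = 1, max = 4))

# Force all four levels so the table is always 4 x 4
tab <- table(actual   = factor(actual,   levels = 1:4),
             expected = factor(expected, levels = 1:4))

# The color of a cell depends only on its position:
# -1 where actual < expected, 0 where equal, 1 where actual > expected
cell_sign <- sign(outer(1:4, 1:4, "-"))

# Rows of the matrix (actual) go on the x axis, columns (expected) on the y axis
image(1:4, 1:4, cell_sign,
      col = c("red", "white", "blue"),
      breaks = c(-1.5, -0.5, 0.5, 1.5),
      xlab = "actual", ylab = "expected")

# Write the frequency into each cell
text(as.vector(row(tab)), as.vector(col(tab)), labels = as.vector(tab))
```

Here the frequencies appear as numbers instead of box sizes, which keeps the color rule readable.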


Edit: I guess what you want is a map of every actual value against every expected value? This can be done more elegantly, but for lack of time I cannot provide a full solution with your desired colors. Here is a quick solution that plots with the default colors (a matrix of color names, colID, is built as well). Suppose N = 5:

set.seed(12345)
ID <- seq(from = 1, to = 5, by=1)
actual <- round(runif(5, min=1, max=4))
expected <- round(runif(5, min=1, max=4))
mydata <- data.frame(ID, actual, expected)

> mydata
  ID actual expected
1  1      3        1
2  2      4        2
3  3      3        3
4  4      4        3
5  5      2        4

# colID holds a color name per cell, arr an integer code per cell
# (1 = equal, 2 = actual < expected, 3 = actual > expected)
colID = matrix("", 5, 5)
arr = matrix(0, 5, 5)
for (i in 1:5) {
  for (j in 1:5) {
    colID[i, j] = ifelse(actual[i] == expected[j], "green", ifelse(actual[i] < expected[j], "red", "blue"))
    arr[i, j] = ifelse(actual[i] == expected[j], 1, ifelse(actual[i] < expected[j], 2, 3))
  }
}

> arr
     [,1] [,2] [,3] [,4] [,5]
[1,]    3    3    1    1    2
[2,]    3    3    3    3    1
[3,]    3    3    1    1    2
[4,]    3    3    3    3    1
[5,]    3    1    2    2    2
> colID
     [,1]   [,2]    [,3]    [,4]    [,5]   
[1,] "blue" "blue"  "green" "green" "red"  
[2,] "blue" "blue"  "blue"  "blue"  "green"
[3,] "blue" "blue"  "green" "green" "red"  
[4,] "blue" "blue"  "blue"  "blue"  "green"
[5,] "blue" "green" "red"   "red"   "red"  

> image(arr)


The logic: build an N x N array with three levels, either color names (colID) or integer codes (arr: 1 = equal, 2 = actual < expected, 3 = actual > expected), and plot the integer array as an image. Time permitting, I will try to apply custom colors in image(), but I cannot guarantee it.
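
For reference, the same array can be built without the double loop and drawn with custom colors. This is only a sketch, assuming `outer()` for the N x N comparison and the white/red/blue scheme from the question; `cmp`, `arr2` and `cols` are illustrative names, not part of the code above:

```r
set.seed(12345)
actual   <- round(runif(5, min = 1, max = 4))
expected <- round(runif(5, min = 1, max = 4))

# cmp[i, j] is -1, 0 or 1 for actual[i] <, ==, > expected[j]
cmp <- sign(outer(actual, expected, "-"))

# Recode to the same integers as arr: 1 = equal, 2 = less, 3 = greater
arr2 <- matrix(c(2, 1, 3)[cmp + 2], nrow = length(actual))

# One color per code, in code order; fixed breaks keep the mapping stable
# even if one of the three codes happens to be absent from the data
cols <- c("white", "red", "blue")
image(seq_along(actual), seq_along(expected), arr2,
      col = cols, breaks = c(0.5, 1.5, 2.5, 3.5),
      xlab = "ID (actual)", ylab = "ID (expected)")
```

Replacing 5 with 1000 should give the full 1000 x 1000 map, since `outer()` and `image()` both work on the whole matrix at once.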

Upvotes: 1
