KMC

Reputation: 1742

Creating stacked chart

I have two tables that store users' login attempts. One table contains all successful logins and the other contains the failed attempts. I'm trying to create a stacked chart from the failed and successful login counts. This is what my tables look like:

Success_login Table:

User_ID  Site_Address  Login_Attempts
1        xxx.xxx.xxx   5
2        xxx.xxy.yyy   10

Fail_login Table:

User_ID  Site_Address  Login_Attempts
1        xxx.xxx.xxx   2
2        xxx.xxy.yyy   8

How do I use the Login_Attempts columns of those two tables to create a stacked chart that highlights successful and failed attempts? I looked online and found this code:

# Stacked Bar Plot with Colors and Legend
 counts <- table(mtcars$vs, mtcars$gear)
 barplot(counts, main="Car Distribution by Gears and VS",
 xlab="Number of Gears", col=c("darkblue","red"),
 legend = rownames(counts))

However, it does not work, as my two tables have different numbers of records. I would appreciate it if you could guide me to a solution.

Thanks

Upvotes: 2

Views: 178

Answers (2)

bgoldst

Reputation: 35314

Discussion

First you have to unify your data into a single table. This can be done with a full outer join, if you're familiar with SQL; in R that is merge() with all=TRUE. See "How to join (merge) data frames (inner, outer, left, right)?". The resulting NAs (for records that have no match in the opposite table) must be replaced with zeroes in order for the final call to barplot() to work.

You must then derive a matrix in the format required by barplot() for producing stacked bar charts, which can be done pretty easily with a single call to matrix(). Taking care to set labels/titles/legends/colors correctly, you can get a nice stacked bar chart:

Code

# Sample data: user 3 has only successful logins, user 4 only failed ones.
s <- data.frame(User_ID=c(1,2,3), Site_Address=c('xxx.xxx.xxx','xxx.xxy.yyy','xxx.yyy.zzz'), Login_Attempts=c(5,10,3))
f <- data.frame(User_ID=c(1,2,4), Site_Address=c('xxx.xxx.xxx','xxx.xxy.yyy','xxx.yyy.zzz'), Login_Attempts=c(2,8,4))
# Full outer join: rows missing from one table get NA in that table's count column.
all <- merge(s, f, by=c('User_ID','Site_Address'), suffixes=c('.successful','.failed'), all=TRUE)
# Replace those NAs with 0 so barplot() can stack the counts.
all[is.na(all)] <- 0
# One row per series (failed, successful), one column per user/site.
stackData <- matrix(c(all$Login_Attempts.failed, all$Login_Attempts.successful), nrow=2, byrow=TRUE)
colnames(stackData) <- paste0(all$User_ID, '@', all$Site_Address)
rownames(stackData) <- c('failed','successful')
barplot(stackData, main='Successful and failed login attempts', xlab='User_ID@Site_Address', ylab='Login_Attempts', col=c('red','blue'), legend=rownames(stackData))

Resulting data

r> s;
  User_ID Site_Address Login_Attempts
1       1  xxx.xxx.xxx              5
2       2  xxx.xxy.yyy             10
3       3  xxx.yyy.zzz              3
r> f;
  User_ID Site_Address Login_Attempts
1       1  xxx.xxx.xxx              2
2       2  xxx.xxy.yyy              8
3       4  xxx.yyy.zzz              4
r> all;
  User_ID Site_Address Login_Attempts.successful Login_Attempts.failed
1       1  xxx.xxx.xxx                         5                     2
2       2  xxx.xxy.yyy                        10                     8
3       3  xxx.yyy.zzz                         3                     0
4       4  xxx.yyy.zzz                         0                     4
r> stackData;
           1@xxx.xxx.xxx 2@xxx.xxy.yyy 3@xxx.yyy.zzz 4@xxx.yyy.zzz
failed                 2             8             0             4
successful             5            10             3             0

Output

bar-chart


Edit: It's a little strange to create a one-bar stacked bar chart, but ok, here's how you can do it, using the above data (all) as a base:

barplot(matrix(c(sum(all$Login_Attempts.failed),sum(all$Login_Attempts.successful))),main='Successful and failed login attempts',ylab='Login_Attempts',col=c('red','blue'),legend=c('failed','successful'));

one-bar-chart


Edit: Yeah, the y-axis should really cover the whole stack by default; that it doesn't is a weakness in the base graphics package. You can add ylim=c(0,1.2*sum(do.call(c,all[,3:4]))) as an argument to the barplot() call to force the y-axis to extend 20% beyond the high point of the stack. (It's unfortunate that you have to calculate that manually from the input data, but as I said, it's a weakness in the package.)
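
As a rough sketch of where that argument goes, the one-bar call with the extended axis might look like the following. It rebuilds all from the sample data so that it runs on its own; the names totals and yMax are introduced here purely for illustration.

```r
# Rebuild the merged sample data so this block is self-contained.
s <- data.frame(User_ID = c(1, 2, 3),
                Site_Address = c('xxx.xxx.xxx', 'xxx.xxy.yyy', 'xxx.yyy.zzz'),
                Login_Attempts = c(5, 10, 3))
f <- data.frame(User_ID = c(1, 2, 4),
                Site_Address = c('xxx.xxx.xxx', 'xxx.xxy.yyy', 'xxx.yyy.zzz'),
                Login_Attempts = c(2, 8, 4))
all <- merge(s, f, by = c('User_ID', 'Site_Address'),
             suffixes = c('.successful', '.failed'), all = TRUE)
all[is.na(all)] <- 0

# One column, two rows: failed at the bottom, successful stacked on top.
totals <- matrix(c(sum(all$Login_Attempts.failed),
                   sum(all$Login_Attempts.successful)))

# Height of the whole stack plus 20% headroom (illustrative name).
yMax <- 1.2 * sum(do.call(c, all[, 3:4]))

barplot(totals, main = 'Successful and failed login attempts',
        ylab = 'Login_Attempts', col = c('red', 'blue'),
        legend.text = c('failed', 'successful'), ylim = c(0, yMax))
```

For the per-user chart, the high point would presumably be the tallest single stack rather than the grand total, so something like max(colSums(stackData)) may be a better base for the limit there.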

Also, with regard to my comment about the oneness of the bar, it's just more common for stacked bar charts to be used to compare multiple bars, rather than showing a single bar. (That's why my initial assumption was that you wanted a separate bar for each user/site.) Instead of a single stacked bar, normally you'd see a plain old bar chart showing the different data points side-by-side. But it really depends on your application, so do what works best for you.
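
For comparison, a minimal sketch of that plain side-by-side version, using the two totals from the sample data (14 failed, 18 successful) typed in by hand; the variable name totals is just for illustration:

```r
# Totals from the sample data: 2+8+0+4 failed, 5+10+3+0 successful.
totals <- c(failed = 14, successful = 18)

# A named vector gives one ordinary bar per value, labelled by its name.
barplot(totals, main = 'Successful and failed login attempts',
        ylab = 'Login_Attempts', col = c('red', 'blue'))
```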

Upvotes: 1

MrGumble

Reputation: 5766

  1. Try drawing, by hand, the stacked chart you are trying to create. Does it even make sense?
  2. When convinced that you now know what your desired result should look like, by hand, create a single data.frame or matrix necessary for barplot to create your result. Remember to include special instances e.g. where a user only has successful or unsuccessful logins.
  3. Figure out how to put your input data.frames together into the single data.frame from the previous step.

The result of step 2 is the reproducible example you need in order to ask a sensible question here. Step 3 is what you are actually asking, but it does not seem that you are sure what the intermediate result should look like. Step 1 is about visualising the end product and working back from there.
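
To make step 2 concrete, here is one possible hand-written target, assuming the goal is one stacked bar per user. The column labels and counts are made up for illustration; they cover a user with both kinds of logins, one with only successful logins, and one with only failed logins.

```r
# Hand-written target for barplot(): one row per series, one column per bar.
# All labels and counts below are hypothetical.
target <- matrix(c(2, 0, 4,    # failed
                   5, 3, 0),   # successful
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c('failed', 'successful'),
                                 c('both', 'success_only', 'fail_only')))

# A missing series is written as 0, not left out, so every column has two rows.
barplot(target, col = c('red', 'blue'), legend.text = rownames(target),
        ylab = 'Login_Attempts')
```

Once a matrix like this draws the chart you had in mind, step 3 reduces to producing the same shape from the two input tables.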

Upvotes: 0
