Reputation: 4346
I have a dataset, d, that contains personally identifiable data, I have the dataset putting an X for all values that are suppressed:
column1 column2 column3
* FSM X
* Male 2.5
* Female X
A FSM 6
A Male 10.3
A Female 11.7
B FSM 14.8
B Male 21.5
B Female 25.3
I want to plot this with an X above the bars in a bar plot, where data has been suppressed, such as:
My code is:
p <- ggplot(d, aes(x=column1, y=column3, fill=column2)) +
geom_bar(position=position_dodge(), stat="identity", colour="black") +
geom_text(aes(label=column2),position= position_dodge(width=0.9), vjust=-.5)
scale_y_continuous("Percentage",breaks=seq(0, max(d$column3), 2)))
But of course, it can't plot 'X' on the graph and says:
Error: Discrete value supplied to continuous scale
How can I get the bar plotting to ignore the 'X' and still add the label if it's present?
Data dump:
structure(list(column1 = structure(c(1L, 1L, 1L, 2L, 2L, 2L,
3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L), .Label = c("*",
"A", "B", "C", "D", "E", "U"), class = "factor"), column2 = structure(c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L), .Label = c("FSM", "Male", "Female"), class = "factor"),
column3 = structure(c(21L, 1L, 2L, 18L, 3L, 4L, 7L, 12L,
14L, 16L, 15L, 13L, 10L, 9L, 8L, 11L, 6L, 5L, 20L, 19L, 17L
), .Label = c("1.93889541715629", "1.97444831591173", "10.1057579318449",
"11.7305458768873", "12.7758420441347", "14.4535840188014",
"14.8471615720524", "18.5830429732869", "19.9764982373678",
"20.0873362445415", "20.9606986899563", "21.5628672150411",
"24.1579558652729", "25.3193960511034", "25.7931844888367",
"29.2576419213974", "5.45876887340302", "6.11353711790393",
"6.16921269095182", "6.98689956331878", "X"), class = "factor")), .Names = c("column1",
"column2", "column3"), row.names = c(NA, -21L), class = "data.frame")
I 'm happy to print out 0 instances where there are 0 instances, but in the case of data suppression, I want to make it clear that data has been suppressed by printing out a 'X', but the bar will also show 0 instances
Upvotes: 0
Views: 173
Reputation: 94182
First convert the height to numeric which gives NA for censored values. Then create a label column based on that. Then you need a column of zeroes for the y coordinate of the labels.
> d$column3=as.numeric(as.character(d$column3))
Warning message:
NAs introduced by coercion
> d$column4 = ifelse(is.na(d$column3),"X","")
> d$y=0
Then:
> p <- ggplot(d, aes(x=column1, y=column3, fill=column2))
> p + geom_bar(position=position_dodge(), stat="identity",
colour="black") +
geom_text(aes(label=column4,x=column1,y=y),
position=position_dodge(width=1), vjust=-0.5)
Giving:
Its a variant on labelling a geom_bar
with the value of the bar. Almost a dupe.
Upvotes: 3