linp

Reputation: 1517

How to make a pie chart that labels only the top n values

I haven't used pie charts much in R. Is there a way to make a pie chart and show only the top 10 names with their percentages?

For example, here's a simple version of my data:

> data
   count    METRIC_ID
1      8           71
2      2         1035
3      5         1219
4      4         1277
5      1         1322
6      3         1444
7      5         1462
8     17         1720
9      6         2019
10     2         2040
11     1         2413
12    11         2489
13    24         2610
14    29         2737
15     1         2907
16     1         2930
17     2         2992
18     1         2994
19     2         3020
20     4         3045
21    35         3222
22     2         3245
23     5         3306
24     2         3348
25     2         3355
26     2         3381
27     3         3383
28     4         3389
29     6         3404
30     1         3443
31    22         3465
32     3         3558
33    15         3600
34     3         3730
35     6         3750
36     1         3863
37     1         3908
38     5         3913
39     3         3968
40     9         3972
41     2         3978
42     5         4077
43     4         4086
44     3         4124
45     2         4165
46     3         4205
47     8         4206
48     4         4210
49    12         4222
50     4         4228

and I want to see how count is distributed across the METRIC_IDs:

pie(data$count, data$METRIC_ID)

But this chart labels every single METRIC_ID, and with over 100 METRIC_IDs it looks like a mess. How can I label only the top n (for example, n = 5) METRIC_IDs on the chart, and show the counts for those n only?

Thank you for your help!!!

Upvotes: 0

Views: 2759

Answers (3)

Paul Hiemstra

Reputation: 60944

Simply subset your data before creating the pie chart. I'd do something like:

  1. Sort your data frame by count using order().
  2. Select the first ten rows.
  3. Create the pie chart from the resulting data.
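
The three steps above can be sketched in base R as follows. This is only an illustration: the data frame is a small subset of the question's table, and n is a hypothetical cutoff (the steps say ten, the question also mentions five).

```r
# Small illustrative subset of the question's data (not the full table)
data <- data.frame(
  count     = c(8, 17, 24, 29, 35, 22, 2, 1),
  METRIC_ID = c(71, 1720, 2610, 2737, 3222, 3465, 1035, 1322)
)
n <- 5  # hypothetical number of slices to keep

# 1. Sort by count, largest first
sorted <- data[order(data$count, decreasing = TRUE), ]
# 2. Keep the first n rows
top <- head(sorted, n)
# 3. Draw the pie chart from the subset
pie(top$count, labels = top$METRIC_ID)
```

Note that the slices then show each METRIC_ID's share among the top n only, not its share of the whole data set.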

Pie charts are not the best way to visualize this kind of data; a web search for "pie chart problems" turns up plenty of discussion. I'd go for something like:

library(ggplot2)
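# dat is the data frame from the question; sort by count, largest first
dat = dat[order(-dat$count),]
# Set the factor levels in sorted order so the x axis follows the sort
dat = within(dat, {METRIC_ID = factor(METRIC_ID, levels = METRIC_ID)})
ggplot(dat, aes(x = METRIC_ID, y = count)) + geom_point()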


Here I plot all the data, which I think still gives a readable graph. This type of graph is more formally known as a dot plot, and it is used heavily in Cleveland's graphics book. The height is linked to count, which is much easier to interpret than linking count to a fraction of the area of a circle, as a pie chart does.
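
If ggplot2 is not available, base R's dotchart() gives a similar picture. A minimal sketch, assuming a small made-up subset of the question's data and a hypothetical cutoff n for the rows to show:

```r
# Small illustrative subset of the question's data (not the full table)
data <- data.frame(
  count     = c(8, 17, 24, 29, 35, 22, 2, 1),
  METRIC_ID = c(71, 1720, 2610, 2737, 3222, 3465, 1035, 1322)
)
n <- 5  # hypothetical number of rows to show

# Sort ascending and keep the last n rows, so the largest count
# ends up at the top of the chart (dotchart draws the first value at the bottom)
top <- tail(data[order(data$count), ], n)

dotchart(top$count, labels = as.character(top$METRIC_ID), xlab = "count")
```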

Upvotes: 3

sieste

Reputation: 9027

To suppress plotting of some labels, set them to NA. Try this:

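labls <- as.character(data$METRIC_ID)
# pie() draws no label for NA entries; keep them as real NA,
# because paste() would turn NA into the string "NA"
labls[data$count <  3] <- NA
pie(data$count, labels = labls)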
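
The same NA trick can be adapted to label only the top n slices with their percentages, which is closer to what the question asks for. This is a sketch under assumptions: the data frame is a small subset of the question's table, n is a hypothetical cutoff, and the label format is just one possible choice.

```r
# Small illustrative subset of the question's data (not the full table)
data <- data.frame(
  count     = c(8, 17, 24, 29, 35, 22, 2, 1),
  METRIC_ID = c(71, 1720, 2610, 2737, 3222, 3465, 1035, 1322)
)
n <- 3  # hypothetical number of slices to label

# Percentage of the total for every slice
pct <- round(100 * data$count / sum(data$count), 1)

# Start with all labels missing, then fill in only the n largest slices
labls <- rep(NA_character_, nrow(data))
top_idx <- head(order(data$count, decreasing = TRUE), n)
labls[top_idx] <- paste0(data$METRIC_ID[top_idx], " (", pct[top_idx], "%)")

# All slices are drawn, but only the top n get a label
pie(data$count, labels = labls)
```

Unlike subsetting the data first, this keeps every slice in the chart, so the percentages refer to the full total.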

Upvotes: 3

Roland

Reputation: 132706

Find a better type of chart for your data.

That said, here is one way to create the chart you asked for:

data2 <- data[data$count %in% tail(sort(data$count),5),]
pie(data2$count, data2$METRIC_ID)


Slightly better:

data3 <- data2
data3$METRIC_ID <- as.character(data3$METRIC_ID)
# Sum the counts of everything outside the top 5 into a single "others" slice
others <- sum(data[!data$count %in% tail(sort(data$count), 5), "count"])
data3 <- rbind(data3, data.frame(count = others, METRIC_ID = "others"))
pie(data3$count, data3$METRIC_ID)


Upvotes: 2
