Reputation: 15
So basically, I've created a big dendrogram in RStudio and I've already tried several things. I've tried plotting the names vertically, and I've tried giving my data frame columns simple names like 1, 2, 3, ..., 11. But I can't figure out why I'm getting these odd black bars. I can't see the names of my variables. Do you have any clue?
Upvotes: 0
Views: 363
Reputation: 4309
At the bottom of your dendrogram are all the identifiers used in your clustering. When there are a lot of identifiers, you can't read them because they are packed so tightly next to each other that they overlap. This is what produces the "black bars".
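library(cluster)
# daisy() computes the pairwise dissimilarities between the rows of mtcars
d = daisy(mtcars)
# method "ward" was renamed "ward.D" (R 3.1.0 and later)
hc = hclust(as.dist(d), method = "ward.D")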
There is little you can do about this directly. However, you can try adjusting the cex argument, which scales the label text.
plot(hc, cex = 0.5)
Here I reduced the size of the identifiers.
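If shrinking the text is not enough, two other things may be worth trying: drawing the tree on a much wider device, or rotating it so that each label gets its own row. The following is only a sketch under a few assumptions: the output file name and the page dimensions are arbitrary choices of mine, and the right amount of space depends on how many identifiers you have.

```r
library(cluster)

# Same clustering as above
hc <- hclust(as.dist(daisy(mtcars)), method = "ward.D")

# Hypothetical output path; any writable location works
out_file <- file.path(tempdir(), "dendrogram.pdf")

# A wide page (in inches) gives every label some horizontal room
pdf(out_file, width = 20, height = 8)

# Page 1: vertical tree, smaller labels, all leaves aligned at the bottom
plot(hc, cex = 0.6, hang = -1)

# Page 2: horizontal tree, one label per row, with extra right margin for long names
op <- par(mar = c(4, 2, 2, 10))
plot(as.dendrogram(hc), horiz = TRUE)
par(op)

dev.off()
```

Because PDF is a vector format, you can also zoom into the file to read labels that still look crowded at normal size.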
One way to retrieve the identifiers is to list them by cluster. Suppose we choose a solution with 3 clusters.
clusters = cutree(hc, k = 3)
Then you can build a table that pairs each identifier with its cluster:
dt = as.data.frame(clusters)
dt$carsID = row.names(dt)
library(dplyr)
dt %>% arrange(clusters)
   clusters carsID
1         1 Mazda RX4
2         1 Mazda RX4 Wag
3         1 Datsun 710
4         1 Merc 240D
5         1 Merc 230
6         1 Merc 280
7         1 Merc 280C
8         1 Fiat 128
9         1 Honda Civic
10        1 Toyota Corolla
11        1 Toyota Corona
12        1 Fiat X1-9
13        1 Porsche 914-2
14        1 Lotus Europa
15        1 Ferrari Dino
16        1 Volvo 142E
17        2 Hornet 4 Drive
18        2 Valiant
19        2 Merc 450SE
20        2 Merc 450SL
21        2 Merc 450SLC
22        2 Dodge Challenger
23        2 AMC Javelin
24        3 Hornet Sportabout
25        3 Duster 360
26        3 Cadillac Fleetwood
27        3 Lincoln Continental
28        3 Chrysler Imperial
29        3 Camaro Z28
30        3 Pontiac Firebird
31        3 Ford Pantera L
32        3 Maserati Bora
This way you can see every identifier together with the cluster it belongs to.
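If you would rather not load dplyr, a base R alternative might look like the sketch below. It relies on cutree() returning an integer vector named by the row labels; the variable name members is just my own choice.

```r
library(cluster)

# Same clustering and cut as above
hc <- hclust(as.dist(daisy(mtcars)), method = "ward.D")
clusters <- cutree(hc, k = 3)

# Group the identifiers (the names of the vector) by cluster number
members <- split(names(clusters), clusters)
members

# Number of identifiers in each cluster
table(clusters)
```

Here members is a list with one character vector of identifiers per cluster, so members[["1"]] gives the cars in the first cluster.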
Upvotes: 1