Reputation: 33303
I just implemented a hierarchical clustering by following the documentation here: http://www.mathworks.com/help/stats/hierarchical-clustering.html?s_tid=doc_12b
So, let me try to put down what I am trying to do. Take a look at the following figure:
Now, this dendrogram is generated from the following data:
node1 node2 dist(node1,node2) num_elems
assigning index **37 to [ 16. 26**. 1.14749118 2. ]
assigning index 38 to [ 4. 7. 1.20402602 2. ]
assigning index 39 to [ 13. 29. 1.44708015 2. ]
assigning index 40 to [ 12. 18. 1.45827365 2. ]
assigning index 41 to [ 10. 34. 1.49607538 2. ]
assigning index 42 to [ 17. 38. 1.52565922 3. ]
assigning index 43 to [ 8. 25. 1.58919037 2. ]
assigning index 44 to [ 3. 40. 1.60231007 3. ]
assigning index 45 to [ 6. 42. 1.65755731 4. ]
assigning index 46 to [ 15. 23. 1.77770844 2. ]
assigning index 47 to [ 24. 33. 1.77771082 2. ]
assigning index 48 to [ 20. 35. 1.81301111 2. ]
assigning index 49 to [ 19. 48. 1.9191061 3. ]
assigning index 50 to [ 0. 44. 1.94238609 4. ]
assigning index 51 to [ 2. 36. 2.0444266 2. ]
assigning index 52 to [ 39. 45. 2.11667375 6. ]
assigning index 53 to [ 32. 43. 2.17132916 3. ]
assigning index 54 to [ 21. 41. 2.2882061 3. ]
assigning index 55 to [ 9. 30. 2.34492327 2. ]
assigning index 56 to [ 5. 51. 2.38383321 3. ]
assigning index 57 to [ 46. 52. 2.42100025 8. ]
assigning index 58 to [ **28. 37**. 2.48365024 3. ]
assigning index 59 to [ 50. 53. 2.57305009 7. ]
assigning index 60 to [ 49. 57. 2.69459675 11. ]
assigning index 61 to [ 11. 54. 2.75669475 4. ]
assigning index 62 to [ 22. 27. 2.77163751 2. ]
assigning index 63 to [ 47. 55. 2.79303418 4. ]
assigning index 64 to [ 14. 60. 2.88015327 12. ]
assigning index 65 to [ 56. 59. 2.95413905 10. ]
assigning index 66 to [ 61. 65. 3.12615829 14. ]
assigning index 67 to [ 64. 66. 3.28846304 26. ]
assigning index 68 to [ 31. 58. 3.3282066 4. ]
assigning index 69 to [ 63. 67. 3.47397104 30. ]
assigning index 70 to [ 62. 68. 3.63807605 6. ]
assigning index 71 to [ 1. 69. 4.09465969 31. ]
assigning index 72 to [ 70. 71. 4.74129435 37. ]
So basically, there are 37 points in my data, indexed from 0 to 36. Each row in this list is one merge, and I assign row i the index i + len(thiscompletelist) + 1, so the first row (i = 0) gets index 37.
So, for example, when id 37 shows up again in a later row, that means the cluster created in the first row is itself being linked into a larger branch.
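A minimal sketch of that numbering rule in Python, assuming the rows are available as (node1, node2, dist, num_elems) tuples. The names `assign_cluster_ids` and `n_points` are hypothetical, not part of any library; with the full list of 36 rows, `n_points` is len(thiscompletelist) + 1 = 37.

```python
# Hypothetical helper: label each merge row with the id of the cluster it creates.
def assign_cluster_ids(merge_rows, n_points):
    """Return {new_cluster_id: (node1, node2, dist, num_elems)}."""
    clusters = {}
    for i, (a, b, dist, size) in enumerate(merge_rows):
        # Row i creates cluster n_points + i (37, 38, ... for 37 points).
        clusters[n_points + i] = (int(a), int(b), dist, int(size))
    return clusters

# First three rows of the list above.
rows = [
    (16., 26., 1.14749118, 2.),
    (4., 7., 1.20402602, 2.),
    (13., 29., 1.44708015, 2.),
]
ids = assign_cluster_ids(rows, n_points=37)
print(ids[37])  # (16, 26, 1.14749118, 2)
```

Any id below 37 in the first two columns is an original data point; any id of 37 or above refers back to an earlier row.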
I used MATLAB to generate this image, but I want to query this information as query_node(node_id)
so that it returns the structure below that node, level by level. For example, on query_node(37)
I would get something like
{"left": {"level": 1, "id": 28}, "right": {"level": 0, "left": {"id": 16}, "right": {"id": 26}}}
Actually, I don't even know what the right data structure for this is. Basically, I want to query by node and get some insight into what the structure of this dendrogram looks like when I am standing on that node and looking down. :(
EDIT 1:
*Oh, I didn't know that you won't be able to zoom the image. Basically, the fourth element from the left is 28, and the green entry is the first row of the data.
So the fourth vertical line on the dendrogram represents 28,
the line next to it (the first green line) represents 16,
and the line next to that (the second green line) represents 26.*
Upvotes: 0
Views: 102
Reputation: 1586
Well, it's always good to build on something that already exists, so take a look at `dendrogram` in scipy (`scipy.cluster.hierarchy.dendrogram`).
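The same scipy module also has `to_tree`, which turns a linkage matrix into nested `ClusterNode` objects with `left` and `right` children. If you would rather see the idea without scipy, here is a rough, dependency-free sketch. It assumes you keep a dict from each merged id to its two children (the three entries below are copied from rows 37, 58 and 68 of your list); `query_node` and `children` are illustrative names only, and the output layout is my guess at what you want rather than a fixed format.

```python
def query_node(node_id, children):
    """Return everything below node_id as nested dicts."""
    if node_id not in children:
        # Not a merged cluster, so it is an original data point (a leaf).
        return {"id": node_id}
    left, right = children[node_id]
    return {
        "id": node_id,
        "left": query_node(left, children),
        "right": query_node(right, children),
    }

# Merged id -> (node1, node2), taken from the question's listing.
children = {
    37: (16, 26),
    58: (28, 37),
    68: (31, 58),
}

subtree = query_node(58, children)
print(subtree)
```

Querying 58 gives 28 as a leaf on the left and cluster 37 (holding 16 and 26) on the right, which matches the three lines you describe in the figure. For all 37 points, fill `children` from every row of the list.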
Upvotes: 2