sidra
sidra

Reputation: 21

Clustering techniques for Binary Data

I want to use clustering techniques for binary data analysis. I have collected the data through survey in which i asked the users to select exactly 20 features out of list of 94 product features. The columns in my data represents the 94 product features and the rows represents the participants. I am trying to cluster the similar users in different user groups based on the product features they selected. Each user cluster should also tell me the product features associated with each cluster. I am using some open source clustering tools like NCSS and JMP. I was trying to use fuzzy clustering technique for achieveing my goal but unfortunately these tools do not deal with binary data. Can you please suggest me which technique would really be appropriate for my tasks , also which online tool i can use for using the cluster analysis on my data? As beacuse of the time limitation, I am not looking to code myself and i am only looking for some open source tools that have all the functionality available in them which i can use as it is.

Upvotes: 2

Views: 1523

Answers (1)

Has QUIT--Anony-Mousse
Has QUIT--Anony-Mousse

Reputation: 77505

Clustering for binary data is not really well defined.

Rather than looking for some tool/function that may or may not work by trial and error, you should first try to answer a 'simple" question:

What is a good cluster, mathematically?

Vague terms not allowed. The next questions to answer then are: I) when is clustering A better than clustering B (I.e. how does the computer compute quality), and ii) how can this be found efficiently.

You won't get far if you don't understand what you are doing just by calling random functions...

Also, is clustering actually what you are looking for? Most of the time with binary data e.g. frequent itemset mining is the better choice.

Upvotes: 3

Related Questions