pan325

Reputation: 33

Clustering model selection

I have a dataframe with the columns below, and I want to find a clustering model that would reveal port clusters so that I can then take some strategic actions. Given that most of my columns are categorical and the only numerical ones are the coordinates, which clustering algorithm would be the best way to go: K-Prototypes or DBSCAN? And should I compute my own dissimilarity matrix, using haversine distance for the coordinates and Gower distance for the categorical values? I am a bit confused about this. Thanks!

port_call_id object
voyage_id object
vessel_id object
port_id object
creation_date object
pre_post_event object
port object
longitude float64
latitude float64
is_terminal_required bool
is_berth_required bool
state object
unlocode object
country object
continent object
timezone_id object
voyage_type object
buyer_id object
vessel_type object
vessel_size object
in Red Sea object
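
To make the dissimilarity-matrix idea concrete, here is a rough sketch of what I have in mind, not a tested solution. It uses only NumPy. The function names, the `geo_weight` parameter, the choice to scale haversine distances by the largest pairwise distance, and the three toy rows are all my own assumptions for illustration; the categorical part is the simple-matching term that Gower distance uses for categorical features.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_matrix(lat_deg, lon_deg):
    """Pairwise great-circle distances in km between all points."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    # Clip guards against tiny floating-point overshoots outside [0, 1].
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def categorical_mismatch_matrix(columns):
    """Fraction of categorical columns on which each pair of rows differs."""
    cols = [np.asarray(c) for c in columns]
    n = len(cols[0])
    total = np.zeros((n, n))
    for c in cols:
        total += (c[:, None] != c[None, :]).astype(float)
    return total / len(cols)


def combined_dissimilarity(lat, lon, cat_columns, geo_weight=0.5):
    """Weighted mix of scaled haversine and categorical mismatch, in [0, 1].

    Assumption: the geographic part is divided by the largest pairwise
    distance so that both parts live on the same 0..1 scale.
    """
    geo = haversine_matrix(lat, lon)
    scale = geo.max()
    geo_norm = geo / scale if scale > 0 else geo
    cat = categorical_mismatch_matrix(cat_columns)
    return geo_weight * geo_norm + (1.0 - geo_weight) * cat


# Toy rows, made up for illustration only (not taken from my data).
latitude = [29.9, 30.0, 1.3]
longitude = [32.5, 32.6, 103.8]
vessel_type = ["tanker", "tanker", "bulk"]
continent = ["Africa", "Africa", "Asia"]

D = combined_dissimilarity(latitude, longitude, [vessel_type, continent])
```

A matrix like `D` could then be handed to an algorithm that accepts precomputed distances, for example scikit-learn's `DBSCAN` with `metric="precomputed"`, whereas K-Prototypes works on the raw mixed features rather than on a distance matrix.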

Upvotes: 0

Views: 26

Answers (0)
