Reputation: 33
I have a dataframe with the columns below, and I want to find a clustering model that would reveal port clusters so that we can then take some strategic actions. Given that most of my features are categorical and the only numerical ones are the coordinates, which clustering algorithm would be the best way to go: K-Prototypes or DBSCAN? And should I calculate my own dissimilarity matrix, using haversine distance for the coordinates and Gower distance for the categorical values? I am a bit confused about this. Thanks!
port_call_id object
voyage_id object
vessel_id object
port_id object
creation_date object
pre_post_event object
port object
longitude float64
latitude float64
is_terminal_required bool
is_berth_required bool
state object
unlocode object
country object
continent object
timezone_id object
voyage_type object
buyer_id object
vessel_type object
vessel_size object
in Red Sea object
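To make the custom-matrix idea concrete, here is a minimal sketch of what I have in mind. It is only an illustration under several assumptions: the three rows are made-up examples with approximate coordinates, `mixed_dissimilarity` and `geo_weight` are hypothetical names, the categorical part is plain simple matching (the Gower treatment of categorical features), and the haversine part is scaled by the largest distance in the sample so that both parts lie in [0, 1].

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mixed_dissimilarity(rows, cat_cols, geo_weight=0.5):
    """Weighted mix of scaled haversine distance and categorical mismatch.

    geo_weight is an assumed tuning knob, not a recommended value.
    """
    n = len(rows)
    geo = [[haversine_km(a["latitude"], a["longitude"],
                         b["latitude"], b["longitude"]) for b in rows]
           for a in rows]
    # Scale by the largest pairwise distance; fall back to 1 if all points coincide.
    max_geo = max(max(r) for r in geo) or 1.0
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Simple matching: share of categorical columns that differ.
            mismatches = sum(rows[i][c] != rows[j][c] for c in cat_cols)
            cat = mismatches / len(cat_cols)
            d = geo_weight * geo[i][j] / max_geo + (1 - geo_weight) * cat
            dist[i][j] = dist[j][i] = d
    return dist


# Made-up sample rows, using column names from the dataframe above.
rows = [
    {"port": "Rotterdam", "latitude": 51.95, "longitude": 4.14,
     "continent": "Europe", "vessel_type": "container"},
    {"port": "Antwerp", "latitude": 51.26, "longitude": 4.40,
     "continent": "Europe", "vessel_type": "container"},
    {"port": "Singapore", "latitude": 1.26, "longitude": 103.82,
     "continent": "Asia", "vessel_type": "tanker"},
]

dist = mixed_dissimilarity(rows, cat_cols=["continent", "vessel_type"])
for row in dist:
    print([round(v, 3) for v in row])
```

As far as I understand, a square matrix like `dist` could then be passed to `sklearn.cluster.DBSCAN(metric="precomputed")`, with `eps` and `min_samples` still to be tuned. The matrix is O(n²) in memory, so this may only be practical after deduplicating to one row per port.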
Upvotes: 0
Views: 26