Nurul Magfirah
How to get the same output from dtw() + pam() and from tsclust()?

I keep getting different results from the two code snippets below:

Syntax 1

library(dtw)      # dtw(), symmetric1
library(cluster)  # pam(), silhouette()

# Pairwise DTW distance matrix between the rows of data_wide
dtw_distances <- matrix(0, nrow = nrow(data_wide), ncol = nrow(data_wide))
rownames(dtw_distances) <- rownames(data_wide)
colnames(dtw_distances) <- rownames(data_wide)
for (i in 1:(nrow(data_wide) - 1)) {
  for (j in (i + 1):nrow(data_wide)) {
    # as.numeric() gives a plain numeric vector for both matrix and data frame rows
    ts1 <- as.numeric(data_wide[i, ])
    ts2 <- as.numeric(data_wide[j, ])
    alignment <- dtw(ts1, ts2, step.pattern = symmetric1, dist.method = "manhattan", keep.internals = TRUE)
    dtw_distances[i, j] <- alignment$distance
    dtw_distances[j, i] <- alignment$distance
  }
}

# Mean silhouette width of the k-medoids solution for k = 2..10
ks_kmedoids <- data.frame(k = integer(), sil_kmedoids = numeric())
silhouette_scores_kmedoids <- numeric()
for (k in 2:10) {
  kmedoids_result <- pam(dtw_distances, k)
  silhouette_scores_kmedoids[k] <- mean(silhouette(kmedoids_result$clustering, dtw_distances)[, 3])
  ks_kmedoids <- rbind(ks_kmedoids, data.frame(k = k, sil_kmedoids = silhouette_scores_kmedoids[k]))
}
print(ks_kmedoids)

Syntax 2

library(dtwclust)  # tsclust()

ks_kmedoids <- data.frame(k = integer(), sil_kmedoids = numeric())
silhouette_scores_kmedoids <- numeric()
for (k in 2:10) {
  kmedoids_result <- tsclust(data_wide, type = "partitional", k = k,
                             distance = "dtw_basic",
                             dist_args = list(method = "manhattan", step.pattern = symmetric1),
                             centroid = "pam", seed = 123)
  # The silhouette is computed on dtw_distances, the matrix built in Syntax 1
  silhouette_scores_kmedoids[k] <- mean(silhouette(kmedoids_result@cluster, dtw_distances)[, 3])
  ks_kmedoids <- rbind(ks_kmedoids, data.frame(k = k, sil_kmedoids = silhouette_scores_kmedoids[k]))
}
print(ks_kmedoids)

Since I used the same parameters in both, namely Manhattan as the local distance and the symmetric1 step pattern, I expected the results to be identical, but they are not.
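One thing that may be worth checking in Syntax 1 is how pam() interprets its first argument. According to the cluster documentation, `diss` defaults to `inherits(x, "dist")`, so a plain matrix is treated as raw observations rather than as dissimilarities. The sketch below uses a made-up toy matrix (`d_mat` and `pts` are hypothetical stand-ins, not the data from this question) to compare both interpretations. It is only a diagnostic; a separate possible source of difference is that tsclust() starts from randomly chosen centroids (hence its `seed` argument), which this sketch does not cover.

```r
library(cluster)

# Hypothetical toy data standing in for dtw_distances
set.seed(1)
pts <- matrix(rnorm(20), nrow = 10)
d_mat <- as.matrix(dist(pts, method = "manhattan"))

# Plain matrix: pam() treats each row as an observation with 10 variables
fit_as_data <- pam(d_mat, k = 3)

# dist object: pam() uses the entries as pairwise dissimilarities
fit_as_diss <- pam(as.dist(d_mat), k = 3)

# Cross-tabulate the two clusterings to see whether they agree
print(table(as_data = fit_as_data$clustering,
            as_diss = fit_as_diss$clustering))
```

If the two clusterings disagree on the real matrix as well, passing `as.dist(dtw_distances)` (or `diss = TRUE`) to pam() would make Syntax 1 cluster on the DTW distances themselves.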

Note: data_wide contains time series data that have been standardized. The structure: my data
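To narrow things down, it might help to first confirm that the two packages return the same distance for a single pair of series. The sketch below is a minimal check on two made-up series (`x` and `y` are hypothetical, standing in for two rows of data_wide). It assumes that the Manhattan local distance corresponds to `norm = "L1"` in dtw_basic(), which is how I read the dtwclust documentation; whether `method = "manhattan"` inside `dist_args` has any effect is not verified here.

```r
library(dtw)       # dtw(), symmetric1
library(dtwclust)  # dtw_basic()

# Hypothetical toy series standing in for two rows of data_wide
set.seed(42)
x <- cumsum(rnorm(30))
y <- cumsum(rnorm(30))

# Distance as computed in Syntax 1
d_dtw <- dtw(x, y, step.pattern = symmetric1, dist.method = "Manhattan")$distance

# Distance from the function behind distance = "dtw_basic" in tsclust()
d_basic <- dtw_basic(x, y, norm = "L1", step.pattern = symmetric1)

print(c(dtw = d_dtw, dtw_basic = d_basic))
print(isTRUE(all.equal(d_dtw, d_basic)))
```

If the two values agree, the distances are probably not the source of the discrepancy, and the clustering step is the more likely place to look.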

Could anybody tell me what causes the difference?

Thank you very much for your help!!
