How can I calculate the distance of a state within a cluster from the center of the cluster?

Question

I have a sample of 28 states. I want to plot them in one cluster, identify the center, and then calculate the distance of every state from the center, per year.

My input file resembles the following: first column: Country; second column: Year (from 2008 to 2017); third column: PI (index).

Question 1: When I run table_2008 = subset(table1, mydata.year ==2008), I get the error: Error in eval(e, x, parent.frame()) : object 'mydata.year' not found

Question 2: Which code is best suited to calculate the distance of a state from the center of the cluster?

Please find my code below. I hope someone can help.

Thank you.

Code:

heisenberg <- read.csv(file="C:/Users/TA/Desktop/R4./PI4.csv",head=TRUE,sep=",")
rm(list=ls())

mydata = read.csv("C:/Users/TA/Desktop/R4./PI4.csv",sep = ",", header=TRUE)

mydata$Country
mydata$Category
mydata$PI

data_cluster = data.frame(mydata$Country,mydata$Category,mydata$PI)

write.csv(data_cluster,"C:/Users/TA/Desktop/R4./OutputPI.csv", row.names = FALSE)


table1 = data_cluster



#plot(uk_line[,4])
table1 = na.omit(table1)

within_results = ts(,start = c(2008), end = c(2017), frequency = 1)
within_resultsbetweenss = ts(,start = c(2008), end = c(2017), frequency = 1)
within_results_withinss = matrix(data= NA, nrow = 10, ncol = 4) 
#nrow = years, ncols = number of clusters

#colnames(mydata, c("Country","Year"))

#YEAR 2008
#SELECTING A GIVEN YEAR (subset of rows such that year = 2008)
table_2008 = subset(table1, mydata.year ==2008)
table_2008


data2008_clus = table_2008[,3:ncol(table_2008)]

#NAMING THE ROWS USING THE COUNTRY NAMES
rownames(data2008_clus) = table_2008$mydata.Country

data2008_clus


plot(table_2008)

wss <- (nrow(data2008_clus)-1)*sum(apply(data2008_clus,2,var))
for (i in 2:15) wss[i] <- sum(kmeans(data2008_clus,
                                     centers=i)$withinss)
plot(1:15, wss, type="b", xlab="Number of Clusters",
     ylab="Within groups sum of squares")




# Compute k-means with k = 1


fit1=kmeans(x = data2008_clus,centers = 1)
fit1$cluster
library(factoextra) # provides fviz_cluster()
fviz_cluster(fit1,data = data2008_clus)
fit1$withinss
fit1$totss
fit1$betweenss
table_2008$cluster = factor(fit1$cluster)
centers=as.data.frame(fit1$centers)
table_2008

within_results[1] = fit1$totss
within_resultsbetweenss[1] = fit1$betweenss
within_results_withinss[1,] = fit1$withinss

plot(within_results)
plot(within_resultsbetweenss)
plot(within_results_withinss)

# Print the results 
print(fit1)
table_2008

mydata_struct = structure(
  list(
    Year = c(2008L, 2008L, 2008L, 2008L, 2008L, 2008L),
    Country = structure(
      1:6,
      .Label = c(
        "Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czechia",
        "Denmark", "Estonia", "Finland", "France", "Germany", "Greece",
        "Hungary", "Ireland", "Italy", "Latvia", "Lithuania", "Luxembourg",
        "Malta", "Netherlands", "Poland", "Portugal", "Romania", "Slovakia",
        "Slovenia", "Spain", "Sweden", "United Kingdom"
      ),
      class = "factor"
    ),
    Prosperity.Index = c(79.4, 76.1, 62, 65.1, 69.9, 70.9)
  ),
  row.names = c(NA, 6L),
  class = "data.frame"
)
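
A note on Question 1, offered as a sketch rather than a confirmed diagnosis: data.frame() builds column names from the expressions passed to it, so data.frame(mydata$Country, ...) produces names such as mydata.Country. No column called mydata.year is ever created in the script above, which would explain the "object not found" error. The example below uses a small stand-in data frame; the column names Country, Year and PI are assumed from the description of the input file, and the 2009 values are placeholders, not real data.

```r
# Stand-in for the CSV. Column names are assumed from the question's description;
# the 2009 values are placeholders for illustration only.
mydata <- data.frame(
  Country = c("Austria", "Belgium", "Austria", "Belgium"),
  Year    = c(2008L, 2008L, 2009L, 2009L),
  PI      = c(79.4, 76.1, 80.0, 75.5)
)

# Unnamed arguments: names are derived from the expressions, with "$" turned into "."
data_cluster <- data.frame(mydata$Country, mydata$Year, mydata$PI)
print(names(data_cluster))

# Naming the columns explicitly gives short, predictable names
table1 <- data.frame(Country = mydata$Country, Year = mydata$Year, PI = mydata$PI)

# subset() can now refer to the Year column directly
table_2008 <- subset(table1, Year == 2008)
print(table_2008)
```

R is case-sensitive, so Year and year are different names; checking names(table1) before calling subset() shows exactly what is available.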


Answers (1)
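
One possible approach to Question 2, as a minimal sketch under stated assumptions: fit kmeans() separately for each year, look up the center assigned to each row, and take the Euclidean distance between the row and that center. The data frame below reuses the six 2008 values from the question; the helper name dist_to_center and the column names Country, Year and PI are hypothetical choices made for this illustration.

```r
# Six 2008 rows taken from the question's sample; column names are assumed
df <- data.frame(
  Year    = rep(2008L, 6),
  Country = c("Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus", "Czechia"),
  PI      = c(79.4, 76.1, 62, 65.1, 69.9, 70.9)
)

# Hypothetical helper: Euclidean distance of each row to its own cluster center
dist_to_center <- function(x, k = 1) {
  x <- as.matrix(x)                                     # kmeans() expects a numeric matrix
  fit <- kmeans(x, centers = k)
  assigned <- fit$centers[fit$cluster, , drop = FALSE]  # one center row per observation
  sqrt(rowSums((x - assigned)^2))
}

# Compute the distances within each year
df$dist <- ave(df$PI, df$Year, FUN = function(v) dist_to_center(v, k = 1))
print(df)
```

With a single cluster and a single variable, the center is simply the yearly mean, so the result matches abs(PI - mean(PI)) computed per year. The helper only becomes necessary when there are several variables or more than one cluster.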
