ulrich_k

Reputation: 11

R: Implement while loop to stop when results don't change value anymore

I'm currently working on an R script for the c-means clustering method. I started with a rather simple version to get the basic structure done. The idea is to cluster the values into n classes.

I have a vector of 8 values and I pick two to be my first candidates.

values <- c(4,8,12,5,9,30,75,13)
candidates <- c(values[1],values[6])

Then each element of "values" shall be assigned to a group according to its distance from the candidates. I'm not sure if my version is the most elegant one, but it seems to be working.

If an element is closer to candidate one, it goes into group.1; otherwise it goes into group.2. In each case, the group that the value is not a part of gets a placeholder (-999 in the code below), which is removed afterwards.

After going through all the elements of "values", the mean value of each group is calculated and becomes the new candidate, and the process is repeated. In this case it runs 10 times, because that is what I hard-coded in the outer loop.

The idea is that, in the end, you get the same values over and over again. Those values are the centers of the clusters.

group.2 <- 0
group.1 <- 0
for(j in 1:10){
  for(i in 1:length(values)){
    if( abs(candidates[1]-values[i]) < abs(candidates[2]-values[i]) ){
      group.2[i] <- -999
      group.1[i] <- values[i]
    } else if( abs(candidates[1]-values[i]) > abs(candidates[2]-values[i]) ) {
      group.1[i] <- -999
      group.2[i] <- values[i]
    }
  }
  group.1 <- group.1[!group.1==-999]
  group.2 <- group.2[!group.2==-999]

  candidates <- c(mean(group.1), mean(group.2))
  print(candidates)
}

If you look at the output, you'll see that you actually get the final centers of the clusters after the second repetition.

What I can't figure out is how to make the loop stop, as soon as the results aren't changing anymore.

My idea is to add another loop which terminates the process as soon as

candidates[j]==candidates[j-1]

however I can't figure out how to access the previous value j-1 of the loop.

Upvotes: 0

Views: 804

Answers (2)

Roland

Reputation: 132706

Better use vectorization and write a function:

values <- c(4,8,12,5,9,30,75,13)
candidates <- c(values[1],values[6])

cmeans <- function(values, candidates, maxiter=10, tol = .Machine$double.eps ^ 0.5, verbose=TRUE) {

  for (j in seq_len(maxiter)) {
    divide <- abs(candidates[1]-values) <= abs(candidates[2]-values)
    group.1 <- values[divide]
    group.2 <- values[!divide]
    candidates.new<- c(mean(group.1), mean(group.2))
    # max(): stop only when *every* center has stopped moving, not just one of them
    if (max(abs(candidates.new-candidates)) < tol) {
      return(candidates.new)
    } else {
      if (verbose) message(paste(candidates.new, collapse=", "))  
      candidates <- candidates.new
    }
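  }
  # Reached only if maxiter ran out before the centers settled
  warning("no convergence within maxiter iterations")
  candidates
}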

cmeans(values, candidates)
#8.5, 52.5
#11.5714285714286, 75
#[1] 11.57143 75.00000
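
Since the question title asks for a while loop, here is a minimal sketch of the same idea written with `while` and generalized to any number of starting centers. The function name `cmeans_while`, its arguments, and the returned list are illustrative assumptions, not part of the code above. It also assumes no cluster ever becomes empty; an empty cluster would give an NA center and make the loop condition fail.

```r
# Hypothetical helper: one-dimensional clustering that stops on convergence.
cmeans_while <- function(values, centers, maxiter = 100,
                         tol = sqrt(.Machine$double.eps)) {
  iter <- 0
  converged <- FALSE
  while (!converged && iter < maxiter) {
    iter <- iter + 1
    # Distance matrix: one row per value, one column per center
    d <- abs(outer(values, centers, "-"))
    # Index of the nearest center for each value (first one wins on ties)
    cluster <- apply(d, 1, which.min)
    # New center = mean of the values assigned to it
    new.centers <- as.numeric(
      tapply(values, factor(cluster, levels = seq_along(centers)), mean)
    )
    # Converged when no center moved by tol or more
    converged <- all(abs(new.centers - centers) < tol)
    centers <- new.centers
  }
  if (!converged) warning("no convergence within maxiter iterations")
  list(centers = centers, cluster = cluster, iterations = iter)
}

res <- cmeans_while(c(4, 8, 12, 5, 9, 30, 75, 13), c(4, 30))
print(res$centers)
```

The loop condition carries both stopping rules, so no `break` is needed: it ends either when the centers stop moving or when `maxiter` is exhausted.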

Upvotes: 1

Christopher Louden

Reputation: 7592

You will need to create a new variable, say old, at the beginning of each loop iteration that is set equal to candidates. Then, after updating candidates, compare the two and break out of the loop if they are equal.

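candidates <- c(values[1], values[6])  # You have to initialize it before the loop
for (j in 1:10) {
  old <- candidates
  # Do stuff
  candidates <- c(mean(group.1), mean(group.2))
  # all() collapses the element-wise comparison to a single TRUE/FALSE
  if (all(old - candidates == 0)) break
}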

A better way would be to check whether all(abs(old - candidates) < tol) for some small value of tol, because exact equality between floating-point means is fragile.
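
As a rough, self-contained sketch of that tolerance check on the data from the question: the value of `tol` and the vectorized group assignment (with ties going to group 1) are assumptions made for illustration, not something taken from the original post.

```r
values <- c(4, 8, 12, 5, 9, 30, 75, 13)
candidates <- c(values[1], values[6])
tol <- 1e-8  # assumed tolerance; pick whatever suits your data

for (j in 1:10) {
  old <- candidates
  # TRUE where a value is at least as close to candidate 1 as to candidate 2
  in.group.1 <- abs(old[1] - values) <= abs(old[2] - values)
  candidates <- c(mean(values[in.group.1]), mean(values[!in.group.1]))
  print(candidates)
  # Stop as soon as neither center moved by tol or more
  if (all(abs(old - candidates) < tol)) break
}
```

After the loop, `j` holds the number of iterations that actually ran, which is a cheap way to see whether the loop stopped early or used all 10 passes.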

Upvotes: 2
