jwesonga
jwesonga

Reputation: 4383

Compare two slices

Is there a way in Go to compare two slices and get the elements in slice X that are not in slice Y and vice versa?

    X := []int{10, 12, 12, 12, 13}
    Y := []int{12, 14, 15}

func compare(X, Y []int)  

calling compare(X, Y)   
    result1 := []int{10, 12, 12, 13} // if you're looking for elements in slice X that are not in slice Y

calling compare(Y, X)
    result2 := []int{14, 15} // if you're looking for elements in slice Y that are not in slice X

Upvotes: 15

Views: 27432

Answers (6)

blackgreen
blackgreen

Reputation: 44645

If you don't want to reinvent the wheel (and then maintain it) with custom algos, you can use github.com/r3labs/diff/v2.

The diff returns a changelog that you can inspect to find exactly what changed, in the form of create/update/delete entries. The changelog can also be marshalled to JSON:

package main

import (
    "encoding/json"
    "github.com/r3labs/diff/v2"
    "os"
)

func main() {
    x := []int{10, 12, 12, 12, 13}
    y := []int{12, 14, 15}
    ch, _ := diff.Diff(x, y)
    json.NewEncoder(os.Stdout).Encode(ch)
}

It prints the changelog:

[{"type":"delete","path":["0"],"from":10,"to":null},{"type":"update","path":["2"],"from":12,"to":15},{"type":"delete","path":["3"],"from":12,"to":null},{"type":"delete","path":["4"],"from":13,"to":null},{"type":"create","path":["1"],"from":null,"to":14}]

The changelog essentially describes the operations necessary to obtain the second slice from the first.

The library also provides patch-like functionality. You can pass the changelog right into diff.Patch and apply the changes to the target value:

    diff.Patch(ch, &x)
    fmt.Println(x) // [12 14 15] just like the 'y' slice

Note that the default behavior of this differ is to ignore slice element order. So if you diff []int{12, 14, 15} and []int{12, 15, 14} it will produce an empty changelog.

To account for item ordering, use the option diff.SliceOrdering(true):

func main() {
    x := []int{12, 15, 14}
    y := []int{12, 14, 15}
    ch, _ := diff.Diff(x, y, diff.SliceOrdering(true))
    json.NewEncoder(os.Stdout).Encode(ch)
    // [{"type":"update","path":["1"],"from":15,"to":14},{"type":"update","path":["2"],"from":14,"to":15}]

}

Upvotes: 0

Tuan LE CONG
Tuan LE CONG

Reputation: 387

At the moment I answer this question, go version is 1.17.3

We can use a built-in package: reflect

The package provides a very useful function DeepEqual that fits your requirement

reflect.DeepEqual(X, Y)

Upvotes: -1

Laremere
Laremere

Reputation: 41

All of the solutions provided fail to precisely answer the question asked. Instead of the differences in the slices, the solutions provide the differences of the sets of elements in the slices.

Specifically, instead of the intended example of:

    X := []int{10, 12, 12, 12, 13}
    Y := []int{12, 14, 15}

func compare(X, Y []int)  

calling compare(X, Y)   
    result1 := []int{10, 12, 12, 13} // if you're looking for elements in slice X that are not in slice Y

calling compare(Y, X)
    result2 := []int{14, 15}

The provided solutions will result in:

result1 := []int{10,13}
result2 := []int{14,15}

To yield strictly the example result, a different method is required. Here are two solutions:

If the slices are already sorted:

This solution may be faster if you sort your slices, and then call compare. It'll definitely be faster if your slices are already sorted.

func compare(X, Y []int) []int {
    difference := make([]int, 0)
    var i, j int
    for i < len(X) && j < len(Y) {
        if X[i] < Y[j] {
            difference = append(difference, X[i])
            i++
        } else if X[i] > Y[j] {
            j++
        } else { //X[i] == Y[j]
            j++
            i++
        }
    }
    if i < len(X) { //All remaining in X are greater than Y, just copy over
        finalLength := len(X) - i + len(difference)
        if finalLength > cap(difference) {
            newDifference := make([]int, finalLength)
            copy(newDifference, difference)
            copy(newDifference[len(difference):], X[i:])
            difference = newDifference
        } else {
            differenceLen := len(difference)
            difference = difference[:finalLength]
            copy(difference[differenceLen:], X[i:])
        }
    }
    return difference
}

Go Playground version

Unsorted Version using maps

func compareMapAlternate(X, Y []int) []int {
    counts := make(map[int]int)
    var total int
    for _, val := range X {
        counts[val] += 1
        total += 1
    }
    for _, val := range Y {
        if count := counts[val]; count > 0 {
            counts[val] -= 1
            total -= 1
        }
    }
    difference := make([]int, total)
    i := 0
    for val, count := range counts {
        for j := 0; j < count; j++ {
            difference[i] = val
            i++
        }
    }
    return difference
}

Go Playground version

Edit: I've created a benchmark for testing my two version (with the map having a slight modification where it deletes zero values from the map). It won't run on Go Playground because Time doesn't work properly on it, so I ran it on my own computer.

compareSort sorts the slice and calls the iterated version of compare, and compareSorted runs afters compareSort but relies on the slice already being sorted.

Case: len(X)== 10000 && len(Y)== 10000
--compareMap time: 4.0024ms
--compareMapAlternate time: 3.0225ms
--compareSort time: 4.9846ms
--compareSorted time: 1ms
--Result length == 6754 6754 6754 6754
Case: len(X)== 1000000 && len(Y)== 1000000
--compareMap time: 378.2492ms
--compareMapAlternate time: 387.2955ms
--compareSort time: 816.5619ms
--compareSorted time: 28.0432ms
--Result length == 673505 673505 673505 673505
Case: len(X)== 10000 && len(Y)== 1000000
--compareMap time: 35.0269ms
--compareMapAlternate time: 43.0492ms
--compareSort time: 385.2629ms
--compareSorted time: 3.0242ms
--Result length == 3747 3747 3747 3747
Case: len(X)== 1000000 && len(Y)== 10000
--compareMap time: 247.1561ms
--compareMapAlternate time: 240.1727ms
--compareSort time: 400.2875ms
--compareSorted time: 17.0311ms
--Result length == 993778 993778 993778 993778

As you can see, if the array is sorted not using maps is much faster, but using maps is faster than sorting it and then using the iterated approach. For small cases sorting may be quick enough that it should be used, but the benchmark would finish to quickly to be timed.

Upvotes: 4

ctn
ctn

Reputation: 2930

Something like this should work:

package main

import "fmt"

func main() {
    X := []int{10, 12, 12, 12, 13}
    Y := []int{12, 14, 15}

    fmt.Println(compare(X, Y))
    fmt.Println(compare(Y, X))
}

func compare(X, Y []int) []int {
    m := make(map[int]int)

    for _, y := range Y {
        m[y]++
    }

    var ret []int
    for _, x := range X {
        if m[x] > 0 {
            m[x]--
            continue
        }
        ret = append(ret, x)
    }

    return ret
}

http://play.golang.org/p/4DujR2staI

Upvotes: 3

Ilia Choly
Ilia Choly

Reputation: 18557

Dragging in a set implementation might be overkill. This should be enough:

func diff(X, Y []int) []int {

  diff := []int{}
  vals := map[int]struct{}{}

  for _, x := range X {
    vals[x] = struct{}{}
  }

  for _, x := range Y {
    if _, ok := vals[x]; !ok {
      diff = append(diff, x)
    }
  }

  return diff
}

A bloom filter can be added if the slices get really big.

Upvotes: -1

Not_a_Golfer
Not_a_Golfer

Reputation: 49187

If order is not important, and the sets are large, you should use a set implementation, and use its diff function to compare them.

Sets are not part of the standard library, but you can use this library for example, you can initialize a set automatically from a slice with it. https://github.com/deckarep/golang-set

Something like this:

import (
    set "github.com/deckarep/golang-set"
    "fmt"
    )

func main() {
    //note that the set accepts []interface{}
    X := []interface{}{10, 12, 12, 12, 13}
    Y := []interface{}{12, 14, 15}

    Sx := set.NewSetFromSlice(X)
    Sy := set.NewSetFromSlice(Y)
    result1 := Sx.Difference(Sy)
    result2 := Sy.Difference(Sx)

    fmt.Println(result1)
    fmt.Println(result2)
}

Upvotes: 8

Related Questions