Reputation: 139
I have an array of strings, I need to compare this to another array of strings, but they may be in a different order. What's the best way to compare the two arrays?
This is what I have so far, just wondering if there is a simpler / more efficient way I'm missing.
func unorderedEqual(first, second []string) bool {
if len(first) != len(second) {
return false
}
for _, value := range first {
if !contains(value, second) {
return false
}
}
return true
}
func contains(needle string, haystack []string) bool {
for _, matchValue := range haystack {
if matchValue == needle {
return true
}
}
return false
}
Upvotes: 3
Views: 12695
Reputation: 101
Ideally, this would have been a comment, but the TLDR is it is both more correct and faster to use Husain's sort and compare.
Details.
For anyone looking at the RayfenWindspear answer above (currently highest ranked), at first glance it looks right, but be aware that it does not check for exact equal-ness, but rather only that every element in the second list is in the first list. The converse also needs to be true, but is not checked by this method. Hence it fails this test (bb
is repeated).:
assert.False(t, unorderedEqual([]string{"aa", "cc", "bb"}, []string{"aa", "bb", "bb"}))
Of course you can just run it twice to get the correct result, which is only a linear factor
func DoubleUnorderedEqual(a, b []string) bool {
return unorderedEqual(a, b) && unorderedEqual(b, a)
}
The suggestion for the sort-then-check by Husain should possibly be ranked higher, because it is correct, and is also faster for larger lists.
Here's Husain's code in an exported function:
func SortCompare(a, b []string) bool {
if len(a) != len(b) {
return false
}
sort.Strings(a)
sort.Strings(b)
for i, v := range a {
if v != b[i] {
return false
}
}
return true
}
And some tests you can run on it (which it passes)
func TestSortCompare(t *testing.T) {
assert.True(t, SortCompare([]string{"aa"}, []string{"aa"}))
assert.False(t, SortCompare([]string{"aa"}, []string{"bb"}))
assert.False(t, SortCompare([]string{"aa"}, []string{"bb", "cc"}))
assert.True(t, SortCompare([]string{"cc", "bb"}, []string{"bb", "cc"}))
assert.True(t, SortCompare([]string{"aa", "cc", "bb"}, []string{"aa", "bb", "cc"}))
assert.False(t, SortCompare([]string{"aa", "cc", "bb"}, []string{"aa", "bb", "bb"}))
}
And here is some sample benchmarking ....
func getStrings() ([]string, []string) {
a := []string{"a", "foo", "bar", "ping", "pong"}
b := []string{"pong", "foo", "a", "bar", "ping"}
return a, b
}
func BenchmarkSortCompare(b *testing.B) {
s0, s1 := getStrings()
var outcome bool
for n := 0; n < b.N; n++ {
outcome = SortCompare(s0, s1)
}
fmt.Println(outcome)
}
func BenchmarkDoubleUnorderedEqual(b *testing.B) {
s0, s1 := getStrings()
var outcome bool
for n := 0; n < b.N; n++ {
outcome = DoubleUnorderedEqual(s0, s1)
}
fmt.Println(outcome)
}
With the results ...
BenchmarkSortCompare-32 2637952 498 ns/op
BenchmarkDoubleUnorderedEqual-32 3060261 381 ns/op
So running the map method twice is slightly faster at this small size... but add a few more strings and the sort method wins out by a factor of 10. I did not take into account the impact of the degree of disorder in the strings but they are sufficiently disordered it is not an obviously unfair test at first glance.
func getStrings2() ([]string, []string) {
a := []string{"a", "foo", "bar", "ping", "pong", "b", "c", "g", "e", "f", "d", "h", "i", "j", "q", "l", "n", "o", "p", "k", "r", "s", "t", "u", "v", "w", "x", "y", "z"}
b := []string{"pong", "foo", "a", "bar", "ping", "p", "r", "q", "s", "u", "t", "v", "j", "x", "y", "z", "b", "e", "d", "c", "h", "g", "f", "i", "w", "k", "l", "n", "o"}
return a, b
}
func BenchmarkSortCompare2(b *testing.B) {
s0, s1 := getStrings2()
var outcome bool
for n := 0; n < b.N; n++ {
outcome = SortCompare(s0, s1)
}
fmt.Println(outcome)
}
func BenchmarkDoubleUnorderedEqual2(b *testing.B) {
s0, s1 := getStrings2()
var outcome bool
for n := 0; n < b.N; n++ {
outcome = DoubleUnorderedEqual(s0, s1)
}
fmt.Println(outcome)
}
The results:
BenchmarkSortCompare2-32 454641 2797 ns/op
BenchmarkDoubleUnorderedEqual2-32 44420 26714 ns/op
Conclusion - I'll be using Husain's sort and compare ....
Upvotes: 5
Reputation: 6274
Given that you are doing a length check, I'm going to go with the assumption that implies that they are 1:1, just ordered differently.
You can do this in one pass (each) using a map[string]bool
to check existence in both. This utilizes the fact that the map
returns the zero value of a bool
, which is false
, when the key is not present.
Disclaimer: Technically this is order O(n)*O(map). The Go Programming Language Specification does not make any performance guarantees for map types.
https://play.golang.org/p/2LUjN5LkXLL
func unorderedEqual(first, second []string) bool {
if len(first) != len(second) {
return false
}
exists := make(map[string]bool)
for _, value := range first {
exists[value] = true
}
for _, value := range second {
if !exists[value] {
return false
}
}
return true
}
If you want to get nit-picky about memory usage, you could save yourself storing a bunch of bool
s (which is usually negligible, but to each their own) by using a map[string]struct{}
(the empty struct), and you just check existence a little differently, as in this example.
https://play.golang.org/p/MjwP_wCtYZV
Set
exists[value] = struct{}{}
Check
if _, ok := exists[value]; !ok {
return false
}
Upvotes: 7
Reputation: 1213
Generic, language agnostic:
I don't know GO, but you seem to be naively searching for each element from A in B. In worst case scenario you get many many iterations over B. Sorting with performant algorithm seems to be way more efficient even though it's additional operation.
I unfortunately will not provide code sample, as I don't know GO.
Upvotes: 1
Reputation: 845
You can sort the arrays first and then check by index:
package main
import (
"fmt"
"sort"
)
func main() {
s1 := []string{"first", "second", "third"}
s2 := []string{"third", "first", "second"}
if len(s1) != len(s2) {
fmt.Println("not equal")
}
sort.Strings(s1)
sort.Strings(s2)
for i := 0; i < len(s1); i++ {
if s1[i] != s2[i] {
fmt.Println("not equal")
return
}
}
fmt.Println("equal")
}
Or in playground.
The idea with sorting is that it makes the comparison easier and faster. Sorting and then comparing index-wise is O(n log n), while 2-loop checking takes O(n^2).
Upvotes: 1
Reputation:
Using this dedicated package, you would have to use []interface{}
instead of []string
then proceed as such
package main
import (
"fmt"
"github.com/deckarep/golang-set"
)
func main() {
some := []interface{}{"a", "b", "c"}
other := []interface{}{"a", "b", "d"}
fmt.Println(
mapset.NewThreadUnsafeSetFromSlice(some).
Difference(mapset.NewThreadUnsafeSetFromSlice(other)).Cardinality() == 0,
)
// Output: false
}
Upvotes: 0