Ajey

Reputation: 8212

Golang: match domain names with a wildcard

I have a hostnameWhitelist map:

var hostnameWhitelist = map[string]bool{"test.mydomain.com": true, "test12.mydomaindev.com": true}

The way I check whether an incoming request's hostname is allowed is:

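    // A scheme is needed here: without one, url.Parse treats the input
    // as a path and Hostname() returns an empty string.
    url, errURL := url.Parse("https://test.mydomain.com")
    if errURL != nil {
        fmt.Println("error during parsing URL")
        return false
    }
    fmt.Println("HOSTNAME = " + url.Hostname())

    // Exact-match lookup against the whitelist.
    if ok := hostnameWhitelist[url.Hostname()]; ok {
        fmt.Println("valid domain, allow access")
    } else {
        fmt.Println("NOT valid domain")
        return false
    }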

While this works great, how do I do a wildcard match like these?

*.mydomain.com
*.mydomaindev.com

Both of these should pass, while

*.test.com
*.hello.com

should fail.

Upvotes: 2

Views: 12293

Answers (5)

Zombo

Reputation: 1

You can use fstest.MapFS like a Set data structure, with the added benefit of Glob matching:

package main

import "testing/fstest"

// Each test pairs a glob pattern with the number of whitelist entries it should match.
var tests = []struct {
   pat string
   res int
}{
   {"*.hello.com", 0},
   {"*.mydomain.com", 1},
   {"*.mydomaindev.com", 1},
   {"*.test.com", 0},
}

func main() {
   // The map keys act as the set of allowed hostnames; the file contents are unused.
   m := fstest.MapFS{"test.mydomain.com": nil, "test12.mydomaindev.com": nil}
   for _, test := range tests {
      a, e := m.Glob(test.pat)
      if e != nil {
         panic(e)
      }
      if len(a) != test.res {
         panic(len(a))
      }
   }
}

https://golang.org/pkg/testing/fstest

Upvotes: 1

Kjell Tore Fossbakk

Reputation: 51

Matching with the wildcard at the start of the pattern is expensive. Regex can be a performance problem, depending on the size of your dataset and how fast you need to evaluate against it. You could try a suffix tree, but I suspect performance could become a problem there too (I haven't tested it on our data).

One approach we use is building a radix trie (compact prefix trie) keyed on the signature domain name's labels in reverse order. Your signature domain *.foo.example.com becomes com.example.foo.*, which puts the wildcard at the end. Your custom-built radix tree then only needs to stop matching when it reaches a wildcard node. The trie can support both exact string matching and wildcard matching. If you want to allow the wildcard in the middle of the domain name, performance could become a problem.
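
As a rough illustration of the reversed-label idea, here is a minimal sketch. It is an assumption-laden simplification: it uses a plain per-label trie rather than a compressed radix trie, only handles a leading "*." wildcard, and the names reverseLabels, labelTrie, Insert and Match are made up for this example rather than taken from any implementation mentioned in this answer.

```go
package main

import (
	"fmt"
	"strings"
)

// reverseLabels reverses the dot-separated labels of a domain name,
// so the wildcard of a "*.foo.example.com" signature ends up last.
func reverseLabels(domain string) string {
	labels := strings.Split(domain, ".")
	for i, j := 0, len(labels)-1; i < j; i, j = i+1, j-1 {
		labels[i], labels[j] = labels[j], labels[i]
	}
	return strings.Join(labels, ".")
}

// node is one label in the trie (hypothetical type, uncompressed for brevity).
type node struct {
	children map[string]*node
	exact    bool // a signature ends exactly at this node
	wildcard bool // a "*" signature covers every name below this node
}

type labelTrie struct{ root *node }

func newLabelTrie() *labelTrie {
	return &labelTrie{root: &node{children: map[string]*node{}}}
}

// Insert adds an exact name or a signature with a leading "*." wildcard.
func (t *labelTrie) Insert(signature string) {
	n := t.root
	for _, label := range strings.Split(reverseLabels(signature), ".") {
		if label == "*" {
			n.wildcard = true
			return
		}
		child, ok := n.children[label]
		if !ok {
			child = &node{children: map[string]*node{}}
			n.children[label] = child
		}
		n = child
	}
	n.exact = true
}

// Match reports whether host equals an exact signature or sits at least
// one label below a wildcard signature.
func (t *labelTrie) Match(host string) bool {
	n := t.root
	for _, label := range strings.Split(reverseLabels(host), ".") {
		// Stop as soon as a wildcard node is reached with a label left over.
		if n.wildcard {
			return true
		}
		child, ok := n.children[label]
		if !ok {
			return false
		}
		n = child
	}
	return n.exact
}

func main() {
	trie := newLabelTrie()
	trie.Insert("*.mydomain.com")
	trie.Insert("test12.mydomaindev.com")
	for _, h := range []string{"test.mydomain.com", "mydomain.com", "test12.mydomaindev.com", "test.com"} {
		fmt.Println(h, trie.Match(h))
	}
}
```

In this sketch a wildcard signature does not match the bare parent name (mydomain.com), only names with at least one extra label; whether that is the behaviour you want is a design choice.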

One of the biggest challenges we've had using tries to evaluate domain names is not the search time but the memory consumption, and with it how long the program takes to start when you have a lot of signatures.

We've evaluated a few implementations (at first mainly without wildcard support), testing load time, allocations, number of internal nodes, memory consumption, GC time and search/insert/remove time.

Implementations we've tested:

Obviously, a Go map gives the best performance, but when one needs to retrieve (whence the word trie) e.g. prefixed information from the dataset, Go maps don't give us the features we need.

We keep approximately 700,000 domain name signatures in our trie. Build time is 2 seconds, with 300 MB of memory, 5 million allocations and 2 seconds of GC, and searching costs 150 ns/op.

If we use a Go map for the same signatures (without wildcards), we get a load time of 0.5 seconds, 50 MB of memory, negligible allocations and 1.6 seconds of GC, and searching costs 25 ns/op.

In our initial implementation, build time was 6 seconds, with 1 GB of memory, 60 million allocations and 5 seconds of GC, and searching cost ~200 ns/op.

As you can see from these results, we managed to lower the memory consumption and load time, while the search cost remained approximately the same.

If you're going to do CIDR matching, I would recommend checking out https://github.com/kentik/patricia. To lower the GC time, it is implemented to avoid pointers.

Good luck with your work!

Upvotes: 5

Xavier Nicollet

Reputation: 353

If you want to support wildcards at several depths in your domains, e.g.:

  • *.foo.example.org
  • *.example.com

Then I would add a second container for the wildcards:

var wdomains = []string{".foo.example.org", ".example.com"}

Then just check if your domain to test ends with one of those entries:

func inWdomain(wdomains []string, domain string) bool {
    for _, suffix := range wdomains {
        if strings.HasSuffix(domain, suffix) {
            return true
        }
    }
    return false
}
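
To show how this could sit next to the exact-match map from the question, here is a minimal sketch. The helper name isAllowed is hypothetical, and lowercasing the host and trimming a trailing dot are assumptions about how you might want to normalise input, not something the question asked for.

```go
package main

import (
	"fmt"
	"strings"
)

// Exact hostnames (as in the question) and wildcard suffixes.
// The leading dot is kept so "evilexample.com" cannot match ".example.com".
var hostnameWhitelist = map[string]bool{"test.mydomain.com": true}
var wdomains = []string{".foo.example.org", ".example.com"}

// isAllowed (hypothetical helper) tries the exact lookup first,
// then falls back to scanning the wildcard suffixes.
func isAllowed(host string) bool {
	// Assumption: treat hostnames case-insensitively and ignore a trailing dot.
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	if hostnameWhitelist[host] {
		return true
	}
	for _, suffix := range wdomains {
		if strings.HasSuffix(host, suffix) {
			return true
		}
	}
	return false
}

func main() {
	for _, h := range []string{"test.mydomain.com", "a.example.com", "x.y.foo.example.org", "evilexample.com", "example.com"} {
		fmt.Println(h, isAllowed(h))
	}
}
```

Because each suffix starts with a dot, the bare domain (example.com) is not covered by the wildcard; add it to the exact map if it should pass.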

Note: if you have more than a few hundred domains, you could use a radix tree.

https://play.golang.org/p/-4n8mlGmpH

Upvotes: 1

John S Perayil

Reputation: 6345

You can store the keys of the map in the format *.domain.com.

Then convert all the hostnames you get into that format using strings.SplitAfterN and strings.Join.

split := strings.SplitAfterN(url.Hostname(),".",2)
split[0] = "*"
hostName := strings.Join(split,".")
...
hostnameWhitelist[hostName]
...

Play Link

Unrelated improvement

If you are using the map purely as a whitelist you can use map[string]struct{} instead of map[string]bool. But as Peter mentioned in his comment, it might be relevant only if you have a very large whitelist.

Upvotes: 2

David

Reputation: 947

Regex is the go-to solution for your problem; map[string]bool may not work as expected, because a map lookup compares against single exact values while you are trying to match a pattern.

package main

import (
    "fmt"
    "regexp"
)

func main() {
    if matched, _ := regexp.MatchString(".*\\.?mydomain.*", "mydomaindev.com"); matched {
        fmt.Println("Valid domain")
    }
}

This matches every domain containing mydomain, so www.mydomain.com and www.mydomaindev.com match, but test.com and hello.com fail.
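
Keep in mind that an unanchored pattern like the one above also matches hosts such as mydomain.evil.com or evilmydomain.com. If the check is meant as a whitelist, a stricter variant might look like the following minimal sketch; the exact pattern and the name allowedHost are assumptions for illustration, and it expects lowercase input.

```go
package main

import (
	"fmt"
	"regexp"
)

// allowedHost is anchored at both ends: one or more subdomain labels,
// then exactly mydomain.com or mydomaindev.com. (Illustrative pattern.)
var allowedHost = regexp.MustCompile(`^([a-z0-9-]+\.)+(mydomain|mydomaindev)\.com$`)

func main() {
	hosts := []string{
		"test.mydomain.com",
		"test12.mydomaindev.com",
		"test.com",
		"mydomain.evil.com",
		"evilmydomain.com",
	}
	for _, h := range hosts {
		fmt.Println(h, allowedHost.MatchString(h))
	}
}
```

Compiling the pattern once with regexp.MustCompile also avoids recompiling it on every request, which regexp.MatchString would do.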

Other handy string ops are:

// This matches a.mydomain.com, b.mydomain.com, etc.
if strings.HasSuffix(url1.Hostname(), ".mydomain.com") {
    fmt.Println("Valid domain allow access")
}

// This matches anything containing mydomain, such as mydomain.com or mydomaindev.com
if strings.Contains(url2.Hostname(), "mydomain") {
    fmt.Println("Valid domain allow access")
}

Upvotes: 3
