Reputation: 735
I'm implementing host blocking in my web extension. At this point I'm matching ["http://*/*", "https://*/*"] and then using a regex function to filter down the URLs. I'd like to narrow the match patterns themselves instead of running a regex against every single request. Is this possible?
Upvotes: 2
Views: 680
Reputation: 33296
webRequest.RequestFilter (MDN) URL specifiers are Match Patterns (MDN). Match Patterns do not have the capability to wildcard top-level domains (TLDs), and they should not have it: using a wildcard for the TLD is inherently insecure, so you should not be trying to do so. There is no way to guarantee that whatever company or site you are trying to cover will obtain every single version of the name under every single TLD.
If the company you want to cover has domains under multiple TLDs, you should determine the list of domains it owns in each TLD and specify those individually. Yes, it might be less text to specify it with a regular expression that provides a limited set of TLDs.
For example, if regular expressions were permitted, then https://example.com/, https://example.org/, and https://example.edu/ could have been something like /https?:\/\/(?:[^.\/]*\.)*example\.(?:com|edu|org)\//, but Match Patterns don't have regular expressions. The limited wildcard they do have, *, represents anything; if it were used in place of a TLD, it would match any TLD, or even a domain and subdomain. So in the end, you will need to list each domain, including the TLD, separately.
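Listing each domain separately can be sketched roughly as follows. This is only an illustration under assumptions: the domain list, the buildMatchPatterns helper, and the cancel-everything listener are hypothetical and not from the question, and blocking webRequest listeners additionally need the webRequest and webRequestBlocking permissions plus matching host permissions (they are not available to ordinary Chrome Manifest V3 extensions).

```javascript
// Hypothetical list of domains the site is known to own; replace with real ones.
const BLOCKED_DOMAINS = ["example.com", "example.org", "example.edu"];

// Build one match pattern per domain (hypothetical helper).
// "*://" covers http and https, and "*." in the host matches the bare
// domain as well as any of its subdomains.
function buildMatchPatterns(domains) {
  return domains.map((domain) => `*://*.${domain}/*`);
}

const urls = buildMatchPatterns(BLOCKED_DOMAINS);

// Only register inside an extension, where the `browser` (Firefox) or
// `chrome` namespace exists; the guard lets this file load elsewhere too.
const api =
  typeof browser !== "undefined" ? browser :
  typeof chrome !== "undefined" ? chrome :
  undefined;

if (api && api.webRequest) {
  api.webRequest.onBeforeRequest.addListener(
    () => ({ cancel: true }), // the filter has already narrowed the requests
    { urls },
    ["blocking"]
  );
}
```

With a filter like this the browser does the matching, so the listener only runs for requests to the listed domains and needs no regex of its own.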
As you've determined, your other alternative is to use <all_urls> (or similar) and then filter within your handler. Doing so is suboptimal. If you go that route, you should work at making the listener fast: it should not perform any extraneous operations, at least until it has determined whether the URL is one it will process further.
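If you do keep the broad filter, a fast listener might look something like the sketch below. It assumes the example regular expression from above (with a ^ anchor added), compiled once at load time; the names BLOCKED_URL, shouldBlock, and onBeforeRequest are hypothetical, and the same permission caveats for blocking listeners apply.

```javascript
// Compiled once when the script loads, not once per request.
const BLOCKED_URL = /^https?:\/\/(?:[^.\/]*\.)*example\.(?:com|edu|org)\//;

// Cheap check that runs first, before any other work.
function shouldBlock(url) {
  return BLOCKED_URL.test(url);
}

function onBeforeRequest(details) {
  if (!shouldBlock(details.url)) {
    return {}; // not one of ours: bail out immediately, request continues
  }
  // Any more expensive processing would go here, after the early exit.
  return { cancel: true };
}

// Only register inside an extension, where `browser` or `chrome` exists.
const extApi =
  typeof browser !== "undefined" ? browser :
  typeof chrome !== "undefined" ? chrome :
  undefined;

if (extApi && extApi.webRequest) {
  extApi.webRequest.onBeforeRequest.addListener(
    onBeforeRequest,
    { urls: ["http://*/*", "https://*/*"] }, // the broad filter from the question
    ["blocking"]
  );
}
```

The point of the early return is that the vast majority of requests leave the handler after a single regex test.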
Upvotes: 1