SImon Haddad

Reputation: 842

Yup validation of website using url() very strict

I am trying to validate an input field as a website using

yup.string().url()

But it reports an error when the protocol is omitted. The field should be flexible enough to accept, for example, stackoverflow.com.

Upvotes: 32

Views: 71142

Answers (9)

NavidM

Reputation: 1795

Another solution is simply to use a Yup transform. You can do something like:

Yup.string()
  .url()
  .transform((currentValue) => {
    const doesNotStartWithHttp =
      currentValue &&
      !(
        currentValue.startsWith('http://') ||
        currentValue.startsWith('https://')
      );

    if (doesNotStartWithHttp) {
      return `http://${currentValue}`;
    }
    return currentValue;
  })

This automatically prepends http:// to the value, so it passes the validation.
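To see what the transform does on its own, here is a minimal sketch of the same prefixing logic as a standalone function. The name ensureProtocol is made up for illustration and is not part of Yup; the idea is that the same function could be passed to .transform().

```javascript
// Hypothetical helper (not part of Yup): same logic as the transform above.
function ensureProtocol(value) {
  // Leave empty or undefined values alone so other rules can still report them.
  if (!value) {
    return value;
  }
  const hasProtocol =
    value.startsWith('http://') || value.startsWith('https://');
  return hasProtocol ? value : `http://${value}`;
}

console.log(ensureProtocol('stackoverflow.com')); // http://stackoverflow.com
console.log(ensureProtocol('https://stackoverflow.com')); // https://stackoverflow.com
```

Note that the stored value changes too: after casting, the field holds the prefixed URL, not what the user typed.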

Upvotes: 6

Marlom

Reputation: 688

The simplest and most efficient approach is to reuse the regex from Yup.

You can create a new method or override the url method.

Here is an example in TypeScript that overrides url and adds an option to require the protocol as well.

import * as Yup from 'yup';

type UrlOptions = {
  /** @default {false} */
  forceWithProtocol?: boolean;
};

/**
 * Regex taken from the Yup source code, with the protocol made optional
 * @link {https://github.com/jquense/yup/blob/v1.0.0-beta.8/src/string.ts#L20}
 */
const URL =
  // eslint-disable-next-line no-useless-escape -- no warnings as it's copy/paste.
  /^(((https?):)?\/\/)?(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*@)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|\/|\?)*)?$/i;

// Override the original URL validation
Yup.addMethod(Yup.string, 'url', function (message: string, options: UrlOptions = {}) {
  const validation = this.matches(URL, message);

  if (!options.forceWithProtocol) {
    return validation;
  }

  return validation.matches(/^(https?:\/\/)/, message);
});

declare module 'yup' {
  interface StringSchema {
    /** @override */
    url(message: string, options?: UrlOptions): this;
  }
}

Upvotes: 0

Waifu_Forever

Reputation: 107

I'm editing @Animesh Singh's regex to add more validations.

First, think about why you are validating this: will you store it somewhere? What are you trying to avoid? Depending on your answer, you might add some custom validations to the regex.

I looked into it, and there are actual limits on the length of a URL and on the number of dots in the domain. A domain name can also contain hyphens.

https://news.gandi.net/en/2020/08/should-i-put-a-dash-in-my-domain-name/#:~:text=The%20hyphen%2C%20commonly%20known%20as,with%20no%20space%20between%20them.

What is the maximum length of a URL in different browsers?

Quoting Wikipedia:

The hierarchy of domains descends from the right to the left label in the name; each label to the left specifies a subdivision, or subdomain of the domain to the right. For example: the label example specifies a node example.com as a subdomain of the com domain, and www is a label to create www.example.com, a subdomain of example.com. Each label may contain from 1 to 63 octets. The empty label is reserved for the root node and when fully qualified is expressed as the empty label terminated by a dot. The full domain name may not exceed a total length of 253 ASCII characters in its textual representation.[9] Thus, when using a single character per label, the limit is 127 levels: 127 characters plus 126 dots have a total length of 253. In practice, some domain registries may have shorter limits.

So I will set the length limit to 2048 characters and restrict the domain to at most 126 single dots, each followed by up to 63 characters. I don't see a way to check the 253-character domain limit without splitting the URL. I am not checking for hyphens here either.

So I am limiting how many (\.[a-zA-Z]{1,63}) groups the URL can have by switching + to {1,5}:

const regex = /^(?=.{4,2048}$)((http|https):\/\/)?(www.)?(?!.*(http|https|www.))[a-zA-Z0-9_-]{1,63}(\.[a-zA-Z]{1,63}){1,5}(\/)?.([\w\?[a-zA-Z-_%\/@?]+)*([^\/\w\?[a-zA-Z0-9_-]+=\w+(&[a-zA-Z0-9_]+=\w+)*)?$/;

Yup
  .string()
  .matches(regex, "Website should be a valid URL")

In my use case I don't want to accept the bare domain, so I always require something after the slash by switching (\/)? to (\/){1}:

const regex = /^(?=.{4,2048}$)((http|https):\/\/)?(www.)?(?!.*(http|https|www.))[a-zA-Z0-9_-]{1,63}(\.[a-zA-Z]{1,63}){1,5}(\/){1}.([\w\?[a-zA-Z-_%\/@?]+)*([^\/\w\?[a-zA-Z0-9_-]+=\w+(&[a-zA-Z0-9_]+=\w+)*)?$/;

Yup
  .string()
  .matches(regex, "Website should be a valid URL")

I ask everyone to test this regex and let me know if there is any unexpected behavior.
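As for the 253-character limit, splitting does seem to be the straightforward route. Below is a rough sketch that pulls the host out of the input and checks the two limits from the Wikipedia quote (63 characters per label, 253 for the whole name). Both function names are hypothetical, and the host extraction is deliberately naive: it assumes no user:pass@ part and no IPv6 literal.

```javascript
// Hypothetical helper: pull the host out of a URL-like string.
// Assumes no userinfo ("user:pass@") and no IPv6 literal in the input.
function extractHost(input) {
  return input
    .replace(/^https?:\/\//i, '') // drop an optional protocol
    .split(/[\/?#]/)[0] // cut at the first path, query or fragment
    .replace(/:\d*$/, '') // drop an optional port
    .replace(/\.$/, ''); // drop the trailing dot of a fully qualified name
}

// Hypothetical helper: true when every label is 1 to 63 characters long
// and the whole host is at most 253 characters long.
function isHostWithinDnsLimits(input) {
  const host = extractHost(input);
  if (host.length === 0 || host.length > 253) {
    return false;
  }
  return host
    .split('.')
    .every((label) => label.length >= 1 && label.length <= 63);
}

console.log(isHostWithinDnsLimits('https://www.example.com:8080/path?x=1')); // true
console.log(isHostWithinDnsLimits('a'.repeat(64) + '.com')); // false
```

A check like this could probably be chained after .matches() through Yup's .test(), in the same way the URL-constructor answer uses .test().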

Upvotes: 1

const isValidUrl = (url) => {
    try {
        new URL(url);
    } catch (e) {
        return false;
    }
    return true;
};

const FormSchema = Yup.object({
    url: Yup.string().test('is-url-valid', 'URL is not valid', (value) => isValidUrl(value)),
});

https://dev.to/calvinpak/simple-url-validation-with-javascript-4oj5
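One caveat: new URL('stackoverflow.com') throws, because a bare host has no scheme, so on its own this check is as strict as url() for the case in the question. A possible workaround, sketched below, is to prepend a scheme before parsing. The name isValidWebsite and the choice of https:// as the default are assumptions made for illustration.

```javascript
// Hypothetical helper: like isValidUrl above, but tolerates a missing protocol.
function isValidWebsite(value) {
  if (typeof value !== 'string' || value.trim() === '') {
    return false;
  }
  // Assumption: a value without http:// or https:// is meant as https.
  const candidate = /^https?:\/\//i.test(value) ? value : `https://${value}`;
  try {
    new URL(candidate);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(isValidWebsite('stackoverflow.com')); // true
console.log(isValidWebsite('not a url')); // false
```

It can be dropped into the same .test('is-url-valid', ...) call in place of isValidUrl. Keep in mind that the URL parser is lenient about hosts, so a single-label host such as www would also pass.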

Upvotes: 4

Animesh Singh

Reputation: 9282

Adding some more validations to @trash_dev's regex.

You could try it at https://regex101.com/r/V5Y7rn/1/

const regMatch = /^((http|https):\/\/)?(www.)?(?!.*(http|https|www.))[a-zA-Z0-9_-]+(\.[a-zA-Z]+)+(\/)?.([\w\?[a-zA-Z-_%\/@?]+)*([^\/\w\?[a-zA-Z0-9_-]+=\w+(&[a-zA-Z0-9_]+=\w+)*)?$/;

Yup
 .string()
 .matches(regMatch, "Website should be a valid URL")

It also handles additional URL forms such as:

www.test-my-skills.gov.cz/0999asd-xzc88?0-_/sad%20123/@asdas
asdasd.com/asdasd/asdasd/asdasd/@asasd
https://www.somehow.com/@aasd
https://www.test.facebook.com/@sdas
http://www.computer.com.au/

Upvotes: 6

onlit

Reputation: 778

All the answers treat www.mywebsite as valid, which should not be the case.

const re = /^((ftp|http|https):\/\/)?(www.)?(?!.*(ftp|http|https|www.))[a-zA-Z0-9_-]+(\.[a-zA-Z]+)+((\/)[\w#]+)*(\/\w+\?[a-zA-Z0-9_]+=\w+(&[a-zA-Z0-9_]+=\w+)*)?$/gm

Yup.string().matches(re,'URL is not valid')

matches:

  • vercel.com
  • www.vercel.com
  • uptime-monitor-fe.vercel.app
  • https://uptime-monitor-fe.vercel.app/

Upvotes: 18

Justin.Mathew

Reputation: 481

A lot of URLs break on the accepted answer. Something closer to Yup's url() that still allows http, www. and // to be omitted would be:

const URL = /^((https?|ftp):\/\/)?(www.)?(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*@)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|\/|\?)*)?$/i

Yup.string().matches(URL, 'Enter a valid url')

Upvotes: 19

Cmte Cardeal

Reputation: 173

Just completing @aturan23's answer: you can add a - inside [a-z0-9] and [a-zA-Z0-9#], like this:

((https?):\/\/)?(www.)?[a-z0-9-]+(\.[a-z]{2,}){1,3}(#?\/?[a-zA-Z0-9#-]+)*\/?(\?[a-zA-Z0-9-_]+=[a-zA-Z0-9-%]+&?)?$

This lets you validate URLs like these:

  • material-ui.com

  • https://github.com/mui-org/material-ui

  • http://github.com/mui-org/material-ui

  • github.com/mui-org/material-ui/core#teste

Upvotes: 6

aturan23

Reputation: 5400

Instead of using the default url validator, you can use your own regex. Your code changes to something like:

website: Yup.string()
  .matches(
    /((https?):\/\/)?(www.)?[a-z0-9]+(\.[a-z]{2,}){1,3}(#?\/?[a-zA-Z0-9#]+)*\/?(\?[a-zA-Z0-9-_]+=[a-zA-Z0-9-%]+&?)?$/,
    'Enter correct url!'
  )
  .required('Please enter website'),

You can write your own regex rule to validate the URL. You can read more about it there.

Play around with it here: https://regex101.com/r/O47zyn/4

Upvotes: 53
