kinemat1c

Reputation: 57

Assistance with Regex matching

I am a regex beginner and I have been practicing by going through a problem on this website. I am given the following text:

Fedora Core         ftp     
Fedora Extras   http    ftp     rsync
          ftp://ftp7.br.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp3.de.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp.is.FreeBSD.org/pub/FreeBSD/ (ftp / rsync)
          ftp://ftp4.jp.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp.no.FreeBSD.org/pub/FreeBSD/ (ftp / rsync)
        *
          ftp://ftp3.no.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp.pt.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp1.ro.FreeBSD.org/pub/FreeBSD/ (ftp / ftpv6)
          ftp://ftp3.es.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp2.tw.FreeBSD.org/pub/FreeBSD/ (ftp / ftpv6 / http / httpv6 / rsync / rsyncv6)
          ftp://ftp6.uk.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp6.us.FreeBSD.org/pub/FreeBSD/ (ftp / http)
sunsite.informatik.rwth-aachen.de       [ftp]   [http]  Rheinisch-Westfälische Technische Hochschule Aachen
lame.lut.fi         [http]  Computer Club Ruut (Finland)
    1 Gbits/sec     IPv4 and IPv6
FR  Fedora Mirror   ftp.proxad.net  
US  distro.ibiblio.org  jungle.metalab.unc.edu  
Fedora Core         ftp     
          ftp://ftp.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp11.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp14.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp.ar.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp3.au.FreeBSD.org/pub/FreeBSD/ (ftp)
    In case of problems, please contact the hostmaster <[email protected]> for this domain.
          ftp://ftp4.br.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp.hr.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp.cz.FreeBSD.org/pub/FreeBSD/ (ftp / http / rsync)
          ftp://ftp.il.FreeBSD.org/pub/FreeBSD/ (ftp / ftpv6)
          ftp://ftp7.jp.FreeBSD.org/pub/FreeBSD/ (ftp)
        *
          ftp://ftp7.ua.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp11.ua.FreeBSD.org/pub/FreeBSD/ (ftp)

I need to extract all the FTP addresses, that is, the URLs that start with ftp:// and end with /pub/FreeBSD/. I have been able to extract some of them with this regex:

ftp://ftp\d\d?.\w\w.FreeBSD.org/pub/FreeBSD/

But many are not matched, e.g. ftp://ftp14.FreeBSD.org/pub/FreeBSD/. No answers are provided, so please let me know what my expression is missing so I can improve. Thank you.

Upvotes: 2

Views: 98

Answers (3)

Casimir et Hippolyte

Reputation: 89547

It seems you are trying to extract all URLs with the domain "FreeBSD.org" followed by the path "/pub/FreeBSD/".

I suggest:

 \bftp://[A-Za-z0-9.]*\bFreeBSD\.org/pub/FreeBSD/

Note that the dot needs to be escaped outside a character class but not inside.
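As a rough illustration, here is a minimal Python sketch that applies the pattern above with `re.findall`. The `SAMPLE` text is a small subset of the lines from the question, and the variable names are assumptions made for this example only.

```python
import re

# A few lines copied from the question, including one non-FreeBSD line.
SAMPLE = """\
          ftp://ftp7.br.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp.is.FreeBSD.org/pub/FreeBSD/ (ftp / rsync)
          ftp://ftp14.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp.FreeBSD.org/pub/FreeBSD/ (ftp)
FR  Fedora Mirror   ftp.proxad.net
"""

# The dot is literal inside [...] but must be escaped as \. outside it.
PATTERN = re.compile(r"\bftp://[A-Za-z0-9.]*\bFreeBSD\.org/pub/FreeBSD/")

# No capturing groups, so findall returns the full matched URLs.
urls = PATTERN.findall(SAMPLE)
for url in urls:
    print(url)
```

The character class first consumes the whole host name, then backtracks until `\bFreeBSD\.org` can match, so hosts with or without a digit or country segment are handled the same way.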

Upvotes: 3

Emma

Reputation: 27723

This expression might be enough to extract the desired FTP URLs:

ftp://\S*/FreeBSD/

If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com, where you can also see how it matches against some sample inputs.
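To see what this looser pattern trades away, the following Python sketch compares it with a stricter pattern that pins the FreeBSD.org domain. The second input line uses a made-up host (`mirror.example.invalid`) that does not appear in the question; it is only there to show the difference.

```python
import re

# Loose: any non-whitespace run between "ftp://" and "/FreeBSD/".
LOOSE = re.compile(r"ftp://\S*/FreeBSD/")
# Strict: the host must end in FreeBSD.org and the path must be /pub/FreeBSD/.
STRICT = re.compile(r"\bftp://[A-Za-z0-9.]*\bFreeBSD\.org/pub/FreeBSD/")

lines = [
    "ftp://ftp14.FreeBSD.org/pub/FreeBSD/ (ftp)",
    # Hypothetical host, not from the question's data.
    "ftp://mirror.example.invalid/pub/FreeBSD/ (ftp)",
]

loose_hits = [m.group(0) for line in lines for m in LOOSE.finditer(line)]
strict_hits = [m.group(0) for line in lines for m in STRICT.finditer(line)]

print(loose_hits)
print(strict_hits)
```

For the text in the question both patterns should find the same URLs; the loose one is shorter, but it would also accept any other host that serves a /FreeBSD/ path.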


Upvotes: 2

Albert Nikolaevich

Reputation: 31

Look at this:

ftp://ftp\d{0,2}(\.\w{2})?\.FreeBSD\.org/pub/FreeBSD/

Think about what is constant and what changes in your FTP addresses. The beginning, ftp://ftp, is always the same. After it come zero to two digits, then optionally a dot and a two-letter segment (a country code?). At least one address has neither digits nor a country code, so both parts must be optional: \d{0,2} allows zero digits, and the ? after the group makes the country segment optional. The rest is always constant, i.e. .FreeBSD.org/pub/FreeBSD/, with each literal dot escaped as \. because an unescaped dot matches any character. Hope this helps.
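One way to spell out this constant-versus-variable breakdown is a commented pattern using Python's `re.VERBOSE` flag. This is a sketch of the idea, not necessarily the exact expression above: it escapes every literal dot and assumes the country segment is exactly two ASCII letters. The name `FTP_MIRROR` and the sample text are chosen for this example.

```python
import re

FTP_MIRROR = re.compile(
    r"""
    ftp://ftp            # constant prefix
    \d{0,2}              # zero to two digits after "ftp"
    (?:\.[A-Za-z]{2})?   # optional dot plus two-letter country code
    \.FreeBSD\.org       # constant domain, literal dots escaped
    /pub/FreeBSD/        # constant path
    """,
    re.VERBOSE,
)

# Lines taken from the question, plus one mirror that should not match.
text = """\
          ftp://ftp.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp14.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp7.br.FreeBSD.org/pub/FreeBSD/ (ftp)
          ftp://ftp11.ua.FreeBSD.org/pub/FreeBSD/ (ftp)
FR  Fedora Mirror   ftp.proxad.net
"""

# The only group is non-capturing, so findall returns whole matches.
matches = FTP_MIRROR.findall(text)
for url in matches:
    print(url)
```

Because the digits and the country segment are optional independently, the pattern covers all four host shapes in the sample: plain, digits only, country only, and digits plus country.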

Upvotes: 3
