user7885981

How can I get a specific part of a URL using RegEx?


I am trying to extract part of a file download URL using RegEx (or another method). The link I am trying to parse is pasted below; the part I want to select is the version number, 1.7.0.13.

  https://minecraft.azureedge.net/bin-linux/bedrock-server-1.7.0.13.zip

I have looked around and thought about trying Named Capture Groups, but I couldn't figure out how to use them. I would like to be able to do this in JavaScript/Node.js, even if it requires a module 👻.

Upvotes: 1

Views: 99

Answers (4)

pergy

Reputation: 5521

You can use Node.js built-in modules to simplify the match: URL and path isolate the file name, and a simple regexp then picks out the version.

const { URL } = require('url')
const path = require('path')

const test = new URL(
  'https://minecraft.azureedge.net/bin-linux/bedrock-server-1.7.0.13.zip'
)
/*
  test.pathname = '/bin-linux/bedrock-server-1.7.0.13.zip'
  path.parse(test.pathname) = { root: '/',
    dir: '/bin-linux',
    base: 'bedrock-server-1.7.0.13.zip',
    ext: '.zip',
    name: 'bedrock-server-1.7.0.13' }
  match = [ '1.7.0.13', index: 15, input: 'bedrock-server-1.7.0.13' ]
*/
const match = path.parse(test.pathname)
  .name
  .match(/[0-9.]*$/)
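
One thing to be aware of: because * allows zero characters, /[0-9.]*$/ still matches (with an empty string) when the file name has no trailing version. A minimal sketch of the same idea wrapped in a function that returns null in that case might look like this. The helper name extractVersion and the example.com URL are hypothetical, and the stricter pattern is a substitution of mine rather than part of the answer above.

```javascript
const path = require('path')

// Hypothetical helper: returns the dotted version at the end of the file
// name in a URL, or null when the file name does not end in a version.
function extractVersion(urlString) {
  // URL is available as a global in current Node.js releases.
  const { pathname } = new URL(urlString)
  // URL paths always use forward slashes, so use the POSIX flavour of path.
  const { name } = path.posix.parse(pathname) // file name without its last extension
  // One or more digit groups separated by single dots, anchored to the end.
  const match = name.match(/\d+(?:\.\d+)*$/)
  return match ? match[0] : null
}

console.log(extractVersion('https://minecraft.azureedge.net/bin-linux/bedrock-server-1.7.0.13.zip'))
// prints: 1.7.0.13
console.log(extractVersion('https://example.com/files/readme.txt'))
// prints: null
```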

Upvotes: 1

Dacre Denny

Reputation: 30360

Perhaps a regular expression like this is what you need?

var url = 'https://minecraft.azureedge.net/bin-linux9.9.9/bedrock-server-1.7.0.13.zip'

var match = url.match(/(\d+[.\d+]*)(?=\.\w+$)/gi)

console.log( match )

The pattern /(\d+[.\d+]*)(?=\.\w+$)/gi looks for a substring that:

  1. first contains one or more digit characters, i.e. \d+
  2. is immediately followed by zero or more digits, dots, or plus signs, i.e. [.\d+]* (the square brackets form a character class, so the + inside them is a literal plus sign, not a quantifier)
  3. and finally is immediately followed by a file extension like .zip at the end of the string, which the lookahead (?=\.\w+$) requires without including it in the match

For more information on special characters like + and *, see this documentation. Hope that helps!
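
To see what the lookahead contributes, here is a small sketch that runs the pattern with and without it against the same URL. The variable names are only illustrative. Note that with the g flag, match() returns the full matches and drops the capture groups.

```javascript
const url = 'https://minecraft.azureedge.net/bin-linux9.9.9/bedrock-server-1.7.0.13.zip'

// With the lookahead, only the run that sits right before the extension matches.
const withLookahead = url.match(/(\d+[.\d+]*)(?=\.\w+$)/g)
console.log(withLookahead) // [ '1.7.0.13' ]

// Without it, the digits in the directory name match too, and the version
// picks up the dot that belongs to the extension.
const withoutLookahead = url.match(/\d+[.\d+]*/g)
console.log(withoutLookahead) // [ '9.9.9', '1.7.0.13.' ]
```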

Upvotes: 0

Roberto Maldonado

Reputation: 1595

I'd stick with this:

-(\d+(?:\.\d+)*)(?:\.\w+)$
  • The leading - matches the dash right before the numbers
  • The parentheses make a capture group
  • Inside it, \d+ matches one or more digits
  • ?: makes a group but does not capture it
  • Inside this group, \.\d+ matches a dot followed by one or more digits
  • Thanks to *, that group repeats zero or more times
  • After that, (?:\.\w+)$ makes a non-capturing group that matches the extension at the end of the string

So, basically, this format would allow you to capture all the numbers that are after the dash and before the extension, be it 1, 1.7, 1.7.0, 1.7.0.13, 1.7.0.13.5 etc. On the match array, at index [0] you will have the entire regex match, and on [1] you will have your captured group, the number you're looking for.
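
As a rough illustration of those two indexes, the sketch below applies the pattern to the URL from the question. Since the question mentions named capture groups, it also shows a variant with a group named version (the name is my own choice); named groups are an ES2018 feature, so this assumes a reasonably recent Node.js.

```javascript
const url = 'https://minecraft.azureedge.net/bin-linux/bedrock-server-1.7.0.13.zip'

// Numbered capture group: [0] is the whole match, [1] is the version.
const match = url.match(/-(\d+(?:\.\d+)*)(?:\.\w+)$/)
if (match) {
  console.log(match[0]) // -1.7.0.13.zip
  console.log(match[1]) // 1.7.0.13
}

// Same pattern with a named capture group, read through .groups.
const named = url.match(/-(?<version>\d+(?:\.\d+)*)(?:\.\w+)$/)
const version = named ? named.groups.version : null
console.log(version) // 1.7.0.13
```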

Upvotes: 0

revo

Reputation: 48711

You could use the regex below:

[\d.]+(?=\.\w+$)

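This matches a run of dots and digits that is immediately followed by a file extension at the end of the string. You could also make it stricter, so that it only accepts groups of digits separated by single dots: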

\d+(?:\.\d+)*(?=\.\w+$)
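
The difference between the two only shows up on unusual input. As a sketch, the file name notes..txt below is a made-up example with no version in it at all; the variable names are illustrative.

```javascript
const loose = /[\d.]+(?=\.\w+$)/
const strict = /\d+(?:\.\d+)*(?=\.\w+$)/

const url = 'https://minecraft.azureedge.net/bin-linux/bedrock-server-1.7.0.13.zip'
const odd = 'notes..txt' // hypothetical file name without any version

// Both patterns agree on the URL from the question.
const looseVersion = url.match(loose)[0]
const strictVersion = url.match(strict)[0]
console.log(looseVersion, strictVersion) // 1.7.0.13 1.7.0.13

// The loose pattern accepts a lone dot; the strict one requires digits.
const looseOdd = odd.match(loose)
const strictOdd = odd.match(strict)
console.log(looseOdd && looseOdd[0]) // .
console.log(strictOdd) // null
```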

Upvotes: 0
