Joey Coder
Joey Coder

Reputation: 3489

Extract text between certain characters

I have the following link structure (example, link can't be joined):

https://zoom.us/j/345678634?pwd=fdgSDdfdfasgdgJEeXNaRjNBZz09

My goal is to extract two numbers in two different cells

First one: 345678634

I tried: (?<=/j/).(?=?pwd)

Second one: fdgSDdfdfasgdgJEeXNaRjNBZz09

I tried (besides others): (?<=?pwd).

What I thought about is for the second one just everything that's behind ?pwd= and for the first one everything that's between /j/ and ?pwd=. I just don't know how to get this done with regex.

Upvotes: 0

Views: 286

Answers (2)

user7571182
user7571182

Reputation:

You may try:

.*?\/j\/(\d+)\?pwd=(\w+)

Explanation of the above regex:

  • .*? - Matches lazily everything before j.
  • \/j\/ - Matches /j/ literally.
  • (\d+) - Represents first capturing group matching digits 1 or more times.
  • \? - Matches ? literally.
  • pwd= - Matches pwd= literally.
  • (\w+) - Represents second capturing group capturing the word characters i.e. [0-9a-zA-Z_] one or more times.

enter image description here

You can find the demo of the above regex in here.

Result

Upvotes: 2

JvdV
JvdV

Reputation: 75840

Unfortunately lookarounds are not supported (AFAIK) in RE2. But it seems like you could use:

=REGEXEXTRACT(A1,"(\d+).*=(.*)")

enter image description here

  • ( - Open 1st capture group.
    • \d+ - Match at least a single digit.
    • ) - Close 1st capture group.
  • .* - Match zero or more characters (greedy)
  • = - Match a literal =.
  • ( - Open 2nd capture group.
    • .* - Match any character other than newline zero or more times.
    • ) - Close 2nd capture group.

Because of the spill feature both groups will be extracted into neighboring cells.


A 2nd option, if you want to avoid REGEX, is using SPLIT and QUERY. However, depending on your data, I'm not sure which one would be faster in processing:

=QUERY(SPLIT(SUBSTITUTE(A1,"?pwd=","/"),"/"),"Select Col4,Col5")

Upvotes: 1

Related Questions