Ramanan T

Reputation: 383

extract server name from git clone URL

I have this script:

$ cat test.sh
echo "https://bitbucket.dev.global.server.com/scm/xyz/abd.git"
echo "ssh://[email protected]:6699/xyz/abc.git"
echo "http://bitbucket.dev.global.server.com/abc"
echo "ssh://[email protected]/xyz/abc"
echo "http://bitbucket.dev.global.server.com"
echo "ssh://[email protected]/xyz/abc.git"

I want a one-liner command (preferably a sed command) that will extract the server name from each URL, e.g.

bitbucket.dev.global.server.com

I tried this, but it doesn't work:

$ ./test.sh | sed 's/\(\/\/\|\@\)/&\n/;s/.*\n//;s/\(\:\|\/\)/\n&/;s/\n.*//'
bitbucket.dev.global.server.com
[email protected]
bitbucket.dev.global.server.com
[email protected]
bitbucket.dev.global.server.com
[email protected]

The output still contains the user and the @ symbol for the ssh URLs. How can I strip those as well?

Upvotes: 3

Views: 70

Answers (3)

Robert Long

Reputation: 6877

I want a one-liner command (preferably a sed command)

Since you requested a sed one-liner, you can try this:

./test.sh | sed -E 's|.*://([^/@:]+@)?([^/@:]+).*|\2|'

which outputs this:

bitbucket.dev.global.server.com
bitbucket.dev.global.server.com
bitbucket.dev.global.server.com
bitbucket.dev.global.server.com
bitbucket.dev.global.server.com
bitbucket.dev.global.server.com

Explanation:

./test.sh | sed -E '
  s| .*://        # Match all up to "://"
     ([^/@:]+@)?  # Optionally match a "user@" prefix, e.g. "git@"
     ([^/@:]+)    # The hostname
     .*           # Match all after the hostname and discard it
  |\2|'          # Replace full match with hostname, and then it's done
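The annotated form above is only for reading, since sed has no free-spacing mode for patterns. As a minimal sketch, the one-liner can be wrapped in a small function and tried on a few inputs. The function name `extract_host` and the `example.com` URLs are made up for illustration and are not from the question.

```shell
#!/bin/sh
# extract_host is a hypothetical helper name wrapping the sed one-liner.
extract_host() {
  # .*://          everything up to and including "://"
  # ([^/@:]+@)?    optional "user@" prefix
  # ([^/@:]+)      the host; stops at "/", "@" or ":" so a port is dropped
  printf '%s\n' "$1" | sed -E 's|.*://([^/@:]+@)?([^/@:]+).*|\2|'
}

# Hypothetical sample URLs (example.com stands in for a real server).
extract_host "https://git.example.com/scm/xyz/abc.git"     # → git.example.com
extract_host "ssh://git@git.example.com:6699/xyz/abc.git"  # → git.example.com
extract_host "http://git.example.com"                      # → git.example.com
```

Because the host class excludes ":", a port such as ":6699" falls into the trailing `.*` and is discarded along with the path.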

Upvotes: 2

Gilles Quénot

Reputation: 185600

Try this, using Perl with sed-like syntax:

bash test.sh | perl -pe 's|.*://(?:\w+@)?([\w.]+)/?.*|$1|'

The regular expression matches as follows:

Node      Explanation
.*        any character except \n (0 or more times (matching the most amount possible))
://       '://'
(?:       group, but do not capture (optional (matching the most amount possible)):
  \w+     word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
  @       '@'
)?        end of grouping
(         group and capture to \1:
  [\w.]+  any character of: word characters (a-z, A-Z, 0-9, _), '.' (1 or more times (matching the most amount possible))
)         end of \1
/?        '/' (optional (matching the most amount possible))
.*        any character except \n (0 or more times (matching the most amount possible))

or with GNU grep in PCRE mode:

bash test.sh | grep -oP '://(?:\w+@)?\K[\w.]+'

The regular expression matches as follows:

Node      Explanation
://       '://'
(?:       group, but do not capture (optional (matching the most amount possible)):
  \w+     word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
  @       '@'
)?        end of grouping
\K        resets the start of the match (what is Kept), a shorter alternative to a look-behind assertion
[\w.]+    any character of: word characters (a-z, A-Z, 0-9, _), '.' (1 or more times (matching the most amount possible))

No trailing look-ahead is needed: [\w.]+ already stops at the first character that is not a word character or '.', such as '/' or ':'.

yields:

bitbucket.dev.global.server.com
bitbucket.dev.global.server.com
bitbucket.dev.global.server.com
bitbucket.dev.global.server.com
bitbucket.dev.global.server.com
bitbucket.dev.global.server.com
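For reference, here is a small sketch of the `\K` approach as a function, assuming GNU grep built with PCRE support (`-P`). The name `grep_host` and the sample URLs are hypothetical. The character class is widened to `[\w.-]` so that hyphenated host names are kept whole; that is my adjustment, not part of the command above.

```shell
#!/bin/sh
# grep_host is a hypothetical helper name; requires GNU grep with -P.
grep_host() {
  # -o prints only the matched text. \K drops "://" and the optional
  # "user@" from the reported match, so only the host is printed.
  printf '%s\n' "$1" | grep -oP '://(?:\w+@)?\K[\w.-]+'
}

# Hypothetical sample URLs.
grep_host "http://git.example.com/abc"                     # → git.example.com
grep_host "ssh://git@my-git.example.com:6699/xyz/abc.git"  # → my-git.example.com
```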

Upvotes: 1

Lajos Arpad

Reputation: 76943

You can do this:

echo "https://bitbucket.dev.global.server.com/scm/xyz/abd.git" | sed 's/$/\//' | grep -Eo '/[^/]*/[^/]*/' | head -1 | sed 's|[/]||g'

Explanation:

  • we echo the string
  • we add a / at the end to have consistent format
  • we grep out the pattern of
    • /
    • anything different from /
    • /
    • anything different from /
    • /
  • we get the first match
  • we remove the / characters
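The steps above can be sketched as a function that is called once per URL. That matters because `head` keeps only the first match of the whole stream, so piping several URLs through the pipeline at once would return a single result. The names `authority` and `host_only` and the sample URLs are made up for illustration. The extra `sed` in `host_only` is an addition, not part of the original pipeline, to drop a leading "user@" and a trailing ":port".

```shell
#!/bin/sh
# authority: hypothetical name for the pipeline described above.
authority() {
  printf '%s\n' "$1" |
    sed 's/$/\//' |              # append "/" so a bare host still ends with one
    grep -Eo '/[^/]*/[^/]*/' |   # the first match is "//<authority>/"
    head -n 1 |                  # keep only that first match
    sed 's|/||g'                 # remove the "/" characters
}

# host_only: assumed extension that strips "user@" and ":port".
host_only() {
  authority "$1" | sed 's|.*@||; s|:.*||'
}

# Hypothetical sample URLs.
authority "http://git.example.com"                      # → git.example.com
authority "ssh://git@git.example.com:6699/xyz/abc.git"  # → git@git.example.com:6699
host_only "ssh://git@git.example.com:6699/xyz/abc.git"  # → git.example.com
```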

Upvotes: 0
