Daniel

Reputation: 11

Edit URL, keep domain part, strip others

I've got a file full of URLs, one per line. I just want to keep the protocol and domain part of each URL.

Example:

https://example0.com/example.php?id=example0
https://example1.com/example.php?id=example1
https://example2.com/example.php?id=example2

Should be formatted to:

https://example0.com/
https://example1.com/
https://example2.com/

I'm using a Linux terminal, so Bash would probably be best. I've heard of sed, but I don't know how to use it or how to write its expressions.

Upvotes: 0

Views: 36

Answers (3)

Lars Fischer

Reputation: 10149

You could use cut like this:

cut -d/ -f1-3 yourfile

It uses / as the delimiter and selects fields 1 to 3 (the empty string between the two slashes of // being field 2).

And if you really need the trailing slash, you could pipe the output to sed, which appends a / to the end of each line, by adding this to the command:

| sed 's+$+/+'
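Put together, the two steps might look like the sketch below. The temporary file and its two sample URLs are assumptions standing in for your own file.

```shell
# Hypothetical sample input standing in for "yourfile".
urls_file=$(mktemp)
printf '%s\n' \
  'https://example0.com/example.php?id=example0' \
  'https://example1.com/example.php?id=example1' > "$urls_file"

# Keep fields 1-3 (protocol, empty field, domain), then append a slash
# to the end of every line.
result=$(cut -d/ -f1-3 "$urls_file" | sed 's+$+/+')
printf '%s\n' "$result"

rm -f "$urls_file"
```

To save the result instead of printing it, redirect the pipeline to a new file rather than back onto the input file.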

Upvotes: 0

Cyrus

Reputation: 88581

With GNU sed:

sed -r 's|([^/]*//[^/]*/).*|\1|' file

Output:

    https://example0.com/
    https://example1.com/
    https://example2.com/

If you want to edit your file "in place" use sed's option -i.
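As a rough sketch of the in-place variant, assuming GNU sed: giving -i a suffix such as .bak keeps a copy of the original file. The temporary file and sample URLs here are placeholders, not your data.

```shell
# Hypothetical sample input standing in for "file".
urls_file=$(mktemp)
printf '%s\n' \
  'https://example0.com/example.php?id=example0' \
  'https://example1.com/example.php?id=example1' > "$urls_file"

# Edit the file in place; the untouched original is kept as "$urls_file.bak".
sed -i.bak -r 's|([^/]*//[^/]*/).*|\1|' "$urls_file"

edited=$(cat "$urls_file")
backup_first_line=$(head -n 1 "$urls_file.bak")
printf '%s\n' "$edited"

rm -f "$urls_file" "$urls_file.bak"
```

Note that the expression only matches when a / follows the domain, so a line such as https://example0.com without any path is left unchanged.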


See: The Stack Overflow Regular Expressions FAQ

Upvotes: 1

Maslo

Reputation: 282

Try the following regular expression, which matches the protocol and the domain:

https?:\/\/[^\/]+

https://regex101.com/r/8MdA6I/1
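One possible way to apply a pattern like this from the terminal is grep with -o (print only the matched part) and -E (extended regular expressions). This is a sketch: the slashes are left unescaped because / is not special to grep, and the temporary file with its sample URLs is an assumption.

```shell
# Hypothetical sample input, one URL per line.
urls_file=$(mktemp)
printf '%s\n' \
  'https://example0.com/example.php?id=example0' \
  'http://example1.com/example.php?id=example1' > "$urls_file"

# Print only the part of each line that matches protocol + domain.
matches=$(grep -oE 'https?://[^/]+' "$urls_file")
printf '%s\n' "$matches"

rm -f "$urls_file"
```

The match stops before the first / after the domain, so the output has no trailing slash.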

Upvotes: 0
