Elsa James

Reputation: 91

Is there any way to delete all characters before the GET word?

I am using Sublime Text on Windows and I have something like this:

race-position-18374327498.png:1 GET   https://google.com/example.png
race-position-453452.png:1 GET https://google.com/example1.png
race-position-343532.png:1 GET  https://google.com/example.png
race-position-4543646554764576574564.png: GET https://google.com/example22.png
race-position-5765865865843655.png: GET https://google.com/example434.png

I want to get rid of everything up to and including the GET word, so that the output looks like this:

https://google.com/example.png
https://google.com/example1.png
https://google.com/example.png
https://google.com/example22.png
https://google.com/example434.png

Is there any way to do this in any software? Help me. Thanks in advance.

Upvotes: 1

Views: 3131

Answers (3)

Angrej Kumar

Reputation: 958

In Sublime Text, you can also select the text " GET " and press ctrl+D repeatedly to add each further " GET " to the selection, then press the right arrow so your cursor is at the end of the GET word on every line, then press shift+home. This selects everything from the start of each line through GET, and then you can just delete it.

This won't be practical for a longer file, but it is handy for quickly cleaning up a few lines.

Upvotes: 0

Daniel Brose

Reputation: 1403

For the data set you gave, a simple find+replace regex works, and it does not depend on GET being present on the line.

You can get to this using 'Find' then 'Replace' in the top menu, or by hitting "ctrl+h" at any time.

  1. FIND .*\shttp

  2. REPLACE http

So it replaces any characters up to and including a (whitespace)http match with a plain http.

The whitespace is a sanity check, since the left-hand side might have "http" in it, but it is highly unlikely to have whitespace followed by http.
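For anyone who would rather script this than use the editor, here is a minimal sketch of the same find/replace using Python's `re` module. The names `sample` and `strip_prefix` are made up for illustration, and Python's regex engine is not the one Sublime Text uses, although this particular pattern behaves the same way in both as far as I know.

```python
import re

# A few lines from the question, used as sample input (assumption: one entry per line).
sample = """race-position-18374327498.png:1 GET   https://google.com/example.png
race-position-453452.png:1 GET https://google.com/example1.png
race-position-4543646554764576574564.png: GET https://google.com/example22.png"""


def strip_prefix(text):
    # ".*" is greedy but does not cross newlines by default, so on each line
    # it consumes everything up to the last "whitespace + http" and that
    # whole match is replaced with a plain "http".
    return re.sub(r".*\shttp", "http", text)


cleaned = strip_prefix(sample)
print(cleaned)
```

Because `.` does not match a newline unless asked to, each line is handled on its own, which mirrors what the editor does.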

EDIT

@Robert Mennell's comment made me realise I don't know whether the left-hand side can contain whitespace, so here is an improved regex that handles that.

To be clear, both versions work on the OP's dataset; the improved one is likely to cope better if the simpler regex doesn't quite work on the full actual dataset, now or in future :)

Feel free to use either though, I left the other one just above.

FIND ^(.*)\shttp([^\s]*)$

REPLACE http\2

In Regex:

  • . means any character
  • * means zero or more of the preceding item
  • \s is for whitespace
  • ( and ) define groups
  • \1, \2 etc. is how you call back to those groups
  • ^ by itself is line start
  • [ and ] define a character group (a set of allowed characters)
  • [^ starts a negative character group (so any character but these)
  • $ is line end

The line start and end anchors just ensure each row is treated separately, and the pattern handles whitespace on the left side by making sure the http has no whitespace AFTER it UNTIL the end of the line, using [^\s]*, meaning any number of non-whitespace characters.

Using \2 in the replacement puts all the text in that second ( ) group back in again.

So it handles http, https, and any characters after that too, and will only ever retain content in the last right-hand part of each line.
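As a rough illustration of how the anchored version behaves, here is a small Python sketch. `PATTERN` and `keep_url` are hypothetical names, and `re.MULTILINE` is my stand-in for the editor's per-line handling of ^ and $; the two test lines are invented, not from the OP's data.

```python
import re

# Same pattern as above; MULTILINE makes ^ and $ match at every line boundary.
PATTERN = re.compile(r"^(.*)\shttp([^\s]*)$", re.MULTILINE)


def keep_url(text):
    # Group 2 holds everything after "http" up to the end of the line,
    # so "http" + group 2 rebuilds the URL and drops the left-hand side.
    return PATTERN.sub(r"http\2", text)


# Invented line: whitespace and even the word "http" on the left-hand side.
messy = "my http log.png:1 GET https://google.com/example.png"
print(keep_url(messy))

# Invented line: text after the URL, so the pattern does not match at all
# and the line is left untouched rather than half-trimmed.
trailing = "file.png:1 GET https://google.com/example.png (cached)"
print(keep_url(trailing))
```

The second case is the trade-off of the stricter pattern: a line that does not end with the URL is skipped entirely, which may or may not be what you want.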

You can use even more elaborate versions to get the same outcome. However, in Sublime Text 3 at least, the find/replace tool already has default modifiers, so . won't match newline characters and it will find multiple matches, so it's a very simple operation :)

Here is a great cheatsheet for regex as implemented by sublime text: https://jdhao.github.io/2019/02/28/sublime_text_regex_cheat_sheet/

Upvotes: 2

Robert Mennell

Reputation: 2052

Actually, what you want is to delete everything before the http part of the URI. To do that, open the find and replace box, use the regex ^.*http, and replace it with http. That should remove them all.

^ beginning of line
.*
  . any character
  * repeated
http string of `http`

This will match any line that has http in it (meaning it's also compatible with https), together with all the characters before it on that line, and will replace them with http.
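Here is a quick Python sketch of that replacement, mostly to show one edge case. `strip_to_http` is a made-up name, `re.MULTILINE` is assumed so that ^ matches at each line start, and the second input line is invented for the demonstration.

```python
import re


def strip_to_http(text):
    # Greedy ".*" runs to the LAST "http" on each line before replacing.
    return re.sub(r"^.*http", "http", text, flags=re.MULTILINE)


# A line from the question: works as intended.
print(strip_to_http("race-position-453452.png:1 GET https://google.com/example1.png"))

# Invented line where the path itself contains "http": the greedy match
# eats into the URL, leaving only the tail after the last "http".
print(strip_to_http("a.png:1 GET https://google.com/http.png"))
```

On the OP's sample data this does not come up, but it is the sort of case where requiring whitespace right before http gives a safer match.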

Documentation on the NotePad++ website about regular expressions

Upvotes: 2
