Josh Allen

Reputation: 997

Grep invert on string matched, not line matched

I'll keep this explanation of why I need help to a minimum. One of my directories was hacked through XSS, and the attacker placed a long string at the beginning of every PHP file. I've tried to use sed to replace the string with nothing, but that doesn't work because the pattern includes a great many characters that would need to be escaped.

I found out that I can use fgrep to match a fixed string saved in a pattern file. However, I want to remove only the matched string (NOT THE LINE) from each file, and grep's -v inverts the result per line rather than at the end of the matched string.

This is the command I'm using on an example file that contains the hacked string:

fgrep -v -f ~/hacked-string.txt example.php

I need the output to contain the <?php that follows the hacked string at the end of that line (sometimes it's a <style> tag), but the -v option drops the whole matching line, so the output is missing the <?php at the beginning.

NOTE

I've also tried adding -o (--only-matching), which outputs nothing:

fgrep -f ~/hacked-string.txt example.php --only-matching -v

Is there another option in grep that I can use to invert on the end of the matched pattern, rather than the line where the pattern was matched? Or alternatively, is there an easier option to replace the hacked string in all .php files?

Here is a small snippet of what's in hacked-string.txt (line breaks added for readability):

]55Ld]55#*<%x5c%x7825bG9}:}.}-}!#*<%x55c%x7825)
dfyfR%x5c%x7827tfs%x5c%x7c%x785c%x5c%x7825j:^<!
%x5c%x7825w%x5c%x7860%x5c%x785c^>Ew:25tww**WYsb
oepn)%x5c%x7825bss-%x5c%x7825r%x5c%x7878B%x5c%x
7825h>#]y3860msvd},;uqpuft%x5c%x7860msvd}+;!>!}
%x5c%x7827;!%x5c%x7825V%x5c%x7827{ftmfV%x5e56+9
9386c6f+9f5d816:+946:ce44#)zbssb!>!ssbnpe_GMFT%
x5c5c%x782f#00#W~!%x5c%x7825t2w)##Qtjw)#]82#-#!
#-%x5c%x7825tmw)%x5c%x78w6*%x5c%x787f_*#fubfsdX
k5%x5c%xf2!>!bssbz)%x5c%x7824]25%x5c%x7824-8257
-K)fujs%x5c%x7878X6<#o]o]Y%x5c%x78257;utpI#7>-1
-bubE{h%x5c%x7825)sutcvt)!gj!|!*bubEpqsut>j%x5c
%x7825!*72!%x5c%x7827!hmg%x5c%x78225>2q%x5c%x7

Thanks in advance!

Upvotes: 2

Views: 530

Answers (3)

glenn jackman

Reputation: 246847

With perl:

perl -i.hacked -pe "s/\Q$(<hacked-string.txt)\E//g" example.php

Notes:

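  • The $(<file) bit is a bash shortcut to read the contents of a file.
  • The \Q and \E bits are from perl; they treat the stuff in between as plain characters, ignoring regex metachars.
  • The -i.hacked option will edit the file in-place, creating a backup "example.php.hacked".
  • Because the shell pastes the string into the perl code, this assumes the string contains no /, $ or @ characters, which perl would still treat specially.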
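
For comparison, the same fixed-string removal can be sketched in Python, where bytes.replace needs no escaping at all. This is only a sketch under assumptions: the names load_needle, remove_fixed_string and clean_tree are made up for illustration, the .hacked backup suffix simply mirrors the -i.hacked option above, and each file is assumed to be small enough to read into memory.

```python
import shutil
from pathlib import Path


def load_needle(pattern_file):
    """Read the injected string as bytes, dropping trailing newlines
    (bash's $(<file) strips them too)."""
    return Path(pattern_file).read_bytes().rstrip(b"\n")


def remove_fixed_string(path, needle, backup_suffix=".hacked"):
    """Delete every occurrence of `needle` from the file at `path`.

    The original is copied to path + backup_suffix before it is rewritten.
    Returns the number of occurrences removed (0 leaves the file untouched).
    """
    if not needle:
        raise ValueError("needle must not be empty")
    path = Path(path)
    data = path.read_bytes()
    count = data.count(needle)
    if count:
        # Keep the original next to the cleaned file, e.g. example.php.hacked
        shutil.copy2(path, path.with_name(path.name + backup_suffix))
        path.write_bytes(data.replace(needle, b""))
    return count


def clean_tree(root, needle):
    """Clean every *.php file below `root`; return the paths that changed."""
    return [p for p in sorted(Path(root).rglob("*.php"))
            if remove_fixed_string(p, needle)]
```

Something like clean_tree("/path/to/site", load_needle("hacked-string.txt")) would then cover all PHP files in one pass; the directory path here is just a placeholder.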

Upvotes: 0

Zajo

Reputation: 372

Is the hacked string the same in every file?

If so, and the hacked string is, say, 1234 characters long, then you can use

tail -c +1235 file.php > fixed-file.php

for each infected file.

Note that tail -c +1235 tells tail to start its output at the 1235th byte of the input file (-c counts bytes, which is the same as characters for plain ASCII text).
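
A slightly more defensive variant of the same idea, sketched in Python: it only cuts the leading bytes when the file really starts with the injected string, so a file that was never infected is not truncated. strip_prefix is a hypothetical helper name, and the prefix is assumed to be known exactly, byte for byte.

```python
from pathlib import Path


def strip_prefix(src, dst, prefix):
    """Write `src` without its leading `prefix` bytes to `dst`.

    Returns True if the prefix was present and `dst` was written,
    False (and writes nothing) if `src` does not start with `prefix`.
    """
    data = Path(src).read_bytes()
    if not data.startswith(prefix):
        return False
    # Same effect as: tail -c +$((len(prefix) + 1)) src > dst
    Path(dst).write_bytes(data[len(prefix):])
    return True
```

Writing to a separate output file keeps the original around, as the tail command does.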

Upvotes: 0

rici

Reputation: 241771

I think what you are asking is this:

"Is it possible to use the grep utility to remove all instances of a fixed string (which might contain lots of regex metacharacters) from a file?"

In that case, the answer is "No".

What I think you wanted to ask was:

"What is the easiest way to remove all instances of a fixed string (which might contain lots of regex metacharacters) from a file?"

Here's one reasonably simple solution:

delete_string() {
  awk -v s="$1" '{while(i=index($0,s))$0=substr($0,1,i-1)substr($0,i+length(s))}1'
}

delete_string 'some_hideous_string_with*!"_inside' < original_file > new_file

The shell syntax is slightly fragile; it will break if the string contains an apostrophe ('). However, you can read a raw string from stdin into a variable with:

$ IFS= read -r the_string
absolutely anything here

which will work with any string which doesn't contain a newline or a NUL character. Once you have the string in a variable, you can use the above function:

delete_string "$the_string" < original_file > new_file

Here's another possible one-liner, using python:

delete_string() {
  python -c 'import sys;[sys.stdout.write(l.replace(r"""'"$1"'""","")) for l in sys.stdin]'
}

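This won't handle strings which contain three consecutive double quotes ("""), or which end with a double quote or a backslash, because the string is pasted into a raw triple-quoted python literal.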
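
One way around that quoting limitation, as a rough sketch: pass the string to Python as a command-line argument instead of pasting it into the source, so no quote characters in it can break the program. The file name delete_string.py and the function layout are assumptions for illustration, not part of the one-liner above.

```python
import sys


def delete_string(needle, lines, out):
    """Copy `lines` to `out`, removing every occurrence of `needle`."""
    for line in lines:
        out.write(line.replace(needle, ""))


# Intended use, assuming this is saved as delete_string.py:
#   python3 delete_string.py "$the_string" < original_file > new_file
if __name__ == "__main__" and len(sys.argv) == 2:
    delete_string(sys.argv[1], sys.stdin, sys.stdout)
```

Like the awk version, this works line by line, so it still assumes the string contains no newline.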

Upvotes: 2
