Reputation: 24721
I have a file with contents like this:
- 2 equal files of size 288903252
- 2 equal files of size 284164096
"C:\E\100p disk util bak\Softwares\OSs\gparted-live-0.26.1-1-i686.iso"
"H:\Softwares\Linux\gparted-live-0.26.1-1-i686.iso"
- 2 equal files of size 277436598
- 2 equal files of size 161356649
"H:\Softwares\Dev Tools\Eclipse\Windows\eclipse-java-luna-SR1a-win32-x86_64.zip"
- 35 equal files of size 97078976
"C:\Windows\System32\DriverStore\FileRepository\nvacwu.inf_amd64_9934c34dc6ca0c4b\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvamwu.inf_amd64_d4715679184092a8\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvaowu.inf_amd64_785608ed2524cdea\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvblwu.inf_amd64_31f54e2d1ba058d5\NvCplSetupInt.exe"
I want to delete the lines of the form - X equal files of size N
that are not followed by any actual file paths. In the example above, that means the first and third bullet points should be removed, leaving:
- 2 equal files of size 284164096
"C:\E\100p disk util bak\Softwares\OSs\gparted-live-0.26.1-1-i686.iso"
"H:\Softwares\Linux\gparted-live-0.26.1-1-i686.iso"
- 2 equal files of size 161356649
"H:\Softwares\Dev Tools\Eclipse\Windows\eclipse-java-luna-SR1a-win32-x86_64.zip"
- 35 equal files of size 97078976
"C:\Windows\System32\DriverStore\FileRepository\nvacwu.inf_amd64_9934c34dc6ca0c4b\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvamwu.inf_amd64_d4715679184092a8\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvaowu.inf_amd64_785608ed2524cdea\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvblwu.inf_amd64_31f54e2d1ba058d5\NvCplSetupInt.exe"
I formed a regex that matches these lines:
(^-.*\n)-
(it can be checked in action at the link above). I want to delete the first group, which is essentially the whole line, but I can't work out how to do that with grep or sed. Can this be done in a single command?
Upvotes: 0
Views: 208
Reputation: 11216
Using sed
sed '/^-/{N;/\n-/D}' file
- 2 equal files of size 284164096
"C:\E\100p disk util bak\Softwares\OSs\gparted-live-0.26.1-1-i686.iso"
"H:\Softwares\Linux\gparted-live-0.26.1-1-i686.iso"
- 2 equal files of size 161356649
"H:\Softwares\Dev Tools\Eclipse\Windows\eclipse-java-luna-SR1a-win32-x86_64.zip"
- 35 equal files of size 97078976
"C:\Windows\System32\DriverStore\FileRepository\nvacwu.inf_amd64_9934c34dc6ca0c4b\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvamwu.inf_amd64_d4715679184092a8\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvaowu.inf_amd64_785608ed2524cdea\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvblwu.inf_amd64_31f54e2d1ba058d5\NvCplSetupInt.exe"
Portable version for any version of sed
sed -e '/^-/{N' -e '/\
-/D' -e '}' file
If you also want to remove the last line when it is a header starting with -
sed -e '/^-/{$d' -e 'N' -e '/\
-/D' -e '}' file
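To make the control flow of the first one-liner easier to follow, here is the same logic spelled out as a commented script. This is only a sketch: the function name drop_empty_headers and the sample lines are made up for illustration, and it assumes a sed that accepts # comments and \n in a regex (GNU sed does).

```shell
#!/bin/sh
# Hypothetical helper: drop "- ..." header lines that are
# directly followed by another header line.
drop_empty_headers() {
  sed '
    # Only act on header lines (those starting with "-").
    /^-/{
      # Append the next input line to the pattern space.
      N
      # If the appended line is also a header, delete only the first
      # line and restart the cycle on the rest without reading input.
      /\n-/D
    }
  ' "$@"
}

# Made-up sample data: the first and third headers have no paths.
printf '%s\n' '- 2 equal files of size 1' \
              '- 2 equal files of size 2' '"a"' '"b"' \
              '- 1 equal files of size 3' \
              '- 1 equal files of size 4' '"c"' | drop_empty_headers
```

On the sample data this should leave only the headers of size 2 and size 4 together with their paths. The D command is what makes runs of several empty headers work: the surviving second line is re-examined as a header in the next cycle.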
Upvotes: 2
Reputation: 203189
sed is for simple substitutions on individual lines, that is all. For anything else you should be using awk. If you are using sed constructs other than s, g, and p (with -n) then you are using constructs that became obsolete in the mid-1970s when awk was invented.
This will work robustly, efficiently, and portably with any awk on any UNIX box:
$ awk '!/^-/{print p $0; p=""; next} {p=$0 ORS}' file
- 2 equal files of size 284164096
"C:\E\100p disk util bak\Softwares\OSs\gparted-live-0.26.1-1-i686.iso"
"H:\Softwares\Linux\gparted-live-0.26.1-1-i686.iso"
- 2 equal files of size 161356649
"H:\Softwares\Dev Tools\Eclipse\Windows\eclipse-java-luna-SR1a-win32-x86_64.zip"
- 35 equal files of size 97078976
"C:\Windows\System32\DriverStore\FileRepository\nvacwu.inf_amd64_9934c34dc6ca0c4b\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvamwu.inf_amd64_d4715679184092a8\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvaowu.inf_amd64_785608ed2524cdea\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvblwu.inf_amd64_31f54e2d1ba058d5\NvCplSetupInt.exe"
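The idea behind the awk one-liner is to hold back the most recent header and print it only once a path line shows up. Here is a more verbose sketch of that idea; the function name keep_used_headers and the variable name pending are made up for illustration. It treats any line that does not start with - as a path line, so it does not depend on how the paths are indented.

```shell
#!/bin/sh
# Hypothetical helper: keep a header only if a non-header line follows it.
keep_used_headers() {
  awk '
    # Header line: remember it, overwriting any earlier unused header.
    /^-/ { pending = $0 ORS; next }
    # Any other line: flush the pending header (if any), then the line.
    { printf "%s", pending; print; pending = "" }
  ' "$@"
}

# Made-up sample data: the first header has no paths.
printf '%s\n' '- 2 equal files of size 1' \
              '- 2 equal files of size 2' '"a"' '"b"' | keep_used_headers
```

Because a header is only ever printed together with the first path line after it, a header at the very end of the file with no paths is dropped as well.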
Upvotes: 0
Reputation: 20467
Is perl okay?
cat input.txt | perl -pe 'BEGIN{undef $/;} s/^-.*\n(?=-)//mg'
The BEGIN block enables the multiline search by undefining the input record separator, so perl reads the whole file as a single record. The s/// part then deletes every line starting with - that is immediately followed by another line starting with - (no need for a capturing group).
I slightly modified your regex: the trailing - is now a lookahead, (?=-), so the following header is not consumed and can itself be tested and removed. The /s flag is deliberately left out, so that . never matches a newline; otherwise a match could run across the file paths of a header that should be kept.
Edit: here is a lengthy and informative Q/A about multiline search, which shows that this is difficult with sed.
Edit2: actually, it is quite easy with a modern sed; see @123's answer.
Upvotes: 0
Reputation: 25769
You can just grep it:
grep -v -B1 "^-" test_file.txt | grep -v '^--$'
- 2 equal files of size 284164096
"C:\E\100p disk util bak\Softwares\OSs\gparted-live-0.26.1-1-i686.iso"
"H:\Softwares\Linux\gparted-live-0.26.1-1-i686.iso"
- 2 equal files of size 161356649
"H:\Softwares\Dev Tools\Eclipse\Windows\eclipse-java-luna-SR1a-win32-x86_64.zip"
- 35 equal files of size 97078976
"C:\Windows\System32\DriverStore\FileRepository\nvacwu.inf_amd64_9934c34dc6ca0c4b\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvamwu.inf_amd64_d4715679184092a8\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvaowu.inf_amd64_785608ed2524cdea\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvblwu.inf_amd64_31f54e2d1ba058d5\NvCplSetupInt.exe"
How does it work? The first grep selects every line that does not start with a -, together with the one line before it (-B1), which is the header that belongs to it. The second grep just removes the group separator that grep prints between non-adjacent groups; some grep versions support --no-group-separator, so you can do it in one go.
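For reference, the one-go variant could look like the sketch below. It assumes a grep that supports --no-group-separator (GNU grep does; other implementations may not), and the function name paths_with_headers is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper; needs a grep with --no-group-separator (e.g. GNU grep).
paths_with_headers() {
  # -v '^-' selects the path lines, -B1 adds the line before each of them,
  # and --no-group-separator suppresses the "--" lines between groups.
  grep -v -B1 --no-group-separator '^-' "$@"
}

# Made-up sample data: the first and third headers have no paths.
printf '%s\n' '- 2 equal files of size 1' \
              '- 2 equal files of size 2' '"a"' \
              '- 1 equal files of size 3' \
              '- 1 equal files of size 4' '"b"' | paths_with_headers
```

This also avoids the risk of the second grep accidentally dropping a path that happens to contain two dashes.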
Upvotes: 1