delete lines in file not matching the pattern

Question

I am trying to migrate data that consists of many separate text files. One step is to delete all lines in the text files that are no longer used. The lines are key-value pairs. I want to delete everything in a file except the lines with certain keys. I do not know the order of the keys inside the file.

The keys I want to keep are e.g. version, date and number.

I found the question "Remove all lines except matching pattern line best practice (sed)" and tried the accepted answer. My sed command is

sed '/^(version=.*$)|(date=.*$)|(number=.*$)/!d' file.txt

with a !d after the address to delete all lines NOT matching the pattern.

Example of the regex: https://regex101.com/r/LKfxpP/2

but it keeps deleting all lines in my file. Where is my mistake? I assume my regex is wrong, but what is the error here?

Wiktor Stribiżew · Accepted Answer

You may use

sed '/^\(version\|date\|number\)=/!d' file.txt > newfile.txt

The POSIX BRE pattern here (which relies on the GNU \| alternation extension) matches

^ - start of a line
\(version\|date\|number\) - a group matching
- version - a version string
- \| - or
- date - a date string
- \| - or
- number - a number string
= - a = char.
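
As a quick check of the breakdown above, here is a minimal sketch that runs the BRE form on made-up sample data. The keys name and comment, and all the values, are hypothetical; only version, date and number come from the question. It assumes GNU sed, since other implementations may not accept \| in a BRE.

```shell
# Hypothetical input: only the keys version, date and number come from the question.
input=$(printf '%s\n' 'name=foo' 'version=1.2' 'comment=old' 'date=2020-01-01' 'number=42')

# BRE: group with \( \) and alternate with \| (assumes GNU sed).
# !d deletes every line that does NOT match the address.
bre_result=$(printf '%s\n' "$input" | sed '/^\(version\|date\|number\)=/!d')

printf '%s\n' "$bre_result"
```

Only the version, date and number lines should remain, in their original order.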

Or, use a POSIX ERE syntax enabled with -E option:

sed -E '/^(version|date|number)=/!d' file.txt > newfile.txt

Here, the alternation operator | and capturing parentheses do not need escaping.
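
This is also why the command in the question deletes everything: without -E, sed reads the pattern as a BRE, where unescaped ( ) and | are literal characters, so no ordinary key=value line matches. A minimal sketch of the difference, using made-up sample data (the keys name, comment and versions are hypothetical):

```shell
# Hypothetical input; 'name', 'comment' and 'versions' are made-up keys.
input=$(printf '%s\n' 'name=foo' 'version=1.2' 'comment=old' 'date=2020-01-01' 'versions=9' 'number=42')

# Without -E the pattern is a BRE: ( ) and | are literal characters,
# so nothing matches and !d deletes every line.
broken_result=$(printf '%s\n' "$input" | sed '/^(version=.*$)|(date=.*$)|(number=.*$)/!d')

# With -E the same metacharacters group and alternate.
ere_result=$(printf '%s\n' "$input" | sed -E '/^(version|date|number)=/!d')

printf 'without -E: [%s]\n' "$broken_result"
printf 'with -E:\n%s\n' "$ere_result"
```

The trailing = in the pattern keeps the match exact, so a key such as versions is not kept by accident.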

See an online demo.
