user2410854

Reputation: 139

Using command line to remove lines from text file

I have a text file and need to remove all lines that DO NOT contain http. Alternatively, it could just output all the lines that DO contain http to a new file.

The name of my original file is list.txt and I need to generate a new file with a name like new.txt

I know that there are several ways to do this via the command line, but what I'm really looking for is the quickest way, since I need to do this with several files and each of them is a few gigabytes in size.

Upvotes: 0

Views: 2266

Answers (3)

ChuckCottrill

Reputation: 4444

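The quickest, shortest solution is a fixed-string grep:

grep -F "http" list.txt > new.txt

(fgrep is the older spelling of grep -F. Adding -v would invert the match and keep the lines without http, which is the opposite of what is asked.) Of course, grep -E, awk, perl, etc. make this more flexible.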
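Since the question mentions several multi-gigabyte files, a loop along these lines might help. It is only a sketch: the file names (list1.txt, list2.txt), the new_ prefix, and the sample contents are made up for the demo. LC_ALL=C is a commonly suggested way to speed up grep by forcing byte-wise matching, though how much it helps depends on the grep build and the data. Here grep -F 'http' without -v is used, so the lines containing http are the ones kept.

```shell
#!/bin/sh
# Work in a scratch directory so the demo cannot overwrite real files.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Demo data: small stand-ins for the real multi-gigabyte files (hypothetical names).
printf '%s\n' 'http://a.example' 'plain line' 'see https://b.example' > list1.txt
printf '%s\n' 'nothing here' 'http://c.example' > list2.txt

# Keep only the lines containing the literal string "http".
# -F treats the pattern as a fixed string; LC_ALL=C forces byte-wise matching.
for f in list1.txt list2.txt; do
    LC_ALL=C grep -F 'http' "$f" > "new_$f"
done

cat new_list1.txt new_list2.txt
```

Each input is read once and streamed straight to its output file, so memory use stays small regardless of file size.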

Here is a short shell script. Create a file "delhttp.sh" containing:

#!/bin/bash
# Keep only the lines that contain "http".
# Usage: delhttp.sh [infile [outfile]]
# With no arguments it filters stdin to stdout.
if [ $# -eq 0 ] ; then
    grep -F "http"
elif [ $# -eq 1 ] ; then
    f1=$1
    if [ ! -f "$f1" ]; then echo "file $f1 does not exist" >&2; exit 1; fi
    grep -F "http" "$f1"
elif [ $# -eq 2 ]; then
    f1=$1
    if [ ! -f "$f1" ]; then echo "file $f1 does not exist" >&2; exit 1; fi
    f2=$2
    grep -F "http" "$f1" > "$f2"
fi

Then make this file executable using,

chmod +x delhttp.sh

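Here is a Perl script, if you prefer. Create a file "delhttp.pl" containing:

#!/usr/bin/env perl
use strict;
use warnings;
# Input and output default to "-" (stdin/stdout) when not given.
my $f1=$ARGV[0]||"-";
my $f2=$ARGV[1]||"-";
my ($fh, $ofh);
# Two-argument open is deliberate: "<-" opens STDIN and ">-" opens STDOUT.
open($fh,"<$f1") or die "file $f1 failed: $!";
open($ofh,">$f2") or die "file $f2 failed: $!";
# Print only the lines that contain "http".
while(<$fh>) { print $ofh $_ if /http/; }
close($ofh) or die "closing $f2 failed: $!";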

Again, make this file executable using,

chmod +x delhttp.pl

Upvotes: 2

Vijay

Reputation: 67211

perl -i -lne 'print if(/http/)' your_file

The above command edits the file in place and deletes every line that does not contain http. If you want to keep a backup of the original file, give -i a suffix such as ".bak", as shown below:

perl -i.bak -lne 'print if(/http/)' your_file

This generates your_file.bak, which is a copy of the original file, while the original file is modified in place. You can also use awk:

awk '/http/' your_file

This prints to the console. You can use '>' to store the output in a new file.
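As a concrete illustration of the awk filter with a redirect, here is a small sketch. The sample contents and the output name new.txt are placeholders for the demo, not part of the original answer.

```shell
#!/bin/sh
# Work in a scratch directory so the demo cannot overwrite real files.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Three sample lines; only the first contains lower-case "http".
printf '%s\n' 'first http://a.example' 'no link' 'HTTP upper case' > your_file

# A bare /pattern/ in awk prints every matching line; > writes them to a new file.
awk '/http/' your_file > new.txt

cat new.txt
```

The match is case-sensitive, so the line with upper-case HTTP is dropped along with the line that has no link at all.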

Upvotes: 2

hwnd

Reputation: 70722

You could use grep, which prints the lines that match the pattern. (Using -v inverts the sense of matching to select non-matching lines, so leave it out here.)

grep 'http' list.txt > new.txt
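To see which way -v cuts before running it on a multi-gigabyte file, a quick check on a throwaway file can help. The file contents and the two output names below are invented for the demo.

```shell
#!/bin/sh
# Work in a scratch directory so the demo cannot overwrite real files.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

printf '%s\n' 'http://a.example' 'plain line' > list.txt

grep 'http' list.txt > with_http.txt        # lines that contain http
grep -v 'http' list.txt > without_http.txt  # lines that do not contain http

echo 'with http:';    cat with_http.txt
echo 'without http:'; cat without_http.txt
```

The first output file is the one the question asks for; the second holds the lines that would be discarded.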

Using a Perl one-liner:

perl -ne '/http/ and print' list.txt > new.txt

Upvotes: 1
