Josh Friedlander

Reputation: 11657

Extract similar lines from multiple files in folder

I have a directory with about 30 Python files with a similar pattern, something like this:

import stuff

class BarFoo001(BarFooBase):

    info = self.info
    description = 'here's the stuff I want'
    IS_CRITICAL = true

    def method(sdf):
        etc...

I'd like to extract from each file just the lines with the class name and description (just text as a reference, I don't need a working Python file).

My first thought was to do this with shell tools. I ran cat *.py > all.py and then tried sed -i -e 's/BarFooBase\(.*\)IS_CRITICAL/\1/' all.py, but this seemed to have no effect. I also tried a regex in my IDE and finally in Python (re.sub('IS_CRITICAL[^>]+\nclass Bar', '', my_string)), but none of these gave me the results I wanted. What's wrong with my regex? Also, is there a simpler way to do this that I'm missing?

Output like this would be good enough:

class BarFoo001(BarFooBase):

info = self.info
description = 'here's the stuff I want'
IS_CRITICAL

Upvotes: 0

Views: 156

Answers (5)

Ed Morton

Reputation: 203655

$ grep -E '^[[:space:]]*(class|description)[[:space:]]' file
class BarFoo001(BarFooBase):
    description = 'here's the stuff I want'

$ awk 'sub(/^[[:space:]]*(class|description =)[[:space:]]+/,"")' file
BarFoo001(BarFooBase):
'here's the stuff I want'

Upvotes: 1

RavinderSingh13

Reputation: 133528

Could you please try the following? It should run in any version of awk, though I couldn't test it in all versions or on different operating systems.

awk '
{
  sub(/^ +/,"")
}
/class/{
  found=1
}
/IS_CRITICAL/ && found{
  sub(/ =.*/,"")
  print
  found=""
}
found
'  Input_file

Upvotes: 1

stack0114106

Reputation: 8711

Using a Perl one-liner:

 perl -0777 -ne ' while( /(\bclass\s*.+?IS_CRITICAL)/gs ) { print "$1\n" } ' 

with inputs:

$ cat josh.py
import stuff

class BarFoo001(BarFooBase):

    info = self.info
    description = 'here's the stuff I want'
    IS_CRITICAL = true

    def method(sdf):
        etc...
    def method2(fddf):
        print
$ perl -0777 -ne ' while( /(\bclass\s*.+?IS_CRITICAL)/gs ) { print "$1\n" } ' josh.py
class BarFoo001(BarFooBase):

    info = self.info
    description = 'here's the stuff I want'
    IS_CRITICAL
$

For searching multiple files, you can use

perl -0777 -ne ' while( /(\bclass\s*.+?IS_CRITICAL)/gs ) { print "$ARGV:$1\n" } ' *py
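
For anyone who would rather stay in Python, as the question attempted, the same idea (a non-greedy match where the dot is allowed to cross newlines) could be sketched roughly as below. This is only an illustration: the names BLOCK_RE, extract_blocks and extract_from_files are made up here, and it assumes every class body of interest contains IS_CRITICAL somewhere after the class line.

```python
import re

# Same idea as the Perl one-liner: non-greedy match from "class" up to the
# first following "IS_CRITICAL", with "." allowed to match newlines (re.DOTALL
# plays the role of Perl's /s flag).
BLOCK_RE = re.compile(r"\bclass\s*.+?IS_CRITICAL", re.DOTALL)


def extract_blocks(text):
    """Return every 'class ... IS_CRITICAL' span found in text (hypothetical helper)."""
    return BLOCK_RE.findall(text)


def extract_from_files(paths):
    """Yield (path, block) pairs, reading each file whole like perl -0777 does."""
    for path in paths:
        with open(path) as handle:
            for block in extract_blocks(handle.read()):
                yield path, block
```

Calling extract_from_files(glob.glob("*.py")) and printing each pair should give roughly what the multi-file Perl command prints. Without re.DOTALL the dot stops at the end of each line, so a pattern meant to span several lines never matches.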

Upvotes: 1

SLePort

Reputation: 15461

With sed, you can use an address range to output blocks of lines:

sed -n '/^[[:blank:]]*class[[:blank:]]/,/IS_CRITICAL/p' file.py

Edit:

Added [[:blank:]] before and after class to match only class definitions preceded by zero or more spaces or tabs.
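
To make the address-range behaviour concrete, here is a rough Python sketch of the same start-to-end logic. It is not how sed is implemented, just an illustration; the names START_RE, END_MARKER and lines_in_range are invented for this example, and [ \t] is used as a stand-in for [[:blank:]].

```python
import re

# Start at a line that begins a class definition (optional leading blanks,
# then "class" followed by a blank) and stop after a line containing
# "IS_CRITICAL", mirroring sed -n '/start/,/end/p'.
START_RE = re.compile(r"^[ \t]*class[ \t]")
END_MARKER = "IS_CRITICAL"


def lines_in_range(lines):
    """Yield lines from each start match through the next end match, inclusive."""
    inside = False
    for line in lines:
        if not inside and START_RE.match(line):
            inside = True
            yield line
            continue  # like sed, only test the end pattern on later lines
        if inside:
            yield line
            if END_MARKER in line:
                inside = False
```

Passing an open file object (or text.splitlines()) to lines_in_range and printing each yielded line should reproduce what the sed command prints for that file.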

Upvotes: 2

Tyl

Reputation: 5252

Try this and see if the results are what you want (GNU awk):

awk '/IS_CRITICAL/{sub(/IS_CRITICAL.*/,"IS_CRITICAL");print "class " $0}' RS="class " all.py

Upvotes: 1
