user3062260

Reputation: 1644

Getting a Python object to initiate an attribute based on a supplied argument

I am very new to OOP, so this question may look very amateur to experienced OOP people. I have a number of text files, up to 250M lines long, and I am planning to generate reports based on the values in the columns of these files. The files look as follows:

chr1    54071   5   0   8   0
chr1    54072   5   0   9   0
chr1    54073   5   0   9   0
chr1    54074   5   0   9   0
chr1    54075   5   0   9   0
chr1    54076   5   0   9   0
chr1    54077   5   0   9   0
chr1    54078   5   0   9   0
chr1    54079   5   0   10  0
chr1    54080   5   0   10  0
chr1    54081   5   0   10  0
chr1    54082   5   0   10  0
chr1    54083   5   0   10  0
chr1    54084   5   0   10  0
chr1    54085   5   0   11  0
chr1    54086   5   0   11  0
chr1    54087   5   0   11  0
chr1    54088   5   0   11  0
chr1    54089   5   0   12  0

Here col1 is a chromosome, col2 is a position in the chromosome (from 1 to 250M), and the remaining columns are samples, each holding the value for that sample at the given position.

The function is supplied with two arguments: one is the file containing data such as in the example above, and the other is a list of samples, such as ["AE","BE","HE","C"], in the order in which they appear in the data file.

The report should give a summary for each sample, and for each combination of samples, of the positions where the value in the sample column is larger than a given value, say 2. The report looks like:

Sample  BasesCovered    FractionOfTotal
AE  43954   0.43954
BE  18728   0.18728
HE  33780   0.3378
C   8108    0.08108
AE:BE   17576   0.17576
AE:HE   28818   0.28818
AE:C    7268    0.07268
BE:HE   13694   0.13694
BE:C    4349    0.04349
HE:C    4827    0.04827
AE:BE:HE    12873   0.12873
AE:BE:C 4263    0.04263
AE:HE:C 4634    0.04634
BE:HE:C 2831    0.02831
AE:BE:HE:C  2750    0.0275
TotalSize   100000  1.00
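
As a rough illustration of the counting rule described above, here is a minimal sketch. It assumes that a combination such as AE:BE counts a position only when every sample in the combination is above the threshold, and that each line is whitespace-separated with the sample columns starting at the third field. The function name `count_covered` and the `threshold` parameter are made up for this example.

```python
from itertools import combinations

def count_covered(lines, samples, threshold=2):
    """Count, per sample combination, the positions where all members exceed threshold."""
    # Every combination of column indices, from single samples up to all samples.
    combos = [combo
              for size in range(1, len(samples) + 1)
              for combo in combinations(range(len(samples)), size)]
    # Report keys such as "AE" or "AE:BE", in the same order as the combinations.
    names = [":".join(samples[i] for i in combo) for combo in combos]
    counts = {name: 0 for name in names}
    total = 0
    for line in lines:
        fields = line.split()
        # Assumption: fields 0 and 1 are chromosome and position, the rest are samples.
        depths = [int(value) for value in fields[2:2 + len(samples)]]
        total += 1
        for name, combo in zip(names, combos):
            if all(depths[i] > threshold for i in combo):
                counts[name] += 1
    return counts, total
```

Because `lines` can be any iterable, an open file object or a generator can be passed in directly, so the whole file never has to be held in memory.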

I have achieved this using a generator and functional programming, but I would like to learn OOP, so I am trying to implement it by making a 'report' object that gets updated with each yield of the generator. My functional code to initiate the report looks like this:

from itertools import combinations

def initiate_overlap_dict(SAMPLE_LIST):
    # Takes a list or a string and converts it into a dict of all combinations, initiates the value of the dict as integer 0
    if len(SAMPLE_LIST)==1 and type(SAMPLE_LIST)==list:
        return {SAMPLE_LIST[0]: 0}
    elif len(SAMPLE_LIST)==0 and type(SAMPLE_LIST)==list:
        raise Exception('"SAMPLE_LIST" needs to contain samples!')
    elif type(SAMPLE_LIST) != list:
        raise Exception('"SAMPLE_LIST" must be a list of length >=1 in the same order as they appear in the depth_file')
    else:
        sample_list=[str(x) for x in SAMPLE_LIST]
        out={}

        for s in sample_list:
            out[s]=0
        for c in range(2,len(sample_list)+1):
            for s in combinations(sample_list,c):
                out[':'.join(s)]=0
        return out

I simply call this at the start of the program and then update it with each yield of the generator. I'd like to do something similar with OOP and have tried the following:

from itertools import combinations

class CoverageReport(object):

    # CONSTRUCTOR
    def __init__(self, samples):
        self.samples = samples    # list of samples
        self.coverage = self.initiate_overlap_dict(self)
    # REPRESENTATION METHOD: WHAT WILL BE PRINTED BY DEFAULT IF THE OBJECT IS CALLED
    def __repr__(self):
        return '<The following samples are examined for coverage: ' + self.samples +'>'

    def initiate_overlap_dict(self):
        # Takes a list or a string and converts it into a dict of all combinations, initiates the value of the dict as integer 0
        if len(self.samples)==1 and type(self.samples)==list:
            return {self.samples[0]: 0}
        elif len(self.samples)==0 and type(self.samples)==list:
            raise Exception('"SAMPLE_LIST" needs to contain samples!')
        elif type(self.samples) != list:
            raise Exception('"SAMPLE_LIST" must be a list of length >=1 in the same order as they appear in the depth_file')
        else:
            sample_list=[str(x) for x in self.samples]
            out={}

            for s in sample_list:
                out[s]=0
            for c in range(2,len(sample_list)+1):
                for s in combinations(sample_list,c):
                    out[':'.join(s)]=0
            return out


report=CoverageReport(["AE","BE","HE","C"]) 

Basically, I'm trying to get the object to initiate itself with a value of 0 for each item and each combination of items in the list, so that I can then write an update method that updates it on each iteration of the generator. It's throwing the following error:

TypeError: initiate_overlap_dict() takes 1 positional argument but 2 were given

I figure this has something to do with trying to initiate self.coverage in __init__ without giving an argument when creating the object. Is there a way to do this using the list (self.samples)? That should be all it needs to initiate an empty, unpopulated report.

Is there a way to do this? I'm sure someone with even basic OOP skills can answer this fairly easily. I'm just a bit stumped about what exactly to search for. Many thanks.

Upvotes: 0

Views: 24

Answers (1)

Hagai

Reputation: 703

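You don't need to call self.initiate_overlap_dict(self); you only need to call self.initiate_overlap_dict(), i.e. remove the self argument.
When you call a method on an instance, Python passes that instance as the first argument (self) automatically. Passing it again explicitly supplies two arguments to a method that accepts only one, which is exactly what the TypeError reports.
You can read more here: https://pythontips.com/2013/08/07/the-self-variable-in-python-explained/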
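
For reference, here is a sketch of the class from the question with that one call changed. Two other changes are my own and not part of the original code: the validation is moved into `__init__` and condensed, and `__repr__` joins the sample names with `', '.join(...)`, because concatenating a string and a list with `+` would raise a TypeError as soon as the object is printed.

```python
from itertools import combinations

class CoverageReport:

    def __init__(self, samples):
        # Condensed version of the checks in the question (my own restructuring).
        if not isinstance(samples, list) or not samples:
            raise ValueError('"samples" must be a non-empty list, in the same '
                             'order as the samples appear in the depth file')
        self.samples = [str(s) for s in samples]
        # No explicit self here: Python passes the instance automatically.
        self.coverage = self.initiate_overlap_dict()

    def __repr__(self):
        # Join the names first; str + list cannot be concatenated directly.
        return ('<The following samples are examined for coverage: '
                + ', '.join(self.samples) + '>')

    def initiate_overlap_dict(self):
        # One zeroed entry per sample and per combination of samples.
        out = {}
        for size in range(1, len(self.samples) + 1):
            for combo in combinations(self.samples, size):
                out[':'.join(combo)] = 0
        return out


report = CoverageReport(["AE", "BE", "HE", "C"])
```

With four samples, `report.coverage` holds 15 keys, from `AE` through `AE:BE:HE:C`, each starting at 0, so an update method can simply increment the matching entries on each iteration of the generator.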

Upvotes: 1
