Cotton-Eye Joe

Reputation: 75

Python - Read file and count repeated elements

I have a text file with lists of numbers like this:
1
2 5 3
3 5
4
5
Each number is a node of a tree. When there is more than one number on a line, the first number has the following numbers linked to it.
1 doesn't have any numbers after it, so it doesn't have any numbers linked.
2 has 3 and 5 linked to it.
3 has 5 linked to it and is itself linked to 2.
4 doesn't have any numbers linked to it.
5 doesn't have any numbers linked to it, but it's linked to 3 and 2.
Since 2, 3 and 5 are linked together, they form a component. 1 and 4 are not linked and have no numbers linked to them, so they form a component each.
So, there are 3 components in total.
How would you determine the number of components? I've had a hard time with for loops and conditionals.

def components(self):
    elm = 0
    with open('file.txt','r') as f:
        for line in f:
            comp = list(line)
            for x in comp:
                if comp[x] != comp[x+1]:
                    elm += 1
                else:
                    pass
    print(elm)

I tried the code above, but when I run it I get the following message when the function executes:

 components missing 1 required positional argument: 'self'

It may be worth mentioning that I'm working with classes and I'm fairly new to these things.

Upvotes: 1

Views: 153

Answers (1)

Joe Iddon

Reputation: 20414

You were going about this the right way using for-loops, but you seem to have gotten a little confused about what you loop through! If I understand correctly what you want to achieve, I think I have written code that works properly.

With a text file named file.txt with contents:

1
253
35
4
5

the following code will create a list of components and then print out how many components there are at the end:

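components = []
with open("file.txt", "r") as f:
    for line in f:
        # Turn the line into a list of ints, one per digit character.
        line = [int(i) for i in line.strip()]
        # Assume the line starts a new component until a link is found.
        newComponent = True
        for comp in components:
            # Stop searching once this line has been attached to a component.
            if not newComponent:
                break
            for ele in line:
                if ele in comp:
                    # Shared number found: add the whole line to this component.
                    comp += line
                    newComponent = False
                    break
        # Remove duplicate numbers inside each component.
        components = [list(set(c)) for c in components]
        if newComponent:
            components.append(line)

print(len(components))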

which outputs what you wanted:

3

The code starts by opening the text file as f. Then we begin our first loop, which goes through each line in the file. We convert this line into a list of ints using a list-comprehension on line.strip() (.strip() just removes the new-line char from the end).
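As a small illustration of that conversion (the sample string is made up for the demo, and it assumes every node is a single digit):

```python
# Hypothetical raw line as it would come out of the file, newline included.
raw_line = "253\n"

# .strip() drops the trailing newline; iterating the string yields one char per digit.
digits = [int(ch) for ch in raw_line.strip()]

print(digits)  # [2, 5, 3]
```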

We then define a bool, newComponent, which is initialised as True because we assume that this line will have no links.

Next, we loop through each component in our list components. The first thing we do here is just quickly check if we have already previously found a component that this line is linked to. If we have, we just break out of this loop.

Otherwise, if we're not already linked, we go through each element in our line and check if that element is in the component we are currently looping on. If it is, we concatenate (with the +=) our line onto that component, set our bool flag newComponent to False (as we have a link) and break out of this loop as we have found a link.

After this, the line: components = [list(set(c)) for c in components] simply goes through the components and removes duplicates from each one. So for instance, if 3 is linked to 2 and we have just added 3 and 5 to that component, there would now be two 3s in that component, which is a duplicate. This line just removes those duplicates. Strictly, this line is not necessary as we would still get the same result, but I thought it would neaten things up if you wanted to use the components later.
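To see the de-duplication step on its own, here is a tiny sketch with a made-up component list:

```python
# A component after a line was concatenated onto it; 3 and 5 now appear twice.
comp = [2, 5, 3, 3, 5]

# set() keeps one copy of each value.
# Sets are unordered, so sorted() is used here only to make the printout predictable.
deduped = sorted(set(comp))

print(deduped)  # [2, 3, 5]
```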

Finally, if no links were found (newComponent is still True), we just append this whole line (as its numbers are linked to each other) to the components list.

And that's it! We print() the length with len() at the end and you get your result.
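One case the loop above does not cover is a line that links two components which already exist: it attaches the line to the first matching component and stops, so the two stay separate. If your data can contain such lines, a variant along these lines might help. It is only a sketch: the function name count_components is made up, it takes the lines as a list instead of reading file.txt, and it assumes the numbers are separated by spaces as in the question.

```python
def count_components(lines):
    """Count groups of linked nodes; `lines` is any iterable of text lines."""
    components = []  # list of sets, one set per component
    for line in lines:
        nodes = {int(tok) for tok in line.split()}
        if not nodes:
            continue  # skip blank lines
        untouched = []
        for comp in components:
            if comp & nodes:
                nodes |= comp  # absorb every component this line touches
            else:
                untouched.append(comp)
        components = untouched + [nodes]
    return len(components)


print(count_components(["1", "2 5 3", "3 5", "4", "5"]))  # 3
print(count_components(["1 2", "3 4", "2 3"]))            # 1
```

The second call is the bridging case: "2 3" joins the groups started by "1 2" and "3 4", so everything ends up in one component.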

Hope this is of use to you!
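On the error message in the question: that message typically appears when a method defined with self is called on the class itself rather than on an instance of it. I can't see your class, so the name Tree below is made up, but a minimal sketch of the difference would be:

```python
class Tree:  # hypothetical class name; the question does not show the real one
    def components(self):
        return "called with an instance"


tree = Tree()
print(tree.components())  # works: Python passes `tree` as `self`

message = None
try:
    Tree.components()  # no instance, so nothing is passed for `self`
except TypeError as err:
    message = str(err)

print(message)
```

So creating an instance first and calling the method on it should make that particular error go away.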

Update

If the contents of file.txt are numbers with multiple digits, you can separate them with a space:

11
2 45
45 67
8
91

then all we have to do is add a .split(' ') call after line.strip() in the list-comprehension:

components = []
with open("file.txt", "r") as f:
    for line in f:
        line = [int(i) for i in line.strip().split(' ')]
        ...

What this does is take the string of the line and instead of looping through each char in the string, we make a list from splitting the string at each space and iterate through that. To demonstrate this:

"123 456 789".split(" ")

gives:

["123", "456", "789"]

Upvotes: 1
