user1910584

Reputation: 75

Python, printing a list isn't giving me the output I want

So let's say I got a list of 200 numeric values in List A. I want to make a list B that splits list A in clusters of 4, so I would get 50 clusters. In list B I want to make a list for every cluster of 4 values, so it would contain 50 lists in list B.

I'll explain my problem using my source:

    from pprint import pprint

    FileValuelist = []

    def DetermineClusterNumber(File):               #determine digits in a cluster
            Lines = open(File, "r")
            i = 0 # used for iterating through the lines
            FirstLine = Lines.readline()
            for char in FirstLine:                  # read through first line, till hyphen.
                    if char != "-":
                            i += 1
                    elif char == "-":
                            return i # Return number of digits in the cluster 

    def ReadLines(File, Cluster_Number):
            Lines = open( File, "r" )
            for Line in Lines:
                    for char in Line:
                            if char != "-":
                                            FileValuelist.append(char)

    def RemoveNewlines(Rawlist):
            for x in range(len(FileValuelist)-9):
                    if FileValuelist[x] == "\n":
                            FileValuelist.remove(FileValuelist[x])
                    if FileValuelist[x] == "\r":
                            FileValuelist.remove(FileValuelist[x])


    Cluster_Number = DetermineClusterNumber("Serials.txt") # Amount of chars in a cluster. Example: 1234-2344-2345. clusternumber = 4
    ReadLines ("Serials.txt", Cluster_Number)
    RemoveNewlines(FileValuelist)

    list_iterater = 0

    FinishedList = ([[None]*(Cluster_Number)])*((len(FileValuelist)))
    amount_of_clusters = len(FileValuelist)/Cluster_Number

    for x in range(0, amount_of_clusters):
            for y in range(0, Cluster_Number):
                    FinishedList[x][y] = FileValuelist[list_iterater]
                    list_iterater += 1

    pprint(FinishedList)

With serials.txt containing:

    4758-8345-1970-4486-2348
    2346-1233-3463-7856-4572
    6546-6874-1389-9842-4185
    9896-4688-4689-6455-4712
    9541-5621-8414-7465-5741
    4545-9959-5632-6845-1351
    5643-2435-5854-6754-8749
    7892-3457-8923-4572-5397
    5623-5698-5468-5476-9874
    8762-3487-6123-7861-2679

When I run this, I expect it to print the contents of serials.txt as one list containing 50 sublists, one per cluster. Instead, it prints [2,6,7,9] fifty times, which is the last cluster. So I guess the problem is located somewhere around line 39. I already checked which value was assigned to FinishedList at line 41, and it was the right value every time (so not 2,6,7,9, as in the printed list). I also rechecked the x and y iteraters (yes, I do know it's spelled iterator) and they are correct too.

So what is wrong in my code that makes it print the last cluster fifty times? I'm using Python 2.7 by the way, if you couldn't tell.

Thanks in advance!

Upvotes: 2

Views: 156

Answers (5)

pemistahl

Reputation: 9584

Why do it in such a complicated way? You can accomplish what you want with two lines of code:

>>> with open('serials.txt') as data: 
...    clusters = [[int(digit) for digit in cluster] for line in data for cluster in line.strip().split('-')]

Then clusters contains:

[
    [4, 7, 5, 8],
    [8, 3, 4, 5],
    [1, 9, 7, 0],
    # ...
]

Upvotes: 0

l4mpi

Reputation: 5149

Dude, no offense but your code is horribly unpythonic - look for a few tutorials on code style and lists. This whole problem (if I understand it correctly) can be solved with a few simple lines of code.

As far as I understand, you want to turn each four-digit value in the file into a list of its digits and store these digits in another list, meaning for the input

"1234-5678-9999"

the result should be

[[1,2,3,4], [5,6,7,8], [9,9,9,9]]

This can be achieved as easily as this:

with open("serials.txt") as f:
    clusters = [c for line in f for c in line.strip().split("-")]
    digits = [list(c) for c in clusters]

digits now contains a list of characters for each cluster. If you need the values as integers, you could change list(c) to a nested list comprehension like [int(x) for x in c].
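For illustration, here is a minimal sketch of the integer variant. To keep it self-contained it reads from a hard-coded list of two sample lines instead of opening serials.txt; that list (named `lines` here) is an assumption standing in for the file, which is expected to have the same dash-separated layout.

```python
# Two sample lines standing in for serials.txt (assumption: the real file
# uses the same dash-separated layout, one serial per line).
lines = [
    "4758-8345-1970-4486-2348\n",
    "2346-1233-3463-7856-4572\n",
]

# Split every line on "-" to get the four-character clusters as strings.
clusters = [c for line in lines for c in line.strip().split("-")]

# Turn each cluster string into its own list of integer digits.
digits = [[int(x) for x in c] for c in clusters]

print(digits[0])    # [4, 7, 5, 8]
print(len(digits))  # 10
```

Iterating over an open file object yields lines in the same way, so swapping `lines` for the file handle should give the full 50 clusters.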

Upvotes: 0

Silas Ray

Reputation: 26160

The way you initialize FinishedList, you end up with a list full of references to the same sublist. When you then assign to [x][y], you overwrite values in that one shared sublist every time, so only the last cluster survives. You don't need to preallocate lists in Python, so start with an empty list and use append() instead of the nested loop.

FinishedList = []
for x in range(amount_of_clusters):
    offset = x * Cluster_Number
    FinishedList.append(FileValuelist[offset:offset + Cluster_Number])
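As a rough, self-contained sketch of the same append-and-slice idea, the snippet below works on a short hard-coded list rather than the data read from the file. The names `flat`, `cluster_size` and `finished` are made up for this example and do not appear in the question.

```python
# Hypothetical stand-in for the flat list of characters read from the file.
flat = list("475883451970")
cluster_size = 4

finished = []  # start empty; no preallocation needed
# Step through the flat list one cluster at a time and append a fresh slice.
for start in range(0, len(flat), cluster_size):
    finished.append(flat[start:start + cluster_size])

print(finished[0])    # ['4', '7', '5', '8']
print(len(finished))  # 3
```

Each slice creates a new list object, so the sublists cannot alias one another.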

Upvotes: 0

voithos

Reputation: 70552

This line isn't doing what you think it's doing:

FinishedList = ([[None]*(Cluster_Number)])*((len(FileValuelist)))

It's storing the reference to the same [None, None, None... None] list, len(FileValuelist) times (the * operator basically performs a shallow copy). If you want to ensure that it creates new lists, the easiest way is to use a list comprehension.

FinishedList = [[None] * Cluster_Number for _ in xrange(len(FileValuelist))]
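The difference is easy to see on a small example. The sketch below uses made-up names (`shared`, `independent`) and `range` rather than `xrange`, so it should run unchanged on both Python 2.7 and Python 3.

```python
# Outer multiplication repeats a reference to ONE inner list three times.
shared = [[None] * 4] * 3
shared[0][0] = "x"
print(shared[2])               # ['x', None, None, None]
print(shared[0] is shared[2])  # True: same object

# The list comprehension evaluates [None] * 4 anew for every row.
independent = [[None] * 4 for _ in range(3)]
independent[0][0] = "x"
print(independent[2])                    # [None, None, None, None]
print(independent[0] is independent[2])  # False: separate objects
```

The inner `[None] * 4` is safe in both cases because `None` is immutable; the aliasing only matters when the repeated element is itself a mutable object such as a list.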

Upvotes: 1

mbatchkarov

Reputation: 16049

The second multiplication on the line FinishedList = ([[None]*(Cluster_Number)])*((len(FileValuelist))) does not actually create len(FileValuelist) new lists, just that many references to the original list. When you change any one of them, they all change. I asked the same question a while back; see the accepted answer there.

Upvotes: 0
