Reputation: 75
Let's say I have a list of 200 numeric values, list A. I want to build a list B that splits list A into clusters of 4, which gives 50 clusters. List B should hold one list per cluster of 4 values, so it would contain 50 lists.
I'll explain my problem using my source:
from pprint import pprint

FileValuelist = []

def DetermineClusterNumber(File): #determine digits in a cluster
    Lines = open(File, "r")
    i = 0 # used for iterating through the lines
    FirstLine = Lines.readline()
    for char in FirstLine: # read through first line, till hyphen.
        if char != "-":
            i += 1
        elif char == "-":
            return i # Return number of digits in the cluster

def ReadLines(File, Cluster_Number):
    Lines = open( File, "r" )
    for Line in Lines:
        for char in Line:
            if char != "-":
                FileValuelist.append(char)

def RemoveNewlines(Rawlist):
    for x in range(len(FileValuelist)-9):
        if FileValuelist[x] == "\n":
            FileValuelist.remove(FileValuelist[x])
        if FileValuelist[x] == "\r":
            FileValuelist.remove(FileValuelist[x])

Cluster_Number = DetermineClusterNumber("Serials.txt") # Amount of chars in a cluster. Example: 1234-2344-2345. clusternumber = 4
ReadLines ("Serials.txt", Cluster_Number)
RemoveNewlines(FileValuelist)

list_iterater = 0
FinishedList = ([[None]*(Cluster_Number)])*((len(FileValuelist)))
amount_of_clusters = len(FileValuelist)/Cluster_Number
for x in range(0, amount_of_clusters):
    for y in range(0, Cluster_Number):
        FinishedList[x][y] = FileValuelist[list_iterater]
        list_iterater += 1
pprint(FinishedList)
With serials.txt containing:
4758-8345-1970-4486-2348
2346-1233-3463-7856-4572
6546-6874-1389-9842-4185
9896-4688-4689-6455-4712
9541-5621-8414-7465-5741
4545-9959-5632-6845-1351
5643-2435-5854-6754-8749
7892-3457-8923-4572-5397
5623-5698-5468-5476-9874
8762-3487-6123-7861-2679
When I run this, I would expect it to print the contents of serials.txt as one list containing the 50 split lists. However, when I run it, it prints out [2,6,7,9] fifty times. That's the last cluster. So I guess the problem is located somewhere around line 39. I already checked which value was assigned to FinishedList at line 41, and it was the right value every time (so not 2,6,7,9, like when the list is printed out). I also rechecked the x and y iteraters (yes, I do know it's spelled iterator) and they are correct too.
So what is wrong in my code that makes it print the last cluster fifty times? I'm using Python 2.7 by the way, if you couldn't tell.
Thanks in advance!
Upvotes: 2
Views: 156
Reputation: 9584
Why do it in such a complicated way? You can accomplish what you want with two lines of code:
>>> with open('serials.txt') as data:
...     clusters = [[int(digit) for digit in cluster] for line in data for cluster in line.strip().split('-')]
Then clusters contains:
[
[4, 7, 5, 8],
[8, 3, 4, 5],
[1, 9, 7, 0],
# ...
]
Upvotes: 0
Reputation: 5149
Dude, no offense but your code is horribly unpythonic - look for a few tutorials on code style and lists. This whole problem (if I understand it correctly) can be solved with a few simple lines of code.
As far as I understand, you want to turn each four-digit value in the file into a list of its digits and store these digits in another list, meaning for the input
"1234-5678-9999"
the result should be
[[1,2,3,4], [5,6,7,8], [9,9,9,9]]
This can be achieved as easily as this:
with open("serials.txt") as f:
    clusters = [c for line in f for c in line.strip().split("-")]
digits = [list(c) for c in clusters]
digits now contains a list of characters for each cluster. If you need the values as integers, you could change list(c) to a nested list comprehension like [int(x) for x in c].
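A minimal sketch of that integer variant, written against an in-memory list of lines so it runs without the file. The name sample_lines is hypothetical; its two entries are the first two lines of serials.txt from the question.

```python
# Hypothetical stand-in for the lines of serials.txt (first two lines only).
sample_lines = ["4758-8345-1970-4486-2348\n", "2346-1233-3463-7856-4572\n"]

# Split every line on "-" to get the four-character clusters.
clusters = [c for line in sample_lines for c in line.strip().split("-")]

# Turn each cluster into a list of integer digits.
digits = [[int(x) for x in c] for c in clusters]

print(digits[0])    # first cluster as integers
print(len(digits))  # number of clusters found
```

With the full file, the same two comprehensions should yield 50 inner lists, one per cluster.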
Upvotes: 0
Reputation: 26160
The way you initialize FinishedList, you end up with a list full of references to the same sublist. When you then assign to [x][y], you overwrite the same sublist, which is referenced over and over, each time. You don't need to pre-size lists in Python, so try starting from an empty list and just using append() in place of your second loop.
FinishedList = []  # start empty instead of pre-filling with shared sublists
for x in range(amount_of_clusters):
    offset = x * Cluster_Number
    FinishedList.append(FileValuelist[offset:offset + Cluster_Number])
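The same slicing idea can be sketched as a small self-contained function. The helper name chunk and the sample data are hypothetical, chosen only for illustration; the sketch should run unchanged on Python 2.7 and Python 3.

```python
def chunk(values, size):
    """Return consecutive slices of `values`, each `size` items long."""
    chunks = []
    for offset in range(0, len(values), size):
        # Each slice is a brand-new list, so no two chunks share storage.
        chunks.append(values[offset:offset + size])
    return chunks

# Twelve characters standing in for FileValuelist.
flat = list("475883451970")
result = chunk(flat, 4)
print(result)
```

If the input length is not a multiple of size, the last chunk is simply shorter, which may or may not be what you want.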
Upvotes: 0
Reputation: 70552
This line isn't doing what you think it's doing:
FinishedList = ([[None]*(Cluster_Number)])*((len(FileValuelist)))
It stores a reference to the same [None, None, None... None] list len(FileValuelist) times (the * operator basically performs a shallow copy, so only the reference is duplicated, not the list it points to). If you want to ensure that it creates new lists, the easiest way is to use a list comprehension.
FinishedList = [[None] * Cluster_Number for _ in xrange(len(FileValuelist))]
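To see the difference side by side, here is a small sketch that fills both kinds of grid with the question's nested loop. The names shared, separate, values and cluster_size are made up for the example, and range is used in place of xrange so that it runs on Python 2.7 and Python 3 alike.

```python
cluster_size = 4
values = list("47588345")              # two clusters' worth of characters
count = len(values) // cluster_size    # number of clusters (2 here)

# Same inner list repeated `count` times.
shared = [[None] * cluster_size] * count
# A fresh inner list built on every pass of the comprehension.
separate = [[None] * cluster_size for _ in range(count)]

# Fill both grids exactly the way the question's loop does.
for grid in (shared, separate):
    i = 0
    for x in range(count):
        for y in range(cluster_size):
            grid[x][y] = values[i]
            i += 1

print(shared)    # every row shows the last cluster
print(separate)  # each row keeps its own cluster
```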
Upvotes: 1
Reputation: 16049
The second multiplication on the line FinishedList = ([[None]*(Cluster_Number)])*((len(FileValuelist))) does not actually create len(FileValuelist) new lists, just that many pointers to the original list. When you change any one of them, they all change. I asked the same question a while back, see the accepted answer there.
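A quick way to convince yourself of this is to check object identity with the is operator. The sketch below uses a made-up three-row list named rows.

```python
# Multiply a one-element outer list: the inner list is not copied.
rows = [[0, 0]] * 3

# All three entries are the very same list object.
same_object = rows[0] is rows[1] and rows[1] is rows[2]
print(same_object)

# Writing through one entry is therefore visible through all of them.
rows[0][0] = 99
print(rows)
```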
Upvotes: 0