zmjackson

Reputation: 67

Reading lines of file input into multiple lists in Python

I'm trying to read lines from the following file into 3 lists where the second column (read to 2 decimal places) contains the x-values and the third column contains the y-values. I want a new list when the line number (first column) starts over at one.

1 0.05 0
2 0.1 0
3 0.15 0
4 0.2 0
5 0.25 0
6 0.3 0
7 0.35 0
8 0.4 1
9 0.45 1
10 0.5 2
11 0.55 3
12 0.6 4
13 0.65 5
14 0.7 6
15 0.75 7
16 0.8 8
17 0.85 8
18 0.9 9
19 0.95 10
20 1 11
1 0.047619 0
2 0.0952381 1
3 0.142857 1
4 0.190476 1
5 0.238095 1
6 0.285714 1
7 0.333333 1
8 0.380952 1
9 0.428571 1
10 0.47619 1
11 0.52381 2
12 0.571429 3
13 0.619048 4
14 0.666667 5
15 0.714286 5
16 0.761905 5
17 0.809524 6
18 0.857143 7
19 0.904762 8
20 0.952381 8
21 1 9
1 0.0526316 0
2 0.105263 0
3 0.157895 0
4 0.210526 0
5 0.263158 1
6 0.315789 1
7 0.368421 1
8 0.421053 1
9 0.473684 2
10 0.526316 3
11 0.578947 3
12 0.631579 3
13 0.684211 4
14 0.736842 5
15 0.789474 5
16 0.842105 6
17 0.894737 7
18 0.947368 8
19 1 9

I have the following code. When I plot the data, only x1 and y1 appear. At first I thought it was an issue with my plotting code, but when I plotted the lists separately I found that x2, y2, x3 and y3 are empty, which leads me to believe the values are not being read properly.

import numpy as np
import matplotlib.pyplot as plt

with open('kmt_oa.txt') as f:
    lines1 = f.readlines()[0:20]
    x1 = [line.split()[1][:4] for line in lines1]
    y1 = [line.split()[2] for line in lines1]
    lines2 = f.readlines()[21:40]
    x2 = [line.split()[1][:4] for line in lines2]
    y2 = [line.split()[2] for line in lines2]
    lines3 = f.readlines()[41:60]
    x3 = [line.split()[1][:4] for line in lines3]
    y3 = [line.split()[2] for line in lines3]
    
ax1 = plt.subplot()
ax1.scatter(x1, y1, color = 'r', label = 'Table size of 20')
ax1.scatter(x2, y2, color = 'g', label = 'Table size of 21')
ax1.scatter(x3, y3, color = 'b', label = 'Table size of 19')

ax1.set_xlabel('Load Factor')
ax1.set_ylabel('Number of Collisions')

ax1.set_title("Key Mod Tablesize & Open Addressing")
ax1.set_ylabel('Number of Collisions')

plt.legend(loc = 'upper left')

plt.show()

Upvotes: 1

Views: 756

Answers (3)

Vasilis G.

Reputation: 7859

You can also do it using pure Python without any module:

outList = []
with open('data.txt', 'r') as inFile:
    # Create small lists of each individual record, skipping blank lines
    # (a trailing newline would otherwise produce an empty record).
    content = [line.split() for line in inFile if line.strip()]
# Find the indices of the records whose first number is 1.
indices = [i for i, rec in enumerate(content) if rec[0] == '1'] + [len(content)]
# Remove the first element of each record.
content = [rec[1:] for rec in content]
# Create one list per block, based on the indices above.
outList = [content[indices[i]:indices[i + 1]] for i in range(len(indices) - 1)]

for elem in outList:
    for item in elem:
        print(item)
    print()

Result:

['0.05', '0']
['0.1', '0']
['0.15', '0']
['0.2', '0']
['0.25', '0']
['0.3', '0']
['0.35', '0']
['0.4', '1']
['0.45', '1']
['0.5', '2']
['0.55', '3']
['0.6', '4']
['0.65', '5']
['0.7', '6']
['0.75', '7']
['0.8', '8']
['0.85', '8']
['0.9', '9']
['0.95', '10']
['1', '11']

['0.047619', '0']
['0.0952381', '1']
['0.142857', '1']
['0.190476', '1']
['0.238095', '1']
['0.285714', '1']
['0.333333', '1']
['0.380952', '1']
['0.428571', '1']
['0.47619', '1']
['0.52381', '2']
['0.571429', '3']
['0.619048', '4']
['0.666667', '5']
['0.714286', '5']
['0.761905', '5']
['0.809524', '6']
['0.857143', '7']
['0.904762', '8']
['0.952381', '8']
['1', '9']

['0.0526316', '0']
['0.105263', '0']
['0.157895', '0']
['0.210526', '0']
['0.263158', '1']
['0.315789', '1']
['0.368421', '1']
['0.421053', '1']
['0.473684', '2']
['0.526316', '3']
['0.578947', '3']
['0.631579', '3']
['0.684211', '4']
['0.736842', '5']
['0.789474', '5']
['0.842105', '6']
['0.894737', '7']
['0.947368', '8']
['1', '9']
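
As for why x2, y2, x3 and y3 come out empty in the question's code: a file object is exhausted after the first f.readlines() call, so every later call returns an empty list. Here is a minimal sketch of that behaviour, using io.StringIO as a stand-in for the real file (the three sample lines are made up for illustration):

```python
import io

# Stand-in for the opened file; the three short lines are illustrative only.
f = io.StringIO("1 0.5 0\n2 1 1\n1 0.5 0\n")

first = f.readlines()   # reads every line and leaves the position at end of file
second = f.readlines()  # nothing is left to read, so this is an empty list

print(len(first))  # 3
print(second)      # []

# Reading once and slicing the stored list avoids the problem.
f.seek(0)
all_lines = f.readlines()
block1, block2 = all_lines[0:2], all_lines[2:3]
print(block2)      # ['1 0.5 0\n']
```

Hard-coded slice bounds still have to match the block sizes exactly, which is why detecting where the line number restarts at 1 is probably the more robust route.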

Upvotes: 1

blhsing

Reputation: 107085

You can use csv.reader to read the rows as a sequence and the enumerate function to generate row indices. Then use itertools.groupby with a key function that returns the difference between the row index and the line number (the first field). That difference stays constant within a run of consecutive line numbers and changes when the numbering starts over, so the rows are grouped by block (this assumes your file object f is already open):

from itertools import groupby
import csv
[[[round(float(r[1]), 2), int(r[2])] for _, r in g] for _, g in groupby(enumerate(csv.reader(f, delimiter=' ')), key=lambda t: t[0] - int(t[1][0]))]

This returns:

[[[0.05, 0],
  [0.1, 0],
  [0.15, 0],
  [0.2, 0],
  [0.25, 0],
  [0.3, 0],
  [0.35, 0],
  [0.4, 1],
  [0.45, 1],
  [0.5, 2],
  [0.55, 3],
  [0.6, 4],
  [0.65, 5],
  [0.7, 6],
  [0.75, 7],
  [0.8, 8],
  [0.85, 8],
  [0.9, 9],
  [0.95, 10],
  [1.0, 11]],
 [[0.05, 0],
  [0.1, 1],
  [0.14, 1],
  [0.19, 1],
  [0.24, 1],
  [0.29, 1],
  [0.33, 1],
  [0.38, 1],
  [0.43, 1],
  [0.48, 1],
  [0.52, 2],
  [0.57, 3],
  [0.62, 4],
  [0.67, 5],
  [0.71, 5],
  [0.76, 5],
  [0.81, 6],
  [0.86, 7],
  [0.9, 8],
  [0.95, 8],
  [1.0, 9]],
 [[0.05, 0],
  [0.11, 0],
  [0.16, 0],
  [0.21, 0],
  [0.26, 1],
  [0.32, 1],
  [0.37, 1],
  [0.42, 1],
  [0.47, 2],
  [0.53, 3],
  [0.58, 3],
  [0.63, 3],
  [0.68, 4],
  [0.74, 5],
  [0.79, 5],
  [0.84, 6],
  [0.89, 7],
  [0.95, 8],
  [1.0, 9]]]
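
To try the grouping without the original file, here is a self-contained sketch that feeds a few made-up rows through io.StringIO (a stand-in for f) and then splits each block into separate x and y lists. The names rows, groups, xs and ys are only illustrative:

```python
import csv
import io
from itertools import groupby

# Stand-in for the opened file f; the sample rows are illustrative only.
f = io.StringIO("1 0.5 0\n2 1 1\n1 0.333333 0\n2 0.666667 1\n3 1 2\n")

rows = enumerate(csv.reader(f, delimiter=' '))
# Row index minus line number is constant within a block and changes
# whenever the line number starts over at 1.
groups = [
    [(round(float(r[1]), 2), int(r[2])) for _, r in g]
    for _, g in groupby(rows, key=lambda t: t[0] - int(t[1][0]))
]

# Split each block into separate x and y lists, e.g. for plotting.
xs = [[x for x, _ in block] for block in groups]
ys = [[y for _, y in block] for block in groups]

print(xs)  # [[0.5, 1.0], [0.33, 0.67, 1.0]]
print(ys)  # [[0, 1], [0, 1, 2]]
```

Each xs[i], ys[i] pair could then be passed to a separate scatter call.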

Upvotes: 1

Pablo Martinez

Reputation: 459

Use the following code, and you will have a list of lists in the lists object, with one [x-values, y-values] pair per block.

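with open(path, 'r') as f:
  lists = []
  current = [[], []]
  card = 0
  for line in f:
    idx, x, y = line.split()
    idx = int(idx)  # compare line numbers as integers, not strings
    if idx < card:
      # The line number started over: store the finished block and begin a new one.
      lists.append(current)
      current = [[], []]

    card = idx
    current[0].append(int(float(x) * 100) / 100)  # truncate x to 2 decimal places
    current[1].append(int(y))
  # Store the last block, which no restart follows.
  lists.append(current)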

Upvotes: 1
