Reputation: 371
I am struggling to create a text file from another text file.
My text file is:
0.0 99.13 0.11
0.5 19.67 0.59
0.5 22.23 1.22
1.0 9.67 0.08
and I would like to create a text file such as:
0.0 99.13 0.11
0.5 19.67 0.59
1.0 9.67 0.08
or
0.0 99.13 0.11
0.5 22.23 1.22
1.0 9.67 0.08
In general, whenever a value is duplicated in the first column, I would like to write a file that keeps just one of the duplicate lines, with the values from whichever line was chosen.
My code so far is:
def createFile(file):
    with open(file, 'r') as fh:
        data = fh.read()
    for row in data.splitlines():
        column = row.split()
        print column
>>>
['0.0', '99.13', '0.11']
['0.5', '19.67', '0.59']
['0.5', '22.23', '1.22']
['1.0', '9.67', '0.08']
which would let me play with the indexes. Maybe I could check whether column[0] is repeated and then print the line? Or would creating a dictionary be easier?
Cheers, Kate
Upvotes: 3
Views: 132
Reputation: 6828
If the duplicates are grouped in order, use itertools.groupby:
from itertools import groupby

data = """0.0 99.13 0.11
0.5 19.67 0.59
0.5 22.23 1.22
1.0 9.67 0.08""".split('\n')

# Group consecutive lines that share the same first column.
result = [list(j) for i, j in groupby(data, lambda x: x.split(' ', 1)[0])]

# The largest group decides how many files are written.
files_num = 0
for e in result:
    files_num = max(files_num, len(e))

for i in range(files_num):
    with open('{}.txt'.format(i), 'w+') as f:
        for line in result:
            # Fall back to the group's last line when it has no i-th entry.
            min_index = min(i, len(line)-1)
            f.write('{}\n'.format(line[min_index]))
0.txt:
0.0 99.13 0.11
0.5 19.67 0.59
1.0 9.67 0.08
1.txt:
0.0 99.13 0.11
0.5 22.23 1.22
1.0 9.67 0.08
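To make the grouping step easier to inspect, here is a minimal sketch that builds the same two variants in memory instead of writing files. The helper names group_by_first_column and build_variants are made up for illustration, and the sketch assumes the input has no blank lines and at least one row.

```python
from itertools import groupby


def group_by_first_column(lines):
    """Group consecutive lines that share the same first column."""
    return [list(group)
            for _, group in groupby(lines, key=lambda line: line.split()[0])]


def build_variants(groups):
    """Return one list of lines per output file.

    Variant i takes the i-th line of every group and falls back to the
    group's last line when the group is shorter than i + 1 lines.
    """
    count = max(len(group) for group in groups)
    return [[group[min(i, len(group) - 1)] for group in groups]
            for i in range(count)]


lines = ["0.0 99.13 0.11", "0.5 19.67 0.59", "0.5 22.23 1.22", "1.0 9.67 0.08"]
groups = group_by_first_column(lines)
variants = build_variants(groups)
for variant in variants:
    print(variant)
```

Keeping the variants as lists first means the result can be checked before anything touches the disk; each inner list can then be written out with the same loop as above.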
Otherwise, if they are not grouped in order, you can use a collections.OrderedDict this way (like 1_CR suggested, but with some changes):
from collections import OrderedDict

data = """0.0 99.13 0.11
0.5 19.67 0.59
1.0 9.67 0.08
0.5 22.23 1.22""".split('\n')

d = OrderedDict()
for line in data:
    # Split off the first column; the rest of the line is kept as one string.
    split = line.split(' ', 1)
    d.setdefault(split[0], []).extend(split[1:])
print(d)
Output:
OrderedDict([ ('0.0', ['99.13 0.11']),
('0.5', ['19.67 0.59', '22.23 1.22']),
('1.0', ['9.67 0.08']) ])
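The dictionary only holds the grouped values, so the lines still have to be put back together. A possible way to do that, sketched under the assumption that every line has at least two space-separated columns (the names variants and rows are just for illustration):

```python
from collections import OrderedDict

data = ["0.0 99.13 0.11", "0.5 19.67 0.59", "1.0 9.67 0.08", "0.5 22.23 1.22"]

# Same grouping as above: first column -> list of "rest of line" strings.
d = OrderedDict()
for line in data:
    key, rest = line.split(' ', 1)
    d.setdefault(key, []).append(rest)

# The key with the most duplicates decides how many variants exist.
files_num = max(len(values) for values in d.values())

variants = []
for i in range(files_num):
    rows = []
    for key, values in d.items():
        # Use the i-th duplicate, or the last one if this key has fewer.
        rest = values[min(i, len(values) - 1)]
        rows.append('{} {}'.format(key, rest))
    variants.append(rows)

for rows in variants:
    print('\n'.join(rows))
    print('')
```

Keys come out in the order they were first seen, so the two 0.5 rows end up in the same position even though they were not adjacent in the input. Each entry of variants can then be written to its own file as in the groupby version.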
Upvotes: 2
Reputation: 23364
Another option, which keeps only the last line seen for each value in the first column:
from StringIO import StringIO
from collections import OrderedDict

s = '''\
0.0 99.13 0.11
0.5 19.67 0.59
0.5 22.23 1.22
1.0 9.67 0.08
'''
f = StringIO(s)
d = OrderedDict()
for line in f:
    fields = line.split()
    # A later line with the same first column overwrites the earlier one.
    d[fields[0]] = fields[1:]
for key in d:
    print key, ' '.join(d[key])

Output:
0.0 99.13 0.11
0.5 22.23 1.22
1.0 9.67 0.08
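Because each assignment overwrites the previous one, the loop above ends up with the last duplicate. If the first duplicate is wanted instead, one way is to remember which keys have already been seen, which is close to the "check if column[0] is repeated" idea from the question. This is only a sketch; keep_first is a made-up name, and blank lines are simply skipped.

```python
def keep_first(lines):
    """Keep only the first line for each distinct first-column value."""
    seen = set()
    kept = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] not in seen:
            seen.add(fields[0])
            kept.append(' '.join(fields))
    return kept


lines = ["0.0 99.13 0.11", "0.5 19.67 0.59", "0.5 22.23 1.22", "1.0 9.67 0.08"]
result = keep_first(lines)
for row in result:
    print(row)
```

Unlike groupby, this does not need the duplicates to be next to each other, and the kept rows stay in their original order.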
Upvotes: 0