Reputation: 21
List element format: (x0, y0, x1, y1, "word", block_no, line_no, word_no)
given = [
(518.1566162109375, 381.6667175292969, 537.3801879882812, 391.70867919921875, 'cost', 19, 0, 11),
(542.1559448242188, 381.6667175292969, 556.5796508789062, 391.70867919921875, 'and', 19, 0, 12),
(81.36001586914062, 390.6634826660156, 124.58306121826172, 400.7054443359375, 'inventory', 19, 1, 0),
(129.35882568359375, 390.6634826660156, 167.78199768066406, 400.7054443359375, 'control,', 19, 1, 1)
]
I need to group the tuples that share the same "y1" value (index 3) into sublists, in the form given below:
required = [
[
(518.1566162109375, 381.6667175292969, 537.3801879882812, 391.70867919921875, 'cost', 19, 0, 11),
(542.1559448242188, 381.6667175292969, 556.5796508789062, 391.70867919921875, 'and', 19, 0, 12)
],
[
(81.36001586914062, 390.6634826660156, 124.58306121826172, 400.7054443359375, 'inventory', 19, 1, 0),
(129.35882568359375, 390.6634826660156, 167.78199768066406, 400.7054443359375, 'control,', 19, 1, 1)
]
]
What is the best way to achieve this?
Upvotes: 0
Views: 58
Reputation: 13339
Using itertools.groupby:
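import itertools
# Key function: y1 is the element at index 3 of each tuple
byloc = lambda x: x[3]
# groupby merges only consecutive items with equal keys
new_list = [list(v) for k, v in itertools.groupby(given, key=byloc)]
new_list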
[[(518.1566162109375,
381.6667175292969,
537.3801879882812,
391.70867919921875,
'cost',
19,
0,
11),
(542.1559448242188,
381.6667175292969,
556.5796508789062,
391.70867919921875,
'and',
19,
0,
12)],
[(81.36001586914062,
390.6634826660156,
124.58306121826172,
400.7054443359375,
'inventory',
19,
1,
0),
(129.35882568359375,
390.6634826660156,
167.78199768066406,
400.7054443359375,
'control,',
19,
1,
1)]]
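This works here because the words in given already arrive ordered by y1. itertools.groupby only merges adjacent items with equal keys, so if the input could arrive in any other order, it has to be sorted by the same key first. A minimal sketch of that, where group_by_y1 and the shuffled sample data are made-up names and values for illustration:

```python
from itertools import groupby
from operator import itemgetter


def group_by_y1(words):
    """Group word tuples by y1 (index 3), regardless of input order."""
    y1 = itemgetter(3)
    # sorted() is stable, so words sharing a y1 keep their relative order
    ordered = sorted(words, key=y1)
    return [list(g) for _, g in groupby(ordered, key=y1)]


# Hypothetical input, deliberately out of order (simplified coordinates)
shuffled = [
    (81.0, 390.0, 124.0, 400.5, 'inventory', 19, 1, 0),
    (518.0, 381.0, 537.0, 391.5, 'cost', 19, 0, 11),
    (129.0, 390.0, 167.0, 400.5, 'control,', 19, 1, 1),
    (542.0, 381.0, 556.0, 391.5, 'and', 19, 0, 12),
]

groups = group_by_y1(shuffled)
group_words = [[w[4] for w in g] for g in groups]
```

Without the sort, the shuffled list above would produce four single-item groups, because no two neighbouring tuples share a y1.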
Upvotes: 0
Reputation: 21275
With itertools.groupby and operator.itemgetter:
from itertools import groupby
from operator import itemgetter
given = [
(518.1566162109375, 381.6667175292969, 537.3801879882812, 391.70867919921875, 'cost', 19, 0, 11),
(542.1559448242188, 381.6667175292969, 556.5796508789062, 391.70867919921875, 'and', 19, 0, 12),
(81.36001586914062, 390.6634826660156, 124.58306121826172, 400.7054443359375, 'inventory', 19, 1, 0),
(129.35882568359375, 390.6634826660156, 167.78199768066406, 400.7054443359375, 'control,', 19, 1, 1)
]
grouped_by_y1 = [list(g) for _, g in groupby(given, key=itemgetter(3))]
print(grouped_by_y1)
Output:
[
[(518.1566162109375, 381.6667175292969, 537.3801879882812, 391.70867919921875, 'cost', 19, 0, 11), (542.1559448242188, 381.6667175292969, 556.5796508789062, 391.70867919921875, 'and', 19, 0, 12)],
[(81.36001586914062, 390.6634826660156, 124.58306121826172, 400.7054443359375, 'inventory', 19, 1, 0), (129.35882568359375, 390.6634826660156, 167.78199768066406, 400.7054443359375, 'control,', 19, 1, 1)]
]
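One caveat: grouping on exact float equality only works when the y1 values on a line are bit-for-bit identical, as they are in this sample. If the coordinates come from a source where words on the same visual line differ slightly in y1, a tolerance-based grouping may be more robust. The following is only a sketch under that assumption; group_by_y1_with_tolerance, the tol value of 0.5, and the sample rows are hypothetical and would need tuning for real data:

```python
def group_by_y1_with_tolerance(words, tol=0.5):
    """Group word tuples whose y1 (index 3) lies within tol of the
    previous word in the same group. tol is an assumed threshold."""
    groups = []
    for word in sorted(words, key=lambda w: w[3]):
        # Compare against the last word added to the current group
        if groups and abs(word[3] - groups[-1][-1][3]) <= tol:
            groups[-1].append(word)
        else:
            groups.append([word])
    # Sorting by y1 can disturb reading order, so restore it by x0
    for group in groups:
        group.sort(key=lambda w: w[0])
    return groups


# Hypothetical rows with slightly different y1 values on each line
rows = [
    (10.0, 381.6, 20.0, 391.70, 'cost', 19, 0, 11),
    (25.0, 381.6, 35.0, 391.72, 'and', 19, 0, 12),
    (10.0, 390.6, 20.0, 400.70, 'inventory', 19, 1, 0),
    (25.0, 390.6, 35.0, 400.69, 'control,', 19, 1, 1),
]

lines = group_by_y1_with_tolerance(rows)
line_words = [[w[4] for w in line] for line in lines]
```

Because each word is compared with the previous one in its group, a long run of slowly drifting values could chain into one group. Comparing against the first word of the group instead is an alternative if that matters.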
Upvotes: 1