yannickhau

Reputation: 405

Shuffle List of lists containing string

For my bachelor's thesis I need to shuffle sentences in a text corpus.

Data looks like this:

[
['1', '$', '0-', '$', '10', 'Culture', ':', 'Play', 'Your', 'Way', 'to', 'China', '.'], 
['2', '02.59', 'The', 'press', 'are', 'being', 'kept', 'well', 'away', 'as', 'the', 'couple', 'meet', '21', 'local', 'dignitaries', ',', 'reports', 'the', 'BBC', "'s", 'Peter', 'Hunt', ':', 'An', 'official', 'insisted', 'all', 'journalists', 'stand', 'inside', 'a', 'pen', 'at', 'a', 'deserted', 'airport', 'runway', '.'], 
['3', '€0.25', ')', 'plus', 'PLN', '1', 'booking', 'fee', 'and', 'comfortable', 'vehicles', '.'], 
['4', '0', "'", '6', "''", 'x', '7', "'", '6', "''", '(', '0.17m', 'x', '2.31m', ')', 'Double', 'glazed', 'window', 'to', 'the', 'side', ',', 'heated', 'towel', 'rail', ',', 'ceramic', 'floor', 'tiles', ',', 'fully', 'tiled', 'walls', ',', 'spotlights', ',', 'low', 'level', 'W/C', ',', 'shower', 'unit', 'with', 'glass', 'surround', ',', 'sink', 'with', 'mixer', 'tap', ',', 'extractor', 'fan', '.'], 
['5', '07:00', 'am', '-', 'Mon', ',', 'September', '19', '2011', 'I', 'already', 'have', 'the', 'Keystone', 'pipeline', 'running', 'through', 'my', 'properiety', 'this', 'is', 'Keystone', 'XL', 'or', 'extra', 'large', '.']
]

I have tried shuffle from the random module and also numpy.random.shuffle, but all my minimal examples only work with lists of ints, not with lists of strings.

Here is my latest attempt:

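import numpy as np
from nltk.tokenize import sent_tokenize, word_tokenize

# Read the corpus and split it into sentences, each a list of word tokens
with open('eng_news_2016_300K-sentences.txt') as f:
    raw = f.read()
eng3Cor = [word_tokenize(sent) for sent in sent_tokenize(raw)]
eng3Cor = eng3Cor[:5]  # keep only the first five sentences
del raw
# Convert to an object array of object arrays
y = np.array([np.array(xi, dtype=object) for xi in eng3Cor], dtype=object)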

Any advice on how to do this?

EDIT: eng3Cor is the list of lists.

Upvotes: 0

Views: 37

Answers (1)

Barmar

Reputation: 782682

random.shuffle() works with lists of any data type, not just lists of ints.

This will shuffle the words in each sentence:

import random

for sent in eng3Cor:
    random.shuffle(sent)
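
As a quick check that this works on string tokens, here is a minimal sketch on a small made-up sample (the `sample` data and the seed value are illustrative assumptions, not taken from the question). Note that shuffle works in place and returns None, so assigning its result back to the list variable would discard the list.

```python
import random

# Hypothetical sample in the same shape as eng3Cor: a list of token lists.
sample = [
    ['Play', 'Your', 'Way', 'to', 'China', '.'],
    ['The', 'press', 'are', 'being', 'kept', 'well', 'away', '.'],
]
# Remember the tokens of each sentence (sorted) to compare afterwards.
before = [sorted(sent) for sent in sample]

rng = random.Random(42)  # seeded generator, used only so runs are repeatable

for sent in sample:
    result = rng.shuffle(sent)  # reorders sent in place; result is None

# Each sentence still holds exactly the same tokens, only the order may differ.
after = [sorted(sent) for sent in sample]
```

Using a separate random.Random instance keeps the seeding local; calling random.shuffle directly, as above, behaves the same way without a fixed seed.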

This will just shuffle the order of the sentences:

import random

random.shuffle(eng3Cor)
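
If the original sentence order is still needed afterwards, a shuffled copy can be built instead of shuffling in place. A minimal sketch, assuming a small hypothetical `corpus` list standing in for eng3Cor: random.sample with k equal to the list length returns a new list in random order and leaves the input list untouched.

```python
import random

# Hypothetical stand-in for eng3Cor: a list of token lists.
corpus = [['1', 'a', '.'], ['2', 'b', '.'], ['3', 'c', '.']]
# Independent copy, kept only to show that corpus is not modified.
original = [list(sent) for sent in corpus]

# New outer list with the sentences in random order.
# The inner token lists are shared with corpus (a shallow copy).
shuffled = random.sample(corpus, k=len(corpus))
```

Because the copy is shallow, shuffling the words inside a sentence of `shuffled` would also change that sentence in `corpus`.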

Upvotes: 3
