gran_profaci

Reputation: 8483

Getting the first elements of a dictionary without using a loop

I have this piece of code and was wondering whether there is a built-in way to do it faster.

words is a simple tokenized string input.

freq_unigrams = nltk.FreqDist(words)
unigram_list = []

count = 0
for x in freq_unigrams.keys():
    unigram_list.append(x)
    count+=1
    if count >= 1000:
        break

Upvotes: 0

Views: 124

Answers (5)

antak

Reputation: 20869

This is theoretically more efficient:

import itertools
unigram_list = list(itertools.islice(freq_unigrams.iterkeys(), 1000))

...than working from freq_unigrams.keys(), because you are only interested in the top 1000 keys. Calling freq_unigrams.keys() builds an intermediate list of every key, including all the remaining ones you never use.
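
For readers on Python 3, where iterkeys() no longer exists, the same idea can be sketched by iterating the dictionary directly. This is only an illustration: the sample words are made up, and collections.Counter is used as a stand-in for nltk.FreqDist so the snippet runs without NLTK.

```python
from collections import Counter
from itertools import islice

# Hypothetical sample input standing in for the asker's tokenized `words`.
words = "the cat sat on the mat and the dog sat too".split()

# Counter is an assumed stand-in for nltk.FreqDist.
freq_unigrams = Counter(words)

# Iterating a dict yields its keys lazily; islice stops after n of them,
# so no intermediate list of all keys is built.
n = 3
unigram_list = list(islice(freq_unigrams, n))
print(unigram_list)
```

Note that on Python 3.7+ the keys come out in insertion order, not frequency order, so this gives the first n distinct words seen rather than the n most frequent.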

Upvotes: 1

jfs

Reputation: 414915

If your intent is to get the top 1000 most frequent words in the words list you could try:

import collections

# get top words and their frequencies
most_common = collections.Counter(words).most_common(1000)
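
As a small worked example of what most_common returns, here is a sketch with a made-up word list (the sample sentence and the name top_words are illustrative, not from the question). The result is a list of (word, count) pairs sorted by descending count, so the words alone can be pulled out with a comprehension.

```python
from collections import Counter

# Hypothetical sample input standing in for the asker's tokenized `words`.
words = "the cat sat on the mat and the dog sat too".split()

# most_common(k) returns the k highest-count (word, count) pairs.
most_common = Counter(words).most_common(2)

# Keep only the words, dropping the counts.
top_words = [word for word, _count in most_common]
print(most_common)
print(top_words)
```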

Upvotes: 1

Dan D.

Reputation: 74685

I suggest:

unigram_list = freq_unigrams.keys()
unigram_list[:] = unigram_list[:1000]

This truncates the existing list in place instead of binding the name to a new list, as unigram_list = freq_unigrams.keys()[:1000] does.

Although this might be better with iterators:

from itertools import islice
unigram_list[:] = islice(freq_unigrams.iterkeys(),1000)
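
One caveat on the slice assignment: the right-hand side unigram_list[:1000] still builds a temporary list before it is assigned back. If the goal is only to shorten an existing list in place, a possible alternative is to delete the tail. The sketch below uses a small made-up list and a cutoff of 3 in place of 1000.

```python
# Hypothetical stand-in for the list of keys.
unigram_list = list(range(10))
alias = unigram_list  # second reference, to show the same object is kept

# Remove everything from index 3 onward without building a new list.
del unigram_list[3:]
print(unigram_list)
print(alias is unigram_list)
```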

Upvotes: 1

erewok

Reputation: 7845

A little late...

To take the first 1000 keys in your dictionary and assign them to a new list:

unigram_list = freq_unigrams.keys()[:1000]

Upvotes: 0

Benjamin Hodgson

Reputation: 44674

Does freq_unigrams.keys() return a list? If so, how about the following:

unigram_list = freq_unigrams.keys()[:1000]

This gives you a list containing the first 1000 elements of freq_unigrams.keys(), with no looping.
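
On the question of whether keys() returns a list: in Python 2 it does, but in Python 3 it returns a view object that cannot be sliced. A minimal sketch of the Python 3 equivalent follows, with made-up sample words and collections.Counter assumed as a stand-in for nltk.FreqDist.

```python
from collections import Counter

# Hypothetical sample input standing in for the asker's tokenized `words`.
words = "the cat sat on the mat and the dog sat too".split()

# Counter is an assumed stand-in for nltk.FreqDist.
freq_unigrams = Counter(words)

# In Python 3, keys() is a view without slicing support,
# so it is converted to a list before taking the first n items.
n = 3
unigram_list = list(freq_unigrams.keys())[:n]
print(unigram_list)
```

This builds the full list of keys before slicing, which is fine for modest vocabularies but does more work than necessary for very large ones.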

Upvotes: 4
