user739807

Reputation: 855

Best data structure for storing tabular data?

I have a list of more than 10,000 frames and a list of sources (coordinates), and I want to find which source exists on which frame. Each frame has a filter attribute, and a source is expected to appear on one or more frames of the same filter. If this is the case, I want to record only one occurrence of such an event.

Eventually I want to run a script that generates a web table. Below is an example of the table I want to generate.

Source | filter_1 | filter_2 | filter_3 | filter_4 |
----------------------------------------------------
1      | image 1  | image 2  | image 3  | image 4  |
2      | image 5  | image 6  | image 7  | image 8  |

This is my code:

webtable = []
for frame in frames:
    for x, y in sources:
        if x_y_on_frame():
            webtable.append({
                'source': (x, y),
                'ifilter': frame.filter.name,
                'ifile': frame.filename,
                'pFile': frame.pngfile,
                'fFile': frame.fitsfile,
            })

I need to check whether a combination of a source, i.e. (x, y), and ifilter already exists in webtable before I append the record. What is the best data structure to implement this?

Upvotes: 0

Views: 453

Answers (3)

William McVey

Reputation: 552

Since you have a static set of keys for your data dictionaries, a namedtuple from the collections module would actually be better than an anonymous dictionary. Namedtuples have lower overhead than dictionaries (since the keys don't have to be stored again for every item) while keeping the convenience of named access.

You could define your namedtuple similar to:

from collections import namedtuple
Row = namedtuple('Row', 'iFile pFile fFile')

Then, rather than creating a dictionary of the form:

{ 'iFile': foo, 'pFile': bar, ...}

you would create an instance of the namedtuple class you got back from the factory function:

Row(iFile=foo, pFile=bar, ...)

If you need to access a stored value, you just access it as an attribute:

# Keyword names must match the field names exactly (pFile, not pfile).
foo = Row(iFile="somevalue", pFile="different_value", fFile="yet another value")
if foo.iFile == "whatever":
    ...
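
As a minimal sketch of how this might look end to end, a Row can be built, read by attribute or by position, and converted back to a dict with `_asdict()` when a mapping is needed. The file names below are made-up placeholders, not values from the question.

```python
from collections import namedtuple

# Same field names as the Row defined above.
Row = namedtuple('Row', 'iFile pFile fFile')

# Hypothetical file names, used only for illustration.
row = Row(iFile='frame1.img', pFile='frame1.png', fFile='frame1.fits')

# Fields are readable by name or by position.
assert row.pFile == row[1]

# namedtuples are immutable; _replace returns a modified copy.
updated = row._replace(pFile='frame1_v2.png')

# _asdict() gives a mapping, e.g. for a template that expects dicts.
as_dict = dict(updated._asdict())
```

Because a namedtuple is still a tuple, it can also be unpacked or used as a dictionary key, which a plain dict record cannot.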

Upvotes: 0

Cory Dolphin

Reputation: 2670

I need to check whether a combination of a source, i.e. (x, y), and ifilter already exists in webtable before I append the record. What is the best data structure to implement this?

Assuming that x, y and ifilter can all be represented as strings or integers (or other immutable, hashable types), it would be even easier to store your information in a dictionary keyed by the tuple (x, y, ifilter). This requires a minimal amount of code and is still very efficient, because a dictionary membership test takes constant time on average:

webtable = {}
for frame in frames:
    for x, y in sources:
        if x_y_on_frame():
            keyTuple = (x, y, frame.filter.name)
            if keyTuple not in webtable:
                webtable[keyTuple] = {
                    'ifile': frame.filename,
                    'pFile': frame.pngfile,
                    'fFile': frame.fitsfile,
                }
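
To get from this dictionary to the table layout in the question, one option is to pivot the keys into one row per source and one column per filter. The following is only a sketch under assumptions: the sample `webtable` contents and the `pivot` helper are hypothetical, and it prints a plain-text table rather than real HTML.

```python
# Hypothetical sample data in the shape built above:
# key = (x, y, filter name), value = dict of file names.
webtable = {
    (10, 20, 'filter_1'): {'ifile': 'a.img', 'pFile': 'a.png', 'fFile': 'a.fits'},
    (10, 20, 'filter_2'): {'ifile': 'b.img', 'pFile': 'b.png', 'fFile': 'b.fits'},
    (30, 40, 'filter_1'): {'ifile': 'c.img', 'pFile': 'c.png', 'fFile': 'c.fits'},
}


def pivot(table):
    """Group records by source so each source becomes one table row."""
    rows = {}
    for (x, y, ifilter), record in table.items():
        rows.setdefault((x, y), {})[ifilter] = record['pFile']
    return rows


rows = pivot(webtable)
filters = sorted({ifilter for (_, _, ifilter) in webtable})

# One line per source; a missing (source, filter) pair prints as '-'.
lines = ['Source | ' + ' | '.join(filters)]
for source in sorted(rows):
    cells = [rows[source].get(f, '-') for f in filters]
    lines.append('%s | %s' % (source, ' | '.join(cells)))
print('\n'.join(lines))
```

Keeping the filter name as the last element of the key makes this grouping step a single pass over the dictionary.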

Upvotes: 1

Mariusz Jamro

Reputation: 31692

A Python dict would be just fine. If there is already an entry with the given ifilter, x and y, continue to the next item in sources:

webtable = []
webtable_cache = {}   # nested lookup: ifilter -> y -> x -> True

for frame in frames:
    for x, y in sources:
        if x_y_on_frame():

            ifilter = frame.filter.name

            if ifilter in webtable_cache:
                if y in webtable_cache[ifilter]:
                    if x in webtable_cache[ifilter][y]:
                        continue     # already in webtable
                    else:
                        webtable_cache[ifilter][y][x] = True
                else:
                    webtable_cache[ifilter][y] = {x: True}
            else:
                webtable_cache[ifilter] = {y: {x: True}}

            webtable.append({
                'source': (x, y),
                'ifilter': ifilter,
                'ifile': frame.filename,
                'pFile': frame.pngfile,
                'fFile': frame.fitsfile,
            })
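
A flatter alternative to the nested cache, shown here only as a sketch, is a `set` of `(ifilter, x, y)` tuples. This is a different technique from the nested dicts above, not a restatement of them. The `Frame` tuple, its `covers` field, and the sample data are hypothetical stand-ins for the real frame objects and for `x_y_on_frame()`.

```python
from collections import namedtuple

# Hypothetical stand-in for a real frame object; 'covers' plays the
# role of x_y_on_frame() by listing the sources visible on the frame.
Frame = namedtuple('Frame', 'filter_name filename covers')

frames = [
    Frame('filter_1', 'image1', {(1, 2), (3, 4)}),
    Frame('filter_1', 'image2', {(1, 2)}),   # same source, same filter
    Frame('filter_2', 'image3', {(1, 2)}),
]
sources = [(1, 2), (3, 4)]

webtable = []
seen = set()   # (ifilter, x, y) combinations already recorded

for frame in frames:
    for x, y in sources:
        if (x, y) not in frame.covers:
            continue
        key = (frame.filter_name, x, y)
        if key in seen:
            continue   # already in webtable
        seen.add(key)
        webtable.append({
            'source': (x, y),
            'ifilter': frame.filter_name,
            'ifile': frame.filename,
        })
```

With this sample data the second frame adds nothing, because source (1, 2) was already recorded for filter_1, so `webtable` ends up with three records.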

Upvotes: 0
