Zy Taga

Reputation: 89

Data Structure Option

I'm wondering which data structure is appropriate for storing information about chemical elements that I have in a text file. My program should read and process input from the user. If the user enters an integer, the program should display the symbol and name of the element with that number of protons. If the user enters a string, the program should display the number of protons for the element with that name or symbol.

The text file is formatted as below

# element.txt

1,H,Hydrogen
2,He,Helium
3,Li,Lithium
4,Be,Beryllium
...

I thought of using a dictionary, but figured that mapping a string to a list could be tricky, since my program has to respond differently depending on whether the user provides an integer or a string.

Upvotes: 0

Views: 69

Answers (2)

Stef

Reputation: 15525

You shouldn't be worried about the "performance" of looking for an element:

  • There are only 118 known elements, which is a tiny number for a computer;
  • Since the program interacts with a human user, the human will be orders of magnitude slower than the computer anyway.
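To illustrate the point, here is a minimal sketch of the most naive approach: a linear scan over a list of tuples. The names `ELEMENTS` and `linear_find` are made up for this example, and the rows are the four shown in the question. With so few rows, even this is instantaneous next to a human typing.

```python
# Sample rows copied from the excerpt of element.txt in the question.
ELEMENTS = [
    (1, "H", "Hydrogen"),
    (2, "He", "Helium"),
    (3, "Li", "Lithium"),
    (4, "Be", "Beryllium"),
]

def linear_find(query):
    """Return the first (number, symbol, name) row matching query, or None."""
    query = query.strip()
    for row in ELEMENTS:
        number, symbol, name = row
        # Compare the query against every field of the row.
        if query in (str(number), symbol, name):
            return row
    return None

print(linear_find("3"))       # (3, 'Li', 'Lithium')
print(linear_find("Helium"))  # (2, 'He', 'Helium')
```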

Option 1: pandas.DataFrame

Hence I suggest a simple pandas DataFrame:

import pandas as pd

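# The file has no header row, so name the columns explicitly;
# otherwise the first element (Hydrogen) would be consumed as the header.
df = pd.read_csv('element.txt', header=None, names=['Number', 'Symbol', 'Name'])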

def get_column_and_key(s):
  s = s.strip()
  try:
    k = int(s)
    return 'Number', k
  except ValueError:
    if len(s) <= 2:
      return 'Symbol', s
    else:
      return 'Name', s

def find_element(s):
  column, key = get_column_and_key(s)
  return df[df[column] == key]

def play():
  keep_going = True
  while keep_going:
    s = input('>>>> ')
    if s.startswith('q'):  # unlike s[0], this does not crash on empty input
      keep_going = False
    else:
      print(find_element(s))

if __name__ == '__main__':
  play()


Option 2: three redundant dicts

One of Python's most used data structures is the dict. Here we have three different possible keys (number, symbol and name), so we'll use three dicts, each mapping one kind of key to the full record.

import csv

with open('element.txt', 'r') as f:
  data = csv.reader(f)
  elements_by_num = {}
  elements_by_symbol = {}
  elements_by_name = {}
  for row in data:
    num, symbol, name = int(row[0]), row[1], row[2]
    elements_by_num[num] = num, symbol, name
    elements_by_symbol[symbol] = num, symbol, name
    elements_by_name[name] = num, symbol, name

def get_dict_and_key(s):
  s = s.strip()
  try:
    k = int(s)
    return elements_by_num, k
  except ValueError:
    if len(s) <= 2:
      return elements_by_symbol, s
    else:
      return elements_by_name, s

def find_element(s):
  d, key = get_dict_and_key(s)
  return d.get(key)  # None when the input matches no element, instead of a KeyError

def play():
  keep_going = True
  while keep_going:
    s = input('>>>> ')
    if s.startswith('q'):  # unlike s[0], this does not crash on empty input
      keep_going = False
    else:
      print(find_element(s))

if __name__ == '__main__':
  play()
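A possible variation, as a sketch rather than a definitive implementation: the question asks for symbol and name when given a number, and for the number of protons when given a name or symbol. The code below does that and also makes text lookups case-insensitive. It departs from the three dicts above by folding symbols and names into a single text dict. The names `SAMPLE`, `load_elements` and `describe` and the output wording are assumptions for this example, and an in-memory string stands in for element.txt.

```python
import csv
import io

# Stand-in for element.txt, using the four rows shown in the question.
SAMPLE = "1,H,Hydrogen\n2,He,Helium\n3,Li,Lithium\n4,Be,Beryllium\n"

def load_elements(f):
    """Build one dict keyed by number and one keyed by lowercased symbol or name."""
    by_num, by_text = {}, {}
    for num, symbol, name in csv.reader(f):
        record = (int(num), symbol, name)
        by_num[record[0]] = record
        by_text[symbol.lower()] = record
        by_text[name.lower()] = record
    return by_num, by_text

def describe(s, by_num, by_text):
    """Answer one query in the form the question asks for."""
    s = s.strip()
    try:
        number = int(s)
    except ValueError:
        # Not an integer: treat the input as a symbol or a name.
        record = by_text.get(s.lower())
        return f"protons: {record[0]}" if record else "unknown element"
    record = by_num.get(number)
    return f"{record[1]} {record[2]}" if record else "unknown element"

by_num, by_text = load_elements(io.StringIO(SAMPLE))
print(describe("2", by_num, by_text))       # He Helium
print(describe("helium", by_num, by_text))  # protons: 2
```

To read the real file instead, pass `open('element.txt', newline='')` to `load_elements`.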

Upvotes: 1

HelixAchaos

Reputation: 131

You are right that it is tricky. However, I suggest you just make three dictionaries, one per kind of key. You certainly can store the data in a 2D list instead, but that would be much harder to build and to query than three dicts. If you like, you can merge the three dicts into one. I personally wouldn't, but the final choice is up to you.

number = {1: ("H", "Hydrogen"), 2: ...}  # keyed by number of protons (atomic number, not weight)
symbol = {"H": (1, "Hydrogen"), "He": ...}
name = {"Hydrogen": (1, "H"), "Helium": ...}
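For reference, a minimal sketch of what merging into one dict could look like. The names `rows`, `by_number`, `by_symbol`, `by_name` and `lookup` are chosen for this example only. The merge is safe because an int key can never equal a str key, and no element's symbol is spelled the same as an element's name.

```python
# Two sample rows from the question's file, as (number, symbol, name).
rows = [(1, "H", "Hydrogen"), (2, "He", "Helium")]

# The three separate dicts, built with comprehensions.
by_number = {num: (sym, nm) for num, sym, nm in rows}
by_symbol = {sym: (num, nm) for num, sym, nm in rows}
by_name = {nm: (num, sym) for num, sym, nm in rows}

# One merged dict holding all three kinds of key.
lookup = {**by_number, **by_symbol, **by_name}

print(lookup[2])         # ('He', 'Helium')
print(lookup["He"])      # (2, 'Helium')
print(lookup["Helium"])  # (2, 'He')
```

Note that `input()` always returns a string, so numeric input still has to be converted with `int()` before it can be used as a key.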

If you want to get into databases and query languages, I suggest looking into sqlite3. It ships with Python's standard library and is well documented.
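A rough sketch of the sqlite3 route, assuming an in-memory database and a hypothetical table called `elements`; the helper `find` and the column names are made up for illustration, and the rows are the four from the question.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE elements (number INTEGER PRIMARY KEY, symbol TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO elements VALUES (?, ?, ?)",
    [(1, "H", "Hydrogen"), (2, "He", "Helium"),
     (3, "Li", "Lithium"), (4, "Be", "Beryllium")],
)

def find(query):
    """Return the matching (number, symbol, name) row, or None."""
    query = query.strip()
    if query.isdigit():
        # Digits only: look up by number of protons.
        cur = conn.execute(
            "SELECT number, symbol, name FROM elements WHERE number = ?",
            (int(query),),
        )
    else:
        # Otherwise match either the symbol or the name.
        cur = conn.execute(
            "SELECT number, symbol, name FROM elements "
            "WHERE symbol = ? OR name = ?",
            (query, query),
        )
    return cur.fetchone()

print(find("4"))   # (4, 'Be', 'Beryllium')
print(find("Li"))  # (3, 'Li', 'Lithium')
```

The `?` placeholders let sqlite3 bind the user's input as a parameter instead of pasting it into the SQL string.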

Upvotes: 0
