gonnaflynow720

Reputation: 29

Python: Splitting numbers from text and then summing them

I have a text file formatted like below, which I am reading into Python:

    hammer#9.95
    saw#20.15
    shovel#35.40

Ultimately I want a dynamic solution that replaces the '#' symbol with a '$' symbol, then adds up the values in the text file and counts the number of items in it. I came up with this through some trial and error, but the total and the item count are hard-coded, so it doesn't adapt to changes in the text file:

# display header line for items list
print('{0: <10}'.format('Item'), '{0: >17}'.format('Cost'), sep = '' )

# add your remaining code below
with open('invoice.txt','rt') as infile:
    for line in infile:
        print("{:<21} {}".format(line.strip().split('#')[0],"$"+line.strip().split("#")[1]))

print(' ')
str1 = 'Total cost\t' +'      ' + '$65.50'
print(str1)

str2 = 'Number of tools\t' + '           ' +'3'
print(str2)

Any suggestions? Thanks ahead of time for reading.

Upvotes: 3

Views: 1615

Answers (5)

wiesion

Reputation: 2455

Since refreshing my Python skills was long overdue, I had some fun with your question and came up with a parser class:

import re
from contextlib import contextmanager


class Parser(object):

    def __init__(self, file_path, regex):
        self.file_path = file_path
        # re.LOCALE is only valid for bytes patterns; str patterns are Unicode-aware by default
        self.pattern = re.compile(regex, flags=re.IGNORECASE)
        self.values = []
        self.parse()

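    @contextmanager
    def read_lines(self):
        # Read the file first, then yield exactly once so the context manager
        # still works when the file is missing (it then yields an empty list).
        try:
            with open(self.file_path, "r", encoding="utf-8") as f:
                lines = f.readlines()
        except FileNotFoundError:
            print("Couldn't open file: ", self.file_path)
            lines = []
        yield lines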

    def parse_line(self, line):
        try:
            return self.pattern.match(line).groupdict()
        except AttributeError:
            return None

    def parse(self):
        with self.read_lines() as lines:
            self.values = [value for value in map(self.parse_line, lines) if value]

    def get_values(self, converters=None):
        if not converters:
            return self.values
        new_values = []
        for value in self.values:
            new_value = {}
            for key in value:
                if key in converters:
                    new_value[key] = converters[key](value[key])
                else:
                    new_value[key] = value[key]
            new_values.append(new_value)
        return new_values

This class takes a file path and a regex string, which is compiled into a regex object. On instantiation it reads and parses the contents of the file, ignoring lines that do not match the regex (empty lines, for example).

I also added a get_values method which can apply converters to named groups from the regex. In the example, it converts the named group price of every line into a float value:

parser = Parser(r"fully_qualified_file_path.txt", r"\s*(?P<name>[\w\s]+)\#(?P<price>[\d\.]+)")

total = 0
count = 0
for line in parser.get_values({'price': float}):
    total += line['price']
    count += 1
    print('Item: {name}, Price: ${price}'.format(**line))

print()
print('Item count:', count)
print('Total:', "${0}".format(total))

Result

Item: hammer, Price: $9.95
Item: saw, Price: $20.15
Item: shovel, Price: $35.4

Item count: 3
Total: $65.5

But coding fun aside, I suggest you try to get clean CSV-like data and handle it properly with the csv module.
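
As a rough sketch of that suggestion, the csv module can read the '#'-separated lines directly when given delimiter='#'. The io.StringIO object below is only a stand-in for the opened file (an assumption so the example runs on its own), and the column widths are arbitrary choices:

```python
import csv
import io

# Stand-in for open('invoice.txt', newline=''); sample data taken from the question.
sample = io.StringIO("hammer#9.95\nsaw#20.15\nshovel#35.40\n")

rows = []
for record in csv.reader(sample, delimiter='#'):
    if len(record) != 2:  # skip blank or malformed lines
        continue
    name, price = record
    rows.append((name, float(price)))

total = sum(price for _, price in rows)
count = len(rows)

# Print each item, then the computed (not hard-coded) summary lines.
for name, price in rows:
    print('{:<16}{:>11}'.format(name, '${:.2f}'.format(price)))
print('{:<16}{:>11}'.format('Total cost', '${:.2f}'.format(total)))
print('{:<16}{:>11}'.format('Number of tools', count))
```

With a real file, replacing the StringIO object with open('invoice.txt', newline='') inside a with block should be the only change needed.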

Upvotes: 0

Pedro Lobito

Reputation: 99001

You can use:

total_price, total_products = 0, 0
with open('invoice.txt') as infile:
    for line in infile:
        if "#" not in line:  # skip blank or malformed lines
            continue
        total_price += float(line.split("#")[1])
        total_products += 1
print("Total Price\n${}".format(total_price))
print("Number of tools\n{}".format(total_products))

Total Price
$65.5
Number of tools
3

We have to cast the price (line.split("#")[1]), which is a string, to a float; otherwise we get a TypeError when we try to add it to total_price.

float(line.split("#")[1])
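
A small sketch of that failure and the fix, using one sample line from the question (the variable names price_text and error are just for illustration):

```python
total_price = 0
error = None
price_text = "hammer#9.95".split("#")[1]  # '9.95', still a string

try:
    total_price += price_text  # adding a str to an int is not defined
except TypeError as exc:
    error = exc
    print("TypeError:", exc)

total_price += float(price_text)  # works once the string is cast to float
print(total_price)
```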

Upvotes: 0

jedwards

Reputation: 30240

What about:

items = {}
with open("temp.txt") as f:
    for line in f:
        item,cost = line.split('#')
        cost = float(cost)
        items[item] = cost

Now, you have a dictionary, keyed by item "name" (so they need to be unique in your file, otherwise a dictionary isn't the best structure here) and each value is a float corresponding to the parsed cost.

# Print items and cost
print(items.items())
#> dict_items([('hammer', 9.95), ('saw', 20.15), ('shovel', 35.4)])

# Print Number of Items
print(len(items))
#> 3

# Print Total Cost (unformatted)
print(sum(items.values()))
#> 65.5

# Print Total Cost (formatted)
print("$%.02f" % sum(items.values()))
#> $65.50

There are some corner cases you may want to handle to make this solution more robust. For example, the item "name" might include a # sign (i.e. there is more than one # per line), or a value might not be formatted in a way that float can parse.
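
One possible way to guard against those cases is sketched below. It assumes the cost is always the text after the last # on the line, and that unparseable lines should simply be skipped; parse_line and the extra sample lines are hypothetical, not part of the original file:

```python
def parse_line(line):
    """Return (name, cost) for a 'name#cost' line, or None if it cannot be parsed."""
    # rpartition splits on the LAST '#', so extra '#' signs stay in the name.
    name, sep, cost = line.strip().rpartition('#')
    if not sep or not name:
        return None  # no '#' at all, or nothing before it
    try:
        return name, float(cost)
    except ValueError:
        return None  # cost is not a valid number


# Sample lines, including a name with '#', a bad cost, and a blank line.
lines = ['hammer#9.95', 'claw#hammer#12.50', 'saw#abc', '', 'shovel#35.40']
parsed = [p for p in map(parse_line, lines) if p is not None]
print(parsed)
```

Whether to skip bad lines silently, log them, or raise an error depends on how much you trust the input file.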

Upvotes: 1

Dan

Reputation: 1884

prices = []
with open(...) as infile:
    for line in infile:
        price = line.split('#')[-1]
        prices.append(float(price))
result = sum(prices)

Upvotes: 1

YOLO

Reputation: 21749

You can do it the following way:

d = ['hammer#9.95', 'saw#20.15', 'shovel#35.40']

## replace hash
values = []
items = set()
for line in d:
    line = line.replace('#', '$')
    values.append(line.split('$')[1])
    items.add(line.split('$')[0])

## sum values
sum(map(float, values))
65.5

## count items
len(items)
3

Explanation:

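  1. To count items, we used a set so that each item name is counted only once. If you want to count every line, including repeats, use a list instead.
  2. We calculated the sum by splitting each line on the dollar sign, collecting the price strings in a list, and converting them to floats.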
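
To make the set-versus-list point concrete, here is a short sketch with a hypothetical list in which one item appears twice:

```python
# Hypothetical data: 'hammer' is listed twice.
d = ['hammer#9.95', 'saw#20.15', 'hammer#9.95']

names = [line.split('#')[0] for line in d]

unique_count = len(set(names))  # a set keeps each name once
all_count = len(names)          # a list keeps every occurrence

print(unique_count, all_count)
```

The values list in the answer is not deduplicated, so with repeated items the sum would cover every line while the set-based count would not.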

Upvotes: 1
