Reputation: 2117
I am using Python to make lists. Should be easy! I don't know why I'm struggling so much with this.
I have some data that I am counting up by date. There is a date column like this:
Created on
5/1/2015
5/1/2015
6/1/2015
6/1/2015
7/1/2015
8/1/2015
8/1/2015
8/1/2015
In this case, there would be 2 Units created in May, 2 Units in June, 1 Unit in July, and 3 Units in August.
I want to reflect that in a list that starts in April ([April counts, May counts, June counts, etc...]):
NumberofUnits = [0, 2, 2, 1, 3, 0, 0, 0, 0, 0, 0, 0]
I have a nice list of months
monthnumbers
Out[69]: [8, 5, 6, 7]
I also have a list with the unitcounts = [3, 2, 2, 1]
I got this using value_counts.
So it's a matter of making a list of zeroes and replacing parts with the unitcount list, right?
For some reason all of my tries are either not making a list or making a list with one zero in it.
NumberofUnits = [0]*12
for i in range(0,len(monthnumbers)):
    if monthnumbers[i] == (i+4):  # This part is wrong
        NumberofUnits.append(unitcounts[i])
        s = slice(0,i+1)
I also tried
NumberofUnits = []
for i in range(0, 12):
    if len(NumberofUnits) > i:
        unitcounts[i:]+unitcounts[:i]
        NumberofUnits.append(unitcounts[i])
        s = slice(0,i+1)
    else:
        unitcounts.append(0)
But this doesn't account for the fact that in this round my data starts with May, so I need a zero in the first slot.
Upvotes: 3
Views: 136
Reputation: 46759
The following is a more "old school" approach. It assumes your dates are in the first column of your CSV file, i.e. cols[0]. It validates the input dates and raises a ValueError exception if a date is not valid or if it is older than the previous one. It also copes with input that skips one or more months.
import csv
from datetime import datetime

with open("input.csv", "r") as f_input:
    csv_input = csv.reader(f_input)
    header = next(csv_input)  # skip the header row
    last_date = datetime(year=2015, month=4, day=1)  # counting starts in April 2015
    cur_total = 0
    units_by_month = []
    for cols in csv_input:
        cur_date = datetime.strptime(cols[0], "%m/%d/%Y")
        if cur_date.month == last_date.month:
            cur_total += 1
        elif cur_date < last_date:
            raise ValueError("Date is older")
        else:
            # Whole months skipped between the previous row and this one
            extra_months = ((cur_date.month + 12 - last_date.month) if cur_date.year - last_date.year else (cur_date.month - last_date.month)) - 1
            units_by_month.extend([cur_total] + ([0] * extra_months))
            last_date = cur_date
            cur_total = 1
    # Flush the final month, then pad with zeros up to 12 entries
    units_by_month.append(cur_total)
    units_by_month.extend([0] * (12 - len(units_by_month)))
print(units_by_month)
So for your input it will give the following output:
[0, 2, 2, 1, 3, 0, 0, 0, 0, 0, 0, 0]
If one extra entry, 3/1/2016, were added, the following would be displayed:
[0, 2, 2, 1, 3, 0, 0, 0, 0, 0, 0, 1]
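The extra_months calculation above only distinguishes "same year" from "different year", so it assumes consecutive rows are less than a year apart. A minimal sketch of a more general gap calculation follows; months_skipped is a hypothetical helper name of mine, not something from the code above:

```python
from datetime import datetime


def months_skipped(last_date, cur_date):
    """Whole calendar months strictly between two dates (hypothetical helper)."""
    # Turn each date into a running month index (year * 12 + month) and subtract.
    return (cur_date.year - last_date.year) * 12 + (cur_date.month - last_date.month) - 1


# August 2015 -> March 2016 skips September through February.
print(months_skipped(datetime(2015, 8, 1), datetime(2016, 3, 1)))  # 6
# May 2015 -> June 2015 skips nothing.
print(months_skipped(datetime(2015, 5, 1), datetime(2015, 6, 1)))  # 0
```

For gaps shorter than a year this should agree with the expression used for extra_months.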
Upvotes: 0
Reputation: 60065
Why not just:
counter = [0]*12
for m in monthnumbers:  # monthnumbers must hold one entry per row, not just the distinct months
    counter[(m - 4) % 12] += 1
print(counter)
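If monthnumbers holds only the distinct months (as in the question's value_counts output) rather than one entry per row, the same index trick can be paired with the counts. A minimal sketch, assuming monthnumbers and unitcounts line up position by position; the sample values are my assumption based on the question's dates, and START_MONTH is a name I made up:

```python
# Assumed sample data: distinct months and their counts, aligned by position.
monthnumbers = [8, 5, 6, 7]
unitcounts = [3, 2, 2, 1]

START_MONTH = 4  # hypothetical name: the list should begin with April

counter = [0] * 12
for month, count in zip(monthnumbers, unitcounts):
    # (month - START_MONTH) % 12 maps April -> 0, May -> 1, ..., March -> 11
    counter[(month - START_MONTH) % 12] += count

print(counter)  # [0, 2, 2, 1, 3, 0, 0, 0, 0, 0, 0, 0]
```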
Upvotes: 1
Reputation: 330193
You can count entries using collections.Counter:
from collections import Counter
lines = ['5/1/2015', '5/1/2015', ..., '8/1/2015']
month_numbers = [int(line.split("/")[0]) for line in lines]
cnt = Counter(month_numbers)
If you already have counts you can replace above with
from collections import defaultdict
cnt = defaultdict(int, zip(monthnumbers, unitcounts))
and simply map to entries with (month_number - offset) mod 12, where offset is the month the output should start with (4 for April):
[x[1] for x in sorted([((i - offset) % 12, cnt[i]) for i in range(1, 13)])]
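To see the pieces together, here is a small runnable sketch. Both offset = 4 (April first) and the sample lines are my assumptions, taken from the question's data. The last step is written as a plain list comprehension over the 12 output slots, which should give the same result as the sorted version above:

```python
from collections import Counter

# Assumed sample input: the eight dates from the question.
lines = ['5/1/2015', '5/1/2015', '6/1/2015', '6/1/2015',
         '7/1/2015', '8/1/2015', '8/1/2015', '8/1/2015']
offset = 4  # assumption: the output list starts with April

month_numbers = [int(line.split("/")[0]) for line in lines]
cnt = Counter(month_numbers)  # months that never appear count as 0

# Walk the 12 slots in output order and look up the month each one stands for.
number_of_units = [cnt[(slot + offset - 1) % 12 + 1] for slot in range(12)]
print(number_of_units)  # [0, 2, 2, 1, 3, 0, 0, 0, 0, 0, 0, 0]
```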
Upvotes: 1
Reputation: 180441
If the data is coming from a file or any iterable, you can use an OrderedDict, creating the keys in order starting from 4 (April), then increment the count for each month you encounter. Finally, print the list of values, which will be in the required order:
from collections import OrderedDict
od = OrderedDict((i % 12 or 12, 0) for i in range(4, 16))
# -> OrderedDict([(4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 0), (10, 0), (11, 0), (12, 0), (1, 0), (2, 0), (3, 0)])
with open("in.txt") as f:
    for line in f:
        mn = int(line.split("/",1)[0])
        od.setdefault(mn, 0)
        od[mn] += 1
print(list(od.values()))
[0, 2, 2, 1, 3, 0, 0, 0, 0, 0, 0, 0]
Unless you associate each entry with its month when you actually parse it, as in the logic above, it is going to be a lot harder to figure out which count belongs to which month. Creating the association straight away is a much simpler approach.
If you have a list, tuple, etc. of values, the logic is exactly the same:
for dte in list_of_dates:
    mn = int(dte.split("/",1)[0])
    od.setdefault(mn, 0)
    od[mn] += 1
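On Python 3.7 and later a plain dict also keeps insertion order, so the same idea should work without OrderedDict. A minimal sketch under that assumption; the contents of list_of_dates are sample data I took from the question, and month_counts is a name of my own:

```python
# Assumed sample input: the dates from the question as a plain list.
list_of_dates = ['5/1/2015', '5/1/2015', '6/1/2015', '6/1/2015',
                 '7/1/2015', '8/1/2015', '8/1/2015', '8/1/2015']

# Keys in output order: 4, 5, ..., 12, 1, 2, 3, each starting at 0.
month_counts = {i % 12 or 12: 0 for i in range(4, 16)}

for dte in list_of_dates:
    mn = int(dte.split("/", 1)[0])
    month_counts[mn] += 1  # every month 1-12 is already a key

result = list(month_counts.values())
print(result)  # [0, 2, 2, 1, 3, 0, 0, 0, 0, 0, 0, 0]
```

Because all twelve keys are created up front, the setdefault call is not needed here.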
Upvotes: 1