Reputation: 845
I have a list of 30 numbers, both positive and negative (the actual data has 35040 values):
dev = [-21,-22,-33,-55,-454,65,48,-516,614,6,2,-64,-64,-87,6,45,87,15,11,3,-34,-6,-68,-959,-653,24,658,68,9,-2181]
I have written a program that counts runs of consecutive positive or negative numbers after splitting the "dev" list into 3 equal sets. The set size is 10, so the 3 sets cover indexes 0-9, 10-19 and 20-29. The program is:
dev = [
    -21, -22, -33, -55, -454, 65, 48, -516, 614, 6,
    2, -64, -64, -87, 6, 45, 87, 15, 11, 3,
    -34, -6, -68, -959, -653, 24, 658, 68, 9, -2181
]
nths = 10
sequential_limit = 3
sequential_count = sequential_finds = 0
indexer = sequential_limit - 1
sequential_list = [0 for _ in range(indexer)]
skip = 0
for index, num in enumerate(dev[indexer:], indexer):
    result = 0
    if index % nths == 0:
        sequential_count = sequential_finds = 0
        skip = indexer
    if skip:
        skip -= 1
    else:
        negative = sum(1 for next_num in dev[index - indexer:index + 1] if next_num < 0)
        positive = sum(1 for next_num in dev[index - indexer:index + 1] if next_num >= 0)
        if sequential_limit in (positive, negative):
            sequential_finds += 1
            sequential_count = 0
            skip = indexer
            result = sequential_finds
    sequential_list.append(result)
print(sequential_list)
output:
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 3, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0]
What I want is this: if positive or negative numbers keep occurring after 3 consecutive terms, "sequential_count" (the main counter) should not start counting from the next index. It should start counting again from the last index of the run that was just counted. With this change, the desired output is:
[0, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 3, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0]
Upvotes: 0
Views: 88
Reputation: 5286
First of all, variable names should be descriptive. For example, dataset, despite being generic, is better than dev (more context would be needed for a better variable name that is not as generic as dataset), and size or chunk_size is more descriptive than nths. I will use what IMHO are better variable names in my solution but will keep the ones you provided in comments.
0 is not a positive number, nor a negative one. I will consider 0 as a special sign, but the changes needed to include it with the positive numbers should be easy to do.
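For reference, the question's code counts 0 with the positives (it tests next_num >= 0). A sign comparison that does the same could look like the sketch below. The function name equal_sign_zero_positive is hypothetical and not part of either program in this thread.

```python
def equal_sign_zero_positive(a: int, b: int) -> bool:
    """Hypothetical variant: treat 0 as positive, like the question's `>= 0` test."""
    # Two numbers share a sign when both are >= 0 or both are < 0
    return (a >= 0) == (b >= 0)


print(equal_sign_zero_positive(0, 5))    # True: 0 is grouped with the positives
print(equal_sign_zero_positive(0, -5))   # False
print(equal_sign_zero_positive(-1, -5))  # True
```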
Try to split functionality into functions, it will make it much more readable.
Also, you are creating multiple lists every time to hold the values and using sum and if instead of having a circular buffer with the last values. A circular buffer in Python is implemented with a collections.deque with maxlen set. This way, when you insert the 4th value the 1st will be removed, and so on. collections.deque.maxlen is read-only: it needs to be provided at construction time and cannot be changed afterwards.
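A small illustration of that behaviour, using the first four values of the dataset. The variable name window is only for this demo.

```python
from collections import deque

# maxlen can only be given here, at construction time
window = deque(maxlen=3)

for value in [-21, -22, -33, -55]:
    window.append(value)
    print(list(window))
# prints:
# [-21]
# [-21, -22]
# [-21, -22, -33]
# [-22, -33, -55]   <- the 4th append pushed the 1st value out

try:
    window.maxlen = 5  # maxlen is a read-only attribute
except AttributeError:
    print("maxlen cannot be changed after construction")
```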
I added 3 modes: mode 0 doesn't skip any item, mode 1 skips values that already belong to another same-sign group, as your code did, and mode 2 skips them except for the last member of each group, as you requested. If you only want one mode, you can replace the whole if ... elif ... elif ... block with the lines inside the mode you want and also remove the mode variable declaration at the top.
from collections import deque
from typing import Sequence

dataset = [  # dev
    -21, -22, -33, -55, -454, 65, 48, -516, 614, 6,
    2, -64, -64, -87, 6, 45, 87, 15, 11, 3,
    -34, -6, -68, -959, -653, 24, 658, 68, 9, -2181
]
chunk_size = 10  # nths
sequential_limit = 3
mode = 2

def equal_sign(a: int, b: int) -> bool:
    '''equal_sign returns true if both input parameters have the same sign'''
    if a == 0 and b == 0:
        return True
    if a > 0 and b > 0:
        return True
    if a < 0 and b < 0:
        return True
    return False

def same_sign(items: Sequence[int]) -> bool:
    '''same_sign returns true if every item shares the same sign'''
    # Special case for lengths 0 or 1 as they will never have different signs
    if len(items) < 2:
        return True
    # Check every item with the first one
    for i in range(1, len(items)):
        # If any of the items is different we can return False
        if not equal_sign(items[0], items[i]):
            return False
    # If we reach here they all have the same sign
    return True

# The outer loop will be in charge of splitting the different chunks in the dataset.
result = []
chunk_start = 0
# values is a circular buffer that will hold the previous and current values
values = deque(maxlen=sequential_limit)
while chunk_start < len(dataset):
    # Chunks end at the specified size or the maximum dataset length, preventing IndexErrors
    chunk_end = min(chunk_start + chunk_size, len(dataset))
    sequential_finds = 0
    values.clear()
    for value in dataset[chunk_start:chunk_end]:
        # Insert our current value in the circular buffer
        values.append(value)
        # If we don't have enough values skip this iteration
        if len(values) != values.maxlen:
            result.append(0)
            continue
        # Let's check if there is a sequential count
        if same_sign(values):
            sequential_finds += 1
            result.append(sequential_finds)
            if mode == 0:
                # Keep the buffer as is, so overlapping runs are counted
                pass
            elif mode == 1:
                # Start over with an empty buffer
                values.clear()
            elif mode == 2:
                # Start over, but keep the last value of the run
                values.clear()
                values.append(value)
        else:
            result.append(0)
    # Update the chunk start position for the next iteration
    chunk_start = chunk_end

print(result)
mode = 0:
[0, 0, 1, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 3, 4, 5, 0, 0, 1, 2, 3, 0, 0, 4, 5, 0]
mode = 1:
[0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 3, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0]
mode = 2:
[0, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 3, 0, 0, 0, 1, 0, 2, 0, 0, 3, 0, 0]
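To compare the three modes without editing the mode variable by hand, the same loop can be wrapped in a function. This is only a sketch: the names sign and count_runs and their parameters are hypothetical, and the same-sign test is written with a set of signs instead of the helper functions, which should be equivalent for this data.

```python
from collections import deque
from typing import List, Sequence


def sign(n: int) -> int:
    """Return -1, 0 or 1; 0 stays its own sign, as in the answer."""
    return (n > 0) - (n < 0)


def count_runs(data: Sequence[int], chunk_size: int, limit: int, mode: int) -> List[int]:
    """Hypothetical wrapper around the chunked loop, with the mode as a parameter."""
    result: List[int] = []
    for start in range(0, len(data), chunk_size):
        window: deque = deque(maxlen=limit)  # fresh buffer for every chunk
        finds = 0
        for value in data[start:start + chunk_size]:
            window.append(value)
            # A run is found when the window is full and holds a single sign
            if len(window) == limit and len({sign(v) for v in window}) == 1:
                finds += 1
                result.append(finds)
                if mode in (1, 2):
                    window.clear()
                if mode == 2:
                    window.append(value)  # the last value may start the next run
            else:
                result.append(0)
    return result


dataset = [
    -21, -22, -33, -55, -454, 65, 48, -516, 614, 6,
    2, -64, -64, -87, 6, 45, 87, 15, 11, 3,
    -34, -6, -68, -959, -653, 24, 658, 68, 9, -2181
]
for mode in (0, 1, 2):
    print(mode, count_runs(dataset, 10, 3, mode))
```

Slicing with data[start:start + chunk_size] also handles a final chunk that is shorter than chunk_size, which matters if the real data length is not a multiple of the chunk size.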
Upvotes: 1