StrayinGallifreyan

Reputation: 13

Python way to find values in one list that fall in-between the values of another list

I am working with some research data and trying to find a good Pythonic way to determine whether the values in one list fall between the values of another list. Each line of each data file contains two sequential lists of years separated by a pipe character, with spaces between the years: on the left, the years in which a significant value was found; on the right, the years in which a tied value was found.

Example: 1950 1955 1960 1977|1957 1958 1959 1966 1970 1975 1980 2015

So, in the example above, the year 1950 has no tie, but the year 1955 was tied by the years 1957, 1958 and 1959. The year 1960 was tied by 1966, 1970 and 1975. The year 1977 was tied by 1980 and 2015.

These lists are created dynamically from the evaluation of changing data, so on any given iteration the list on the left or right side of the pipe character may have more or fewer items.

When processing these lists, the years on the left are given a value of one, but the years on the right must be assigned a weighted value based on how often they occur as a tie with the year in the list on the left side of the pipe character.

The weighted value assigned to the tied years on the right needs to decrease in a reciprocal fashion. For example, the year 1957 would be assigned a weighted value of 0.5 (1/2), the year 1958 a weighted value of 0.33 (1/3), and the year 1959 only 0.25 (1/4). The next range of tied years would then be those greater than 1960 and less than 1977; starting with 1966, the weighted values would start again at 0.5.
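To make the weighting rule concrete, here is a minimal sketch of the reciprocal sequence on its own. The function name `reciprocal_weights` is only illustrative, and it assumes the n-th tied year in a range gets 1/(n+1), as in the 1957/1958/1959 example:

```python
def reciprocal_weights(count):
    """Return the weights for `count` consecutive tied years: 1/2, 1/3, 1/4, ..."""
    # 1.0 keeps the division floating-point on Python 2.7 as well as Python 3.
    return [1.0 / (position + 2) for position in range(count)]


# Three tied years (1957, 1958, 1959) get roughly 0.5, 0.33 and 0.25.
print(reciprocal_weights(3))
```

Each new range of tied years would simply call this again, which restarts the sequence at 0.5.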

I looked on Stack Overflow and found something similar to what I am trying to do, but there is no "between()" function in Python:

Finding values in one vector that are between the values in another vector

Is there a Pythonic way to make such a comparison, and dynamically assign values to the tied years on the right based on how they fall in-between the significant years on the left just using Python 2.7.5 and no additional add-on libraries?

Upvotes: 1

Views: 113

Answers (1)

jpp

Reputation: 164673

I believe your problem can be decomposed into 2 steps:

  • Calculate ranges for each year on the left.
  • Calculate weights for each year on the right.

Python's range built-in and list / dictionary comprehensions should be sufficient.

Below is an example implementation. I have included intermediate output to help you understand what is happening at each stage.

try:
    from itertools import zip_longest  # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest  # Python 2.7

mystr = '1950 1955 1960 1977|1957 1958 1959 1966 1970 1975 1980 2015'

lsts = [list(map(int, x.split())) for x in mystr.split('|')]

# [[1950, 1955, 1960, 1977], [1957, 1958, 1959, 1966, 1970, 1975, 1980, 2015]]

def ranger(x1, x2, lst):
    # Same test as "i in range(x1, x2)", without building a list on Python 2
    return [i for i in lst if x1 <= i < x2]

d = {i: ranger(i, j, lsts[1]) for i, j in \
     zip_longest(lsts[0], lsts[0][1:], fillvalue=lsts[1][-1]+1)}

# {1950: [], 1955: [1957, 1958, 1959], 1960: [1966, 1970, 1975], 1977: [1980, 2015]}

# 1.0 forces float division, which Python 2.7 needs (1/2 == 0 there)
w = {k: [1.0 / (i + 2) for i in range(len(v))] for k, v in d.items()}

# {1950: [],
#  1955: [0.5, 0.3333333333333333, 0.25],
#  1960: [0.5, 0.3333333333333333, 0.25],
#  1977: [0.5, 0.3333333333333333]}
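If what you ultimately need is a weight per tied year rather than per significant year, the two steps can also be folded into a single pass with the standard-library `bisect` module. This is only a sketch under a few assumptions: the helper name `tie_weights` is made up, both lists are sorted in ascending order, every line contains exactly one pipe, tied years after the last significant year belong to that last year (as in the output above), and tied years before the first significant year are skipped.

```python
from bisect import bisect_right


def tie_weights(line):
    """Map each tied year to its reciprocal weight (illustrative helper)."""
    left, right = [[int(y) for y in part.split()] for part in line.split('|')]
    weights = {}
    counts = {}  # ties seen so far, keyed by index into the left-hand list
    for year in right:
        # Index of the closest significant year that is <= the tied year.
        idx = bisect_right(left, year) - 1
        if idx < 0:
            continue  # assumption: ignore ties before the first significant year
        counts[idx] = counts.get(idx, 0) + 1
        # First tie in a range gets 1/2, the second 1/3, and so on.
        weights[year] = 1.0 / (counts[idx] + 1)
    return weights


mystr = '1950 1955 1960 1977|1957 1958 1959 1966 1970 1975 1980 2015'
for year, weight in sorted(tie_weights(mystr).items()):
    print(year, round(weight, 2))
```

This runs on Python 2.7 and Python 3 without add-on libraries, and each lookup is a binary search instead of a scan over the whole right-hand list.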

Upvotes: 1
