Reputation: 4807
I have data in a .csv file called 'Max.csv':
Valid Date MAX
1/1/1995 51
1/2/1995 45
1/3/1995 48
1/4/1995 45
Another csv called 'Min.csv' looks like:
Valid Date MIN
1/2/1995 33
1/4/1995 31
1/5/1995 30
1/6/1995 39
I want to generate two dictionaries (or any other suggested data structure) so that I have two separate variables, Max and Min, in Python, containing respectively:
Valid Date MAX
1/2/1995 45
1/4/1995 45
Valid Date MIN
1/2/1995 33
1/4/1995 31
i.e. select the elements from Max and Min so that only the common elements are output.
I am thinking about using numpy.intersect1d, but that means I have to separately compare the Max and Min first on date column, find the index of common dates and then grab the second columns for Max and Min. This appears too complicated and I feel there are smarter ways to intersect two curves Max and Min.
Upvotes: 1
Views: 1019
Reputation: 10759
You mention that:
I have to separately compare the Max and Min first on date column, find the index of common dates and then grab the second columns for Max and Min. This appears too complicated...
Indeed this is fundamentally what you need to do, one way or the other; but using the numpy_indexed package (disclaimer: I am its author), this isn't complicated in the slightest:
import numpy_indexed as npi
common_dates = npi.intersection(min_dates, max_dates)
print(max_values[npi.indices(max_dates, common_dates)])
print(min_values[npi.indices(min_dates, common_dates)])
Note that this solution is fully vectorized (contains no loops on the python-level), and as such is bound to be much faster than the currently accepted answer.
Note 2: this assumes the dates within each column are unique; if they are not, you should replace 'npi.indices' with 'npi.in_'.
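For comparison, the "find the common dates, then index the value columns" step can also be sketched with plain NumPy, without the numpy_indexed package. This is only an illustrative sketch, not the method from the answer above: the array names are hypothetical, the sample data is copied from the question, and it assumes the dates are unique within each file. It relies on the return_indices option of numpy.intersect1d.

```python
import numpy as np

# Hypothetical arrays holding the two columns of each file (data from the question).
max_dates = np.array(["1/1/1995", "1/2/1995", "1/3/1995", "1/4/1995"])
max_values = np.array([51, 45, 48, 45])
min_dates = np.array(["1/2/1995", "1/4/1995", "1/5/1995", "1/6/1995"])
min_values = np.array([33, 31, 30, 39])

# return_indices=True also returns the positions of the common
# elements in each input array (first occurrence of each).
common_dates, max_idx, min_idx = np.intersect1d(
    max_dates, min_dates, return_indices=True
)

common_max = max_values[max_idx]  # MAX values on the shared dates
common_min = min_values[min_idx]  # MIN values on the shared dates

print(common_dates)
print(common_max)
print(common_min)
```

Note that the dates here are plain strings, so the common dates come back sorted lexicographically rather than chronologically; converting them to a date type first would be needed if chronological order matters.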
Upvotes: 2
Reputation: 669
The set() builtin should be enough, as follows:
>>> max_data = {"1/1/1995":"51", "1/2/1995":"45", "1/3/1995":"48", "1/4/1995":"45"}
>>> min_data = {"1/2/1995":"33", "1/4/1995":"31", "1/5/1995":"30", "1/6/1995":"39"}
>>> # names chosen to avoid shadowing the max() and min() builtins
>>> common = sorted(set(max_data) & set(min_data))
>>> {x: max_data[x] for x in common}
{'1/2/1995': '45', '1/4/1995': '45'}
>>> {x: min_data[x] for x in common}
{'1/2/1995': '33', '1/4/1995': '31'}
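To connect this with the files from the question, here is a minimal sketch that builds the two dictionaries from CSV data using only the standard library and then keeps the shared dates. It assumes the files are comma-delimited with a single header row, which the question does not actually confirm; the helper name read_series is hypothetical, and io.StringIO stands in for the real 'Max.csv' and 'Min.csv' files so the example is self-contained.

```python
import csv
import io

# Stand-ins for the contents of 'Max.csv' and 'Min.csv'
# (assumed comma-delimited; data taken from the question).
MAX_CSV = "Valid Date,MAX\n1/1/1995,51\n1/2/1995,45\n1/3/1995,48\n1/4/1995,45\n"
MIN_CSV = "Valid Date,MIN\n1/2/1995,33\n1/4/1995,31\n1/5/1995,30\n1/6/1995,39\n"


def read_series(handle):
    """Return a {date: value} dict from a two-column CSV with a header row."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return {row[0]: int(row[1]) for row in reader if row}


max_by_date = read_series(io.StringIO(MAX_CSV))
min_by_date = read_series(io.StringIO(MIN_CSV))

# Dictionary key views support set operations directly.
common = sorted(max_by_date.keys() & min_by_date.keys())

common_max = {date: max_by_date[date] for date in common}
common_min = {date: min_by_date[date] for date in common}

print(common_max)
print(common_min)
```

With real files, the io.StringIO(...) calls would be replaced by open('Max.csv', newline='') and open('Min.csv', newline=''). The dates are compared as strings, so both files must use the same date formatting for a match to be found.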
Upvotes: 1