Reputation: 11
I have an array rh_irr of size 744*171*162, where 744 is the time dimension, 171 is latitude and 162 is longitude. I also have arrays lat and lon, both of size 171*162, which contain the corresponding latitude and longitude of each grid cell. I need to extract the data within the bounding box -115.5 to -115.3 longitude and 32.5 to 33.1 latitude.
import numpy as np
# Define the latitude and longitude ranges
lat_min, lat_max = 32.5, 33.1
lon_min, lon_max = -115.5, -115.3
# Find the indices of latitudes and longitudes within the specified range
lat_indices = np.where((lat >= lat_min) & (lat <= lat_max))[0]
lon_indices = np.where((lon >= lon_min) & (lon <= lon_max))[1]
# Extract the data within the specified range
extracted_data = rh_irr[:, lat_indices, :][:, :, lon_indices]
# Print the shape of the extracted data
print("Shape of extracted data:", extracted_data.shape)
The above code starts running and I hope it does what I want, but it crashes. How can I make it more memory efficient? Please suggest a better approach if there is one.
Upvotes: 0
Views: 159
Reputation: 5425
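The np.where() function uses additional memory to store the indices where a condition is True, which by itself shouldn't affect memory usage significantly. Because lat and lon are 2-D, though, np.where(...)[0] and np.where(...)[1] return one index per matching cell, so the same row or column index is repeated many times and the indexed result can become far larger than intended. If you apply a boolean mask directly to the rh_irr array to extract the subset, you avoid those unnecessary memory allocations. You can use the memory-profiler module to monitor memory consumption.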
import numpy as np
# Define the latitude and longitude ranges
lat_min, lat_max = 32.5, 33.1
lon_min, lon_max = -115.5, -115.3
# Create boolean masks for latitude and longitude (both 2-D, size 171*162)
lat_mask = (lat >= lat_min) & (lat <= lat_max)
lon_mask = (lon >= lon_min) & (lon <= lon_max)
# Combine them into one 2-D mask covering the last two axes of rh_irr
mask = lat_mask & lon_mask
# Apply the boolean mask directly on the array
# Result has shape (744, number_of_selected_cells)
selected_data = rh_irr[:, mask]
# Check the shape of selected_data
print("Shape of selected_data:", selected_data.shape)
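If you would rather keep the latitude and longitude axes separate in the result, one possible alternative is to find the smallest rectangle of rows and columns that covers every cell inside the box and take it with plain slices, which gives a view instead of a copy. This is only a sketch: bounding_box_slices is a hypothetical helper name, and the lat, lon and rh_irr arrays below are synthetic stand-ins on a regular grid, not the real data.

```python
import numpy as np


def bounding_box_slices(lat, lon, lat_min, lat_max, lon_min, lon_max):
    """Return (row_slice, col_slice) for the smallest index rectangle
    that contains every grid cell inside the lat/lon box.

    Hypothetical helper: lat and lon are 2-D arrays of the same shape.
    """
    inside = (
        (lat >= lat_min) & (lat <= lat_max)
        & (lon >= lon_min) & (lon <= lon_max)
    )
    # Rows and columns that contain at least one cell inside the box
    rows = np.flatnonzero(inside.any(axis=1))
    cols = np.flatnonzero(inside.any(axis=0))
    if rows.size == 0 or cols.size == 0:
        raise ValueError("no grid cells fall inside the bounding box")
    return slice(rows[0], rows[-1] + 1), slice(cols[0], cols[-1] + 1)


# Synthetic stand-in data (assumption: a regular grid around the box)
lat_1d = np.linspace(31.0, 35.0, 171)
lon_1d = np.linspace(-117.0, -113.0, 162)
lat, lon = np.meshgrid(lat_1d, lon_1d, indexing="ij")  # both (171, 162)
rh_irr = np.zeros((744, 171, 162), dtype=np.float32)

row_sl, col_sl = bounding_box_slices(lat, lon, 32.5, 33.1, -115.5, -115.3)

# Basic slicing returns a view, so no data is copied here
subset = rh_irr[:, row_sl, col_sl]
sub_lat = lat[row_sl, col_sl]
sub_lon = lon[row_sl, col_sl]
print("Shape of subset:", subset.shape)
```

On a curvilinear grid the rectangle may also include some cells that lie outside the box; in that case the boolean mask restricted to the same slices can be used to blank them out. Call subset.copy() if the large array should be freed afterwards.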
Upvotes: 0