Reputation: 57
I’m having trouble using Ruby to pass some tests that make the array too big and return an error.
Solution.rb: failed to allocate memory (NoMemoryError)
I have failed to pass it twice.
The problem is about scheduling meetings. The method receives two parameters in order: a matrix with all the first days that investors can meet in the company, and a matrix with all the last days.
For example:
firstDay = [1,5,10]
lastDay = [4,10,10]
This shows that the first investor will be able to find himself between the days 1..4
, the second between the days 5..10
and the last one in 10..10
.
I need to return the largest number of investors that the company will serve. In this case, all of them can be attended to, the first one on day 1, the second one on day 5, and the last one on day 10.
So far, the code works normally, but with some hidden tests with at least 1000 investors, the error I mentioned earlier appears.
Is there a best practice in Ruby to handle this?
My current code is:
def countMeetings(firstDay, lastDay)
GC::Profiler.enable
GC::Profiler.clear
first = firstDay.sort.first
last = lastDay.sort.last
available = []
#Construct the available days for meetings
firstDay.each_with_index do |d, i|
available.push((firstDay[i]..lastDay[i]).to_a)
end
available = available.flatten.uniq.sort
investors = {}
attended_day = []
attended_investor = []
#Construct a list of investor based in their first and last days
firstDay.each_index do |i|
investors[i+1] = (firstDay[i]..lastDay[i]).to_a
end
for day in available
investors.each do |key, value|
next if attended_investor.include?(key)
if value.include?(day)
next if attended_day.include?(day)
attended_day.push(day)
attended_investor.push(key)
end
end
end
attended_investor.size
end
Using Lazy
as far as I could understand, I escaped the MemoryError
, but I started receiving a runtime error:
Your code was not executed on time. Allowed time: 10s
And my code look like this:
def countMeetings(firstDay, lastDay)
loop_size = firstDay.size
first = firstDay.sort.first
last = lastDay.sort.last
daily_attendance = {}
(first..last).each do |day|
for ind in 0...loop_size
(firstDay[ind]..lastDay[ind]).lazy.each do |investor_day|
next if daily_attendance.has_value?(ind)
if investor_day == day
daily_attendance[day] = ind
end
end
end
end
daily_attendance.size
end
And it went through the cases with few investors. I thought about using multi-thread and the code became the following:
def countMeetings(firstDay, lastDay)
loop_size = firstDay.size
first = firstDay.sort.first
last = lastDay.sort.last
threads = []
daily_attendance = {}
(first..last).lazy.each_slice(25000) do |slice|
slice.each do |day|
threads << Thread.new do
for ind in 0...loop_size
(firstDay[ind]..lastDay[ind]).lazy.each do |investor_day|
next if daily_attendance.has_value?(ind)
if investor_day == day
daily_attendance[day] = ind
end
end
end
end
end
end
threads.each{|t| t.join}
daily_attendance.size
end
Unfortunately, it went back to the MemoryError
.
Upvotes: 1
Views: 261
Reputation: 165218
This can be done without consuming any more memory than the range of days. The key is to avoid Arrays and keep things as Enumerators as much as possible.
First, rather than the awkward pair of Arrays that need to be converted into Ranges, pass in an Enumerable of Ranges. This both simplifies the method, and it allows it to be Lazy if the list of ranges is very large. It could be read from a file, fetched from a database or an API, or generated by another lazy enumerator. This saves you from requiring big arrays.
Here's an example using an Array of Ranges.
p count_meetings([(1..4), (5..10), (10..10)])
Or to demonstrate transforming your firstDay
and lastDay
Arrays into a lazy Enumerable of Ranges...
firstDays = [1,5,10]
lastDays = [4,10,10]
p count_meetings(
firstDays.lazy.zip(lastDays).map { |first,last|
(first..last)
}
)
firstDays.lazy
makes everything that comes after lazy. .zip(lastDays)
iterates through both Arrays in pairs: [1,4], [5,10], and [10,10]. Then we turn them into Ranges. Because it's lazy it will only map them as needed. This avoids making another big Array.
Now that's fixed, all we need to do is iterate over each Range and increment their attendance for the day.
def count_meetings(attendee_ranges)
# Make a Hash whose default values are 0.
daily_attendance = Hash.new(0)
# For each attendee
attendee_ranges.each { |range|
# For each day they will attend, add one to the attendance for that day.
range.each { |day| daily_attendance[day] += 1 }
}
# Get the day/attendance pair with the maximum value, and only return the value.
daily_attendance.max[1]
end
Memory growth is limited to how big the day range is. If the earliest attendee is on day 1 and the last is on day 1000 daily_attendance is just 1000 entries which is a long time for a conference.
And since you've built the whole Hash anyway, why waste it? Write one function that returns the full attendance, and another that extracts the max.
def count_meeting_attendance(attendee_ranges)
daily_attendance = Hash.new(0)
attendee_ranges.each { |range|
range.each { |day| daily_attendance[day] += 1 }
}
return daily_attendance
end
def max_meeting_attendance(*args)
count_meeting_attendance(*args).max[1]
end
Since this is an exercise and you're stuck with the wonky arguments, we can do the same trick and lazily zip firstDays
and lastDays
together and turn them into Ranges.
def count_meeting_attendance(firstDays, lastDays)
attendee_ranges = firstDays.lazy.zip(lastDays).map { |first,last|
(first..last)
}
daily_attendance = Hash.new(0)
attendee_ranges.each { |range|
range.each { |day| daily_attendance[day] += 1 }
}
return daily_attendance
end
Upvotes: 4