Reputation: 7482
I have to design an order-book data structure that lets me query the highest price among orders that have been inserted and not yet deleted. Insert and delete operations are given upfront in a file, where each operation looks like one of the following two:
TIMESTAMP I ID PRICE
TIMESTAMP E ID
where ID is an integer identifier of an order, timestamps are always increasing, and each ID appears exactly twice: once in an insert and once in a delete operation, in that order.
From this list of operations, I need to output the time-weighted average of the highest price.
As an example, let's say we have the following input:
10 I 1 10
20 I 2 13
22 I 3 13
24 E 2
25 E 3
40 E 1
We can say that after the i-th operation, the max is
10, 13, 13, 13, 10
and the time-weighted average is
(10*(20-10) + 13*(22-20) + 13*(24-22) + 13*(25-24) + 10*(40-25)) / (40-10) = 315 / 30 = 10.5
because 10 is the max price between timestamps [10, 20] and [25, 40], while 13 is the max in the rest.
I was thinking of using an unordered_map<ID, price> and a multiset<price> to support:
- insert in O(log(n))
- delete in O(log(n))
- get-max in O(1)
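As a quick illustration of why a multiset fits (my own sketch, not part of the question): duplicates are allowed, the largest element is reachable in O(1) via rbegin(), and erasing through find() removes exactly one instance in O(log n).

```cpp
#include <set>

// Return the highest remaining price after a few inserts and one erase,
// mirroring the operations the order book needs.
double demoMax() {
    std::multiset<double> maxPrices;
    maxPrices.insert(10.0);                 // O(log n) insert
    maxPrices.insert(13.0);
    maxPrices.insert(13.0);                 // duplicates are allowed
    maxPrices.erase(maxPrices.find(13.0));  // erase ONE instance in O(log n)
    return *maxPrices.rbegin();             // current max in O(1)
}
```

Note that `maxPrices.erase(13.0)` would instead remove all elements equal to 13.0.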
Here is an example of what I came up with:
#include <limits>
#include <set>
#include <unordered_map>

struct order {
    unsigned timestamp, id;
    char type;
    double price;
};

std::unordered_map<unsigned, order> M;
std::multiset<double> maxPrices;
double totaltime = 0;
double avg = 0;  // running sum of max * elapsed time; divide by totaltime at the end
double lastTS = 0;

double getHighest() {
    return !maxPrices.empty() ? *maxPrices.rbegin()
                              : std::numeric_limits<double>::quiet_NaN();
}

// Close the time leg [lastTS, timestamp] at the current max.
void update(const unsigned timestamp) {
    const double timeLeg = timestamp - lastTS;
    totaltime += timeLeg;
    avg += timeLeg * getHighest();
    lastTS = timestamp;
}

void insertOrder(const order& ord) {
    if (!maxPrices.empty()) {
        if (ord.price >= getHighest()) {
            // we have a new max price
            update(ord.timestamp);
        }
    } else {  // if there are no orders, this is the max for sure
        lastTS = ord.timestamp;
    }
    M[ord.id] = ord;
    maxPrices.insert(ord.price);
}

void deleteOrder(const unsigned timestamp,
                 const unsigned id_ord) {  // id_ord is assumed to exist in both M and maxPrices
    const order ord = M[id_ord];
    if (ord.price >= getHighest()) {
        update(timestamp);
    }
    maxPrices.erase(maxPrices.find(ord.price));  // erase one instance, not all of them
    M.erase(id_ord);
}
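To sanity-check the expected result on the example above, here is a standalone, simplified re-implementation (my own sketch, not the exact code above) that replays the six operations, accumulating the elapsed leg at the current max on every operation, and divides the accumulated sum by the total time:

```cpp
#include <set>
#include <unordered_map>
#include <vector>

struct op { unsigned ts; char type; unsigned id; double price; };

// Replay a list of operations and return the time-weighted average
// of the running maximum price.
double timeWeightedMax(const std::vector<op>& ops) {
    std::multiset<double> prices;
    std::unordered_map<unsigned, double> priceOf;  // id -> price
    double sum = 0, total = 0, lastTS = 0;
    for (const op& o : ops) {
        if (!prices.empty()) {  // accumulate the elapsed leg at the old max
            const double leg = o.ts - lastTS;
            sum += leg * *prices.rbegin();
            total += leg;
        }
        lastTS = o.ts;
        if (o.type == 'I') {
            priceOf[o.id] = o.price;
            prices.insert(o.price);
        } else {  // 'E': erase one instance of that order's price
            prices.erase(prices.find(priceOf[o.id]));
        }
    }
    return sum / total;
}
```

Feeding it the six operations from the example yields 315 / 30 = 10.5, matching the hand computation.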
This approach has a complexity of O(n log(m)), where n is the number of operations and m is the maximum number of active orders (each insert/delete costs O(log(m)) in the multiset).
Is there any faster asymptotic and/or more elegant approach to solving this problem?
Upvotes: 0
Views: 160
Reputation: 57749
I recommend you take the database approach.
Place all your records into a std::vector.
Create an index table, std::map</* key type */, size_t>, which will contain a key value and the index of the record in the vector. If you want the keys sorted in descending order, also supply a comparison functor.
This strategy allows you to create many index tables without having to re-sort all of your data. The map will give good search times for your keys. You can also iterate through the map to list all the keys in order.
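A minimal sketch of this strategy (my own illustration, using a std::multimap rather than a std::map since prices can repeat, and std::greater as the descending-order comparison functor):

```cpp
#include <functional>
#include <map>
#include <vector>

struct record { unsigned id; double price; };

// Records live in one vector; the index maps price -> position in the
// vector, sorted in descending order, without re-sorting the data itself.
unsigned highestPriceId(const std::vector<record>& records) {
    std::multimap<double, std::size_t, std::greater<double>> byPrice;
    for (std::size_t i = 0; i < records.size(); ++i)
        byPrice.emplace(records[i].price, i);
    // The first entry of the descending index is the highest price.
    return records[byPrice.begin()->second].id;
}
```

More index tables (e.g. by id or by timestamp) can be layered over the same vector in the same way.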
Note: with modern computers, you may need a huge amount of data before there is a significant timing difference between a binary search (map) and a linear search (vector).
Upvotes: 2