Reputation: 2353
I have an REST API
which is ment to collect information about urls
used by 3rd party applications. There is a sniffer installed which is looking for HTTP request and then call the API
providing information about urls and headers used.
I would like to store URLs as unique as possible so each search for user:
http://localhost/users/49a95b87-083e-475b-9278-bade6f24413b
http://localhost/users/508f2a55-fe5b-4b83-b853-7e829dd366b8
http://localhost/users/af48be64-ad48-4867-ac06-984ce064dbeb
would be stored as one entry to database (uuid is not important here so it is oke that only last or first will stay). Algorithm to do so is like that:
- Get request
- Check if database contains similiar url
- If database return no findings store url to database
Where controller is:
@PreAuthorize("hasAuthority('ROLE_API')")
@PostMapping(value = "/api/webapp")
public synchronized ResponseEntity<Status> getWebApp(@RequestBody ServiceDiscovery req) throws InterruptedException {
return webAppService.processWebAppRequest(req.getWebApp());
}
service method: webAppService.processWebAppRequest
private final static String UUID_PATTERN = "[a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12}";
private final static String SIM_NUMBER_PATTERN = "[0-9]{19}";
private final static String MSISDN_SHORT_PATTERN = "[0-9]{9}";
private final static String MSISDN_LONG_PATTERN = "[0-9]{11}";
@Transactional
public ResponseEntity<Status> processScanWebAppRequest(ServiceDiscovery serviceDiscovery){
try {
Optional<WebApp> wa = checkRegexes(serviceDiscovery.getUrl());
if (wa.isPresent()){
updateExistingWebApplication(serviceDiscovery, wa.get());
} else {
saveNewWebApplication(serviceDiscovery);
}
} catch (IncorrectResultSizeDataAccessException ex) {
return new ResponseEntity<Status>(new Status("Processing error"), HttpStatus.PRECONDITION_FAILED);
}
return new ResponseEntity<Status>(new Status("OK"), HttpStatus.OK);
}
//Checkregex function which return Optional.empty it looks for regex in string and then replace finding with regex itself so it can be used in JPA query
private Optional<WebApp> checkRegexes(String url) {
String urlToLookFor = url.replaceAll(UUID_PATTERN,UUID_PATTERN);
urlToLookFor=urlToLookFor.replaceAll(SIM_NUMBER_PATTERN,SIM_NUMBER_PATTERN);
urlToLookFor=urlToLookFor.replaceAll(MSISDN_LONG_PATTERN,MSISDN_LONG_PATTERN);
urlToLookFor=urlToLookFor.replaceAll(MSISDN_SHORT_PATTERN,MSISDN_SHORT_PATTERN);
return waRepository.getWebAppByRegex(urlToLookFor+"$");
}
//JPA check for
@Query(value="select * from webapp wa where wa.url ~ :url", nativeQuery = true)
Optional<WebApp> getWebAppByRegex(@Param("url") String url);
And everything works good while I test it. But on the production environment I got periodically huge amount of requests (once a day ~5k requests in few seconds) where there are plenty of UUIDs for the same endpoint sent which are being processed parallel so when checkRegexes
is being done there is no finding but 1 second later there are 5.
To avoid this I tried to set API to synchronized
but no success. Is there anyway to make it work without changes to the client?
example output using query is:
db=# select inserted, url from webapp where url ~ 'https://api-gateway.intra.com/users/[a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12}$' order by inserted;
inserted | url
---------------------+-----------------------------------------------------------------------------------
2019-10-30 10:55:00 | https://api-gateway.intra.com/users/cd1da115-a2c1-4722-a381-6d524cbf5c95
2019-10-30 10:55:00 | https://api-gateway.intra.com/users/6b9b2c76-f416-4c8c-a0b0-01d29e976c03
2019-10-30 10:55:00 | https://api-gateway.intra.com/users/0f197568-0d2f-405d-8468-3bbea8b3a8ef
2019-10-30 10:55:00 | https://api-gateway.intra.com/users/32e02581-02b4-4a99-9121-1592b0a67566
2019-10-30 10:55:00 | https://api-gateway.intra.com/users/f1bbb4cf-1336-45ec-814f-77ecb9736c47
2019-10-30 10:55:00 | https://api-gateway.intra.com/users/6b0969a1-cfd7-4cda-9041-14229ddb3c6f
2019-10-30 10:55:00 | https://api-gateway.intra.com/users/1fa4e464-aae6-472e-9a4c-cf4c7d6b4e0f
2019-10-30 10:55:00 | https://api-gateway.intra.com/users/a25cbb9c-49c8-4603-aeb6-68d5a065c58e
2019-10-30 10:55:00 | https://api-gateway.intra.com/users/e9be9949-f866-4765-9d38-582193dd1839
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/ad46db32-82dd-4c8e-9e77-a67e59dfca9a
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/fed07e4a-42e4-44e3-b932-9d0e01a5b535
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/747ff6df-dea5-48db-a15b-b7f43bc48ecf
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/4b6341ad-2584-427d-898a-f24d3623eb32
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/1e75c8bb-fb27-4183-a993-6763fa796e79
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/72b19740-7f61-4a32-88f9-0f7196d47853
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/6c21da91-56eb-498b-91bd-4895be3cfdcc
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/a7d3dc5c-c5de-4fc4-9c56-9eefe7a67d80
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/b9658963-c1f5-45da-b78d-240bc7b2a225
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/e8be9c46-e7d0-4642-9c7b-981663aa552e
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/74886e61-150d-4f35-8af0-c911cfdbf009
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/175840ab-9f4c-43b7-a684-1adeb884af71
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/7655bf87-1f34-4a38-9ad0-11c5c1ddbb6b
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/34406b3d-319d-4ca4-9ca2-d53c104ad703
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/15f40dc6-5852-4f9f-9017-87b20dd326f1
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/de5caa58-cb7b-4dac-9b87-b6f4460072f3
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/b601b86b-ba8e-4768-b013-12d12f74362b
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/a64fd22b-f144-4b70-b3bb-86ee0e7a47c6
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/87672dac-ce7d-4a03-acfc-e694d229c4fe
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/ea6db1e8-461b-459a-b5d0-2f4aec872cdd
2019-10-30 11:10:43 | https://api-gateway.intra.com/users/d4b7cd3c-68bd-42e4-99d0-0f5043428080
2019-10-30 11:24:11 | https://api-gateway.intra.com/users/4dfd42cc-4029-46dc-8867-adafac249345
2019-10-30 11:24:11 | https://api-gateway.intra.com/users/2be89a7a-cf26-4163-bd49-0549bba24cd0
2019-10-30 11:24:11 | https://api-gateway.intra.com/users/3d1f3f34-a11d-4f8a-b5d5-7d8fc58e49f0
2019-10-30 11:24:11 | https://api-gateway.intra.com/users/db1ec2c3-5571-4db0-a3fe-6e12a4b1ae46
2019-10-30 11:24:11 | https://api-gateway.intra.com/users/75f964aa-db57-43d3-a424-99345c8c9997
Upvotes: 1
Views: 1639
Reputation: 3572
I would suggest you to synchronize access to DB. So select
,save
and update
operations should be in synchronized block. But it will reduce concurrency to single thread and performance degradation will be enormous. So there is some trick you could use to synchronize the DB access but save the performance at some rate.
You could synchronize DB access by hash of the mormilized url .
I use url_hash %10000
to limit number of lock objects.
Of course it will affect performance but it will be in 10000 better then simple synchronization.
Have a look at the code:
private final static String UUID_PATTERN = "[a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12}";
private final static String SIM_NUMBER_PATTERN = "[0-9]{19}";
private final static String MSISDN_SHORT_PATTERN = "[0-9]{9}";
private final static String MSISDN_LONG_PATTERN = "[0-9]{11}";
private final Map<Integer, Object> locks = new ConcurrentHashMap<>();
@Transactional
public ResponseEntity<Status> processScanWebAppRequest(ServiceDiscovery serviceDiscovery){
try {
String urlToLookFor = normalizeUrl(serviceDiscovery.getUrl());
int lockHash = urlToLookFor.hashCode() % 10000;
synchronized (locks.computeIfAbsent(lockHash, integer -> new Object())) {
Optional<WebApp> wa = checkRegexes(urlToLookFor);
if (wa.isPresent()){
updateExistingWebApplication(serviceDiscovery, wa.get());
} else {
saveNewWebApplication(serviceDiscovery);
}
}
} catch (IncorrectResultSizeDataAccessException ex) {
return new ResponseEntity<Status>(new Status("Processing error"), HttpStatus.PRECONDITION_FAILED);
}
return new ResponseEntity<Status>(new Status("OK"), HttpStatus.OK);
}
private String normalizeUrl(String url) {
String urlToLookFor = url.replaceAll(UUID_PATTERN,UUID_PATTERN);
urlToLookFor=urlToLookFor.replaceAll(SIM_NUMBER_PATTERN,SIM_NUMBER_PATTERN);
urlToLookFor=urlToLookFor.replaceAll(MSISDN_LONG_PATTERN,MSISDN_LONG_PATTERN);
urlToLookFor=urlToLookFor.replaceAll(MSISDN_SHORT_PATTERN,MSISDN_SHORT_PATTERN);
return urlToLookFor;
}
private Optional<WebApp> checkRegexes(String url) {
return waRepository.getWebAppByRegex(urlToLookFor+"$");
}
Upvotes: 2