Reputation: 821
I have never used threads because I never thought my code would benefit. However, I think threading might improve the performance of the following pseudocode:
Loop through table of records containing security symbol field and a quote field
    Load a web page (containing a security quote for a symbol) into a string variable
    Parse the string for the quote
    Save the quote in the table
    Get next record
end loop
Loading each web page takes the most time. Parsing for the quote is quite fast. I guess I could take, say, half the records for one thread and work the other half in a second thread.
Upvotes: 6
Views: 310
Reputation: 26830
In OmniThreadLibrary it is very simple to solve this problem with a multistage pipeline: the first stage runs in multiple tasks and downloads the web pages, and the second stage runs as a single instance and stores the data in the database. I wrote a blog post documenting this solution some time ago.
The solution can be summed up with the following code (you would have to fill in the missing parts of the HttpGet and Inserter routines).
uses
  Windows,   // INFINITE
  SysUtils,  // FreeAndNil
  Classes,   // TStringList
  OtlCommon,
  OtlCollections,
  OtlParallel;

// TPage is assumed to be a small user-defined class that stores a URL
// together with the page contents downloaded from it.

function HttpGet(url: string; var page: string): boolean;
begin
  // retrieve page contents from the url; return False if page is not accessible
end;

// First stage: runs in several tasks; each call handles one URL.
procedure Retriever(const input: TOmniValue; var output: TOmniValue);
var
  pageContents: string;
begin
  if HttpGet(input.AsString, pageContents) then
    output := TPage.Create(input.AsString, pageContents);
end;

// Second stage: runs in a single task and owns the database connection.
procedure Inserter(const input, output: IOmniBlockingCollection);
var
  page   : TOmniValue;
  pageObj: TPage;
begin
  // connect to database
  for page in input do begin
    pageObj := TPage(page.AsObject);
    // insert pageObj into database
    FreeAndNil(pageObj);
  end;
  // close database connection
end;

procedure ParallelWebRetriever;
var
  pipeline: IOmniPipeline;
  s       : string;
  urlList : TStringList;
begin
  // set up pipeline
  pipeline := Parallel.Pipeline
    .Stage(Retriever).NumTasks(Environment.Process.Affinity.Count * 2)
    .Stage(Inserter)
    .Run;
  // insert URLs to be retrieved
  // (urlList must be created and filled with the URLs before this point)
  for s in urlList do
    pipeline.Input.Add(s);
  pipeline.Input.CompleteAdding;
  // wait for pipeline to complete
  pipeline.WaitFor(INFINITE);
end;
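The code above uses a TPage class without showing it. A minimal sketch of what such a class might look like follows. The unit name, field names, and properties are assumptions made for illustration. Only the two-string constructor is implied by the call in Retriever.

```delphi
unit PageTypes; // hypothetical unit name

interface

type
  // Hypothetical carrier object: one downloaded page and the URL it came from.
  // Created in Retriever, freed in Inserter.
  TPage = class
  private
    FUrl: string;
    FContents: string;
  public
    constructor Create(const AUrl, AContents: string);
    property Url: string read FUrl;
    property Contents: string read FContents;
  end;

implementation

constructor TPage.Create(const AUrl, AContents: string);
begin
  inherited Create;
  FUrl := AUrl;
  FContents := AContents;
end;

end.
```

With something like this in place, Inserter can read pageObj.Url and pageObj.Contents when it builds the database insert.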
Upvotes: 4
Reputation: 596297
If the number of records is relatively small, say 50 or fewer, you could just launch a separate thread for each record and let them all run in parallel, e.g.:
begin thread
    Load a web page for symbol into a string variable
    Parse the string for the quote
    Save the quote in the table
end thread

Loop through table of records
    Launch a thread for current security symbol
    Get next record
end loop
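One possible way to turn the thread-per-record idea into Delphi is sketched below, assuming Delphi XE2 or later (namespaced unit names and generics). TQuoteThread, LoadQuotePage, and ParseQuote are hypothetical names with stand-in bodies. The sketch also makes one deliberate change from the pseudocode: each thread only downloads and parses, and the main thread does the saving after the thread finishes, so no database connection is shared between threads.

```delphi
program ThreadPerSymbol;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Generics.Collections;

type
  // One worker thread per security symbol (hypothetical class).
  TQuoteThread = class(TThread)
  private
    FSymbol: string;
    FQuote: string;
  protected
    procedure Execute; override;
  public
    constructor Create(const ASymbol: string);
    property Symbol: string read FSymbol;
    property Quote: string read FQuote;
  end;

// Placeholder for the real HTTP download of the quote page.
function LoadQuotePage(const ASymbol: string): string;
begin
  Result := ASymbol + ':0.00';
end;

// Placeholder for the real parsing of the page text.
function ParseQuote(const APage: string): string;
begin
  Result := APage;
end;

constructor TQuoteThread.Create(const ASymbol: string);
begin
  inherited Create(True);   // create suspended; started explicitly below
  FreeOnTerminate := False; // the main thread waits for and frees the object
  FSymbol := ASymbol;
end;

procedure TQuoteThread.Execute;
begin
  // The slow part (download) and the fast part (parse) both run in the worker.
  FQuote := ParseQuote(LoadQuotePage(FSymbol));
end;

var
  Threads: TObjectList<TQuoteThread>;
  Worker: TQuoteThread;
  Sym: string;
begin
  Threads := TObjectList<TQuoteThread>.Create(True); // owns the thread objects
  try
    // Stand-in for looping through the table of records.
    for Sym in TArray<string>.Create('AAPL', 'MSFT', 'IBM') do
    begin
      Worker := TQuoteThread.Create(Sym);
      Threads.Add(Worker);
      Worker.Start;
    end;
    for Worker in Threads do
    begin
      Worker.WaitFor;
      // Stand-in for saving the quote in the table.
      Writeln(Worker.Symbol, ' -> ', Worker.Quote);
    end;
  finally
    Threads.Free;
  end;
end.
```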
If you have a larger number of records to process, consider using a pool of threads so you can handle records in smaller batches, e.g.:
Create X threads
Put threads in a list
Loop through table of records
    Wait until a thread in pool is idle
    Get idle thread from pool
    Assign current security symbol to thread
    Signal thread
    Get next record
end loop
Wait for all threads to be idle
Terminate threads

begin thread
    Loop until terminated
        Mark idle
        Wait for signal
        If not Terminated
            Load a web page for current symbol into a string variable
            Parse the string for the quote
            Save the quote in the table
        end if
    end loop
end thread
There are many different ways you could implement the above, which is why I left it in pseudocode. Look at the VCL's TThread, TList, and TEvent classes, or the Win32 API QueueUserWorkItem() function, or any number of third-party threading libraries.
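As one illustration of the pool idea, here is a simplified sketch, assuming Delphi XE2 or later. It does not use the signal/idle handshake from the pseudocode. Instead, a fixed number of workers each claim the next record index with an atomic increment until no records are left. TQuoteWorker, WorkerCount, LoadQuotePage, and ParseQuote are hypothetical names, and the download and parse bodies are stand-ins.

```delphi
program QuoteWorkerPool;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.SyncObjs;

const
  WorkerCount = 4; // the "X" in the pseudocode; an arbitrary choice here

var
  Symbols: TArray<string>;  // work items; filled before the workers start
  Quotes: TArray<string>;   // one result slot per symbol
  NextIndex: Integer = -1;  // last index handed out; shared by all workers

type
  TQuoteWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

// Placeholder for the real HTTP download of the quote page.
function LoadQuotePage(const ASymbol: string): string;
begin
  Result := ASymbol + ':0.00';
end;

// Placeholder for the real parsing of the page text.
function ParseQuote(const APage: string): string;
begin
  Result := APage;
end;

procedure TQuoteWorker.Execute;
var
  I: Integer;
begin
  while not Terminated do
  begin
    // Atomically claim the next record so no two workers take the same one.
    I := TInterlocked.Increment(NextIndex);
    if I > High(Symbols) then
      Break; // nothing left to do
    // Each worker writes only to its own slot, so no lock is needed here.
    Quotes[I] := ParseQuote(LoadQuotePage(Symbols[I]));
  end;
end;

var
  Workers: array[0..WorkerCount - 1] of TQuoteWorker;
  I: Integer;
begin
  // Stand-in for reading the symbols from the table.
  Symbols := TArray<string>.Create('AAPL', 'MSFT', 'IBM', 'ORCL', 'INTC');
  SetLength(Quotes, Length(Symbols));

  for I := Low(Workers) to High(Workers) do
    Workers[I] := TQuoteWorker.Create(False); // starts running immediately

  for I := Low(Workers) to High(Workers) do
  begin
    Workers[I].WaitFor;
    Workers[I].Free;
  end;

  // Stand-in for saving the quotes in the table, done on the main thread.
  for I := 0 to High(Symbols) do
    Writeln(Symbols[I], ' -> ', Quotes[I]);
end.
```

Because downloading dominates the run time, the number of workers can reasonably exceed the number of CPU cores. The right value depends on the connection and on how many simultaneous requests the quote server tolerates.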
Upvotes: 4