dasman

Reputation: 311

parallelizing string matching

I have to mine a large number of datasets and wanted to know if it's better to get a desktop with a GPU or to spread the workload over several separate machines.

I think with a GPU I may have to write my own code using something like the CUDA toolkit.

The number of strings on which I have to perform a regex search is on the order of millions, and I have to match against around 10k different keywords, so it's roughly ~50 billion pattern matches. I want to spread the workload so that, say, a million strings can be handled on one core, and so on.
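On a single multi-core machine, the per-core split described above could be sketched roughly like this (the keyword list here is a hypothetical placeholder; combining all keywords into one alternation lets the regex engine scan each string once instead of once per keyword):

```python
import re
from multiprocessing import Pool

# Hypothetical sample keywords; the real workload would load ~10k of them.
KEYWORDS = ["error", "timeout", "fail"]

# One combined alternation: each string is scanned once for all keywords.
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS))

def count_matches(chunk):
    """Count keyword occurrences in one chunk of strings (one core's share)."""
    return sum(len(PATTERN.findall(s)) for s in chunk)

def parallel_match(strings, workers=4):
    """Split the strings into roughly equal chunks and match them in parallel."""
    size = max(1, len(strings) // workers)
    chunks = [strings[i:i + size] for i in range(0, len(strings), size)]
    with Pool(workers) as pool:
        return sum(pool.map(count_matches, chunks))
```

This only scales to the cores of one box, of course; past that point you need multiple machines, which is where the answer below comes in.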

Any suggestions would help.

Upvotes: 0

Views: 158

Answers (1)

18bytes

Reputation: 6029

Since you want to process a large dataset, Hadoop might be a solution. Hadoop implements the MapReduce model (originally described by Google). With Hadoop you can split your task into multiple sub-parts and let an individual machine process each part.
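A minimal sketch of the MapReduce shape for this job (the keyword list is a hypothetical stand-in; with Hadoop Streaming you would wrap these functions to read lines from stdin and print tab-separated key/value pairs, and the framework handles the shuffle between them):

```python
import re
from collections import defaultdict

# Hypothetical keyword set; the real job would load the ~10k keywords.
KEYWORDS = ["error", "timeout"]
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS))

def mapper(line):
    """Map phase: emit (keyword, 1) for every keyword hit in one input line."""
    for match in PATTERN.findall(line):
        yield (match, 1)

def reducer(pairs):
    """Reduce phase: sum the counts per keyword, as Hadoop's reduce would."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)
```

Each mapper runs on its own slice of the input strings on its own node, so the 50 billion matches get spread across the cluster automatically.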

The size you mentioned (50 billion matches) can be processed using a cluster of Hadoop nodes. If you do not have many machines, you can rent them from Amazon via Elastic MapReduce.

http://aws.amazon.com/elasticmapreduce/

http://hadoop.apache.org/

Upvotes: 1
