jvc

Reputation: 614

Algorithms for mapping data in data mining

I need to scrape some webpages and extract content from them. I'm planning to select specific keywords and map the data that is related to them, but I have no idea how to do that. Could anyone suggest some algorithms for this?

For example, I need to download some webpages about apples, map the relevant data about apples to that keyword, and store it in a database so that, if someone needs specific information about apples, I can provide it quickly and accurately.

Pointers to helpful libraries would also be appreciated. I'm planning to do this in Python.

Upvotes: 2

Views: 1172

Answers (2)

riza

Reputation: 17134

Have a look at the NLTK, Pattern, or Orange modules.

As a start, "Programming Collective Intelligence: Building Smart Web 2.0 Applications" by Toby Segaran is a good book to read.

Upvotes: 1

Manuel Salvadores

Reputation: 16525

You could try algorithms based on term frequency–inverse document frequency (TF-IDF). In Java I would recommend Solr. In fact, you could run Solr and access it from Python; see here.
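To give a rough idea of what TF-IDF does, here is a minimal sketch in plain Python using only the standard library. It is not how Solr implements scoring; it uses the textbook form (term count divided by document length, times log(N / document frequency)), and the `pages` list plus the helper names `tokenize`, `tf_idf`, and `top_terms` are made up for illustration. In practice the page texts would come from your scraper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: score} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(N / number of documents containing the term)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty pages
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(score, k=3):
    """Return the k highest-scoring terms of one document."""
    return sorted(score, key=score.get, reverse=True)[:k]


# Hypothetical page texts standing in for scraped content.
pages = [
    "Apples are a fruit. Apples grow on trees.",
    "Oranges are a citrus fruit.",
    "Trees need water and sunlight.",
]
scores = tf_idf(pages)
print(top_terms(scores[0]))
```

Terms that are frequent in one page but rare across the rest score highest, so the top terms of each page can serve as the keywords you store alongside it in the database. A term that appears in every page gets a score of zero with this formula, which is why common words drop out.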

Upvotes: 1
