ovntatar

Reputation: 416

How to compile and start an Erlang application?

How can I start the following application from git after compiling it?

my steps are:

1. clone git repository "git://github.com/michaelmelanson/spider.git"
2. cd spider
3. erl
Erlang R14B04 (erts-5.8.5) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]

Eshell V5.8.5  (abort with ^G)
1> make:all().
up_to_date
2> 

Finally, how can I list the modules that belong to the application?

Thanks in advance.


Thanks for joining, Michael. The standard task request task_master:insert_task("http://www.id.uzh.ch"). works fine. But if I try to limit the recursive requests, I receive an error message:

* 1: record task undefined

Unfortunately, my attempt below doesn't work either:

rd(task, {url = "", depth = ""}).
Task = #task{url="http://www.id.uzh.ch", depth=2}.
task_master:insert_task(Task).

The next error message is:

=ERROR REPORT==== 21-Jun-2013::09:47:42 ===
** Generic server <0.52.0> terminating 
** Last message in was {'$gen_cast',
                           {task,
                               {task,{task,"http://www.id.uzh.ch",2},[],-1}}}
** When Server state == {state}
** Reason for termination == 
** {{badmatch,{error,parse_url}},
    [{fetcher,process_task,1},
     {fetcher,handle_cast,2},
     {gen_server,handle_msg,5},
     {proc_lib,init_p_do_apply,3}]}

Any ideas?

Upvotes: 0

Views: 970

Answers (3)

Danil Onishchenko

Reputation: 2040

You don't have to start the Erlang shell to compile your application's sources. You can just run

erlc -o ebin/ src/*.erl

in your application's folder.

I would also suggest trying rebar:

https://github.com/rebar/rebar

It's a utility that makes it easy to compile and test Erlang applications.

Upvotes: 1

Michael Melanson

Reputation: 1335

I'm the original author of that code. Sorry I didn't document it at all... It was just a little side project of mine from about 5 years ago. So this is a bit of a distant memory to me, but here's what I know.

johlo is absolutely correct about how to start the application and insert a task. You should be able to start it with application:start(spider), then insert a new job with the task_master:insert_task/1 function. It takes either a URL string or a task record. Let me know if that doesn't work for you.

Once the app is running, doing something like task_master:insert_task("http://someurl.com/page.html") will insert a new task to fetch and process a web page. You can see what 'process' means exactly by looking here:

https://github.com/michaelmelanson/spider/blob/master/src/fetcher.erl#L113

Basically it will fetch the page, parse the HTML, extract any links and send the results back to the task_master. The task_master will then insert new tasks to process each link, recursively spidering all connected pages. Currently it doesn't do anything with the results, but this would be a good place to put that code:

https://github.com/michaelmelanson/spider/blob/master/src/fetcher.erl#L132

Be warned: by default it does not have a limit on the spidering depth. Left to its own devices, it will recursively spider the entire web. If you plan on using this on any site with an outgoing link, you should limit the spidering depth by creating a task record, Task = #task{url="http://someurl.com/", depth=5}, and then calling task_master:insert_task(Task).
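When this is tried from the Erlang shell, the #task{} syntax only works after the shell has loaded the record definition. A minimal sketch, assuming the task record is visible to the task_master module (for example through an include) and that the module was compiled with debug_info or its source can be located; neither has been checked against the repository:

```erlang
%% Run inside `erl -pa ebin` after starting the application.

%% rr/1 is a shell command that reads record definitions from a module.
%% Assumption: task_master's BEAM or source file carries the task record.
rr(task_master).

%% Print the loaded definition to check the real field names and defaults.
rl(task).

%% Build a depth-limited task and hand it to the task master.
Task = #task{url = "http://someurl.com/", depth = 5}.
task_master:insert_task(Task).
```

Defining the record by hand with rd/2 only helps if the field list matches the original definition exactly. A record is a tagged tuple, so a definition with a different number of fields produces a tuple of a different size, which the application will not recognize as a task record.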

Hope that helps.

Upvotes: 3

johlo

Reputation: 5500

Spider is an Erlang application, so application:start/1 can be used to run it:

  1. cd spider

  2. erl -pa ebin

    So erl finds the spider beam files

  3. 1> application:start(inets).

  4. 2> application:start(spider).

You can read more about applications.
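The question also asks how to list the modules that belong to the application. A minimal sketch, assuming spider's .app file declares its modules under the modules key (not checked against the repository): application:get_key/2 reads that list once the application has been loaded or started.

```erlang
%% Run inside `erl -pa ebin` from the spider directory.
application:start(inets).
application:start(spider).

%% Show the running applications; spider should be among them
%% if the start call above succeeded.
application:which_applications().

%% Read the module list declared in the application's .app file.
%% Returns {ok, Modules}, or undefined if the application is not loaded.
application:get_key(spider, modules).
```

This reports what the .app file declares, so a module that was compiled into ebin but left out of that file will not appear in the result.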

Upvotes: 4
