Robert Jones

Reputation: 3

Mule ESB : Read HTML

I need to parse the contents of a web page because the website does not offer an API for retrieving this data. I have created a flow that calls the website, but it fails with:

Message: Error sending HTTP request. Message payload is of type: NullPayload

Any help would be much appreciated.

<http:request-config name="HTTP_Request_Configuration"   host="http://www.resellerratings.com/" port="80" doc:name="HTTP Request Configuration" basePath="/"/>
<flow name="testFlow">
    <http:listener config-ref="HTTP_Listener_Configuration" path="/testReseller" allowedMethods="GET" doc:name="HTTP"/>
    <http:request config-ref="HTTP_Request_Configuration" path="/store/best_buy" method="GET" doc:name="HTTP" sendBodyMode="NEVER"/>
    <logger message="#[message]" level="INFO" doc:name="Logger"/>
</flow>

Upvotes: 0

Views: 1923

Answers (2)

JRichardsz

Reputation: 16594

Try this:

<?xml version="1.0" encoding="UTF-8"?>

<mule xmlns:http="http://www.mulesoft.org/schema/mule/http" xmlns="http://www.mulesoft.org/schema/mule/core" xmlns:doc="http://www.mulesoft.org/schema/mule/documentation"
    xmlns:spring="http://www.springframework.org/schema/beans" version="EE-3.6.1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-current.xsd
http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/current/mule.xsd
http://www.mulesoft.org/schema/mule/http http://www.mulesoft.org/schema/mule/http/current/mule-http.xsd">

    <http:request-config name="remote_HTTP_Request_Configuration"   host="www.resellerratings.com" port="80" doc:name="REMOTE HTTP Request Configuration" />
    <http:listener-config name="local_HTTP_Request_Configuration" host="0.0.0.0" port="8081" doc:name="HTTP Listener Configuration"/>

    <flow name="testFlow1">
        <http:listener config-ref="local_HTTP_Request_Configuration" path="/testReseller" allowedMethods="GET" doc:name="HTTP"/>
        <http:request config-ref="remote_HTTP_Request_Configuration" path="/store/best_buy" method="GET" doc:name="HTTP" sendBodyMode="NEVER"/>
        <object-to-string-transformer doc:name="Object to String"/>
        <logger message="#[payload]" level="INFO" doc:name="Logger"/>
    </flow>

</mule>

Go to http://localhost:8081/testReseller and you get the HTML page.

Now, to extract information from this page, I don't think Mule is the right tool. You need a tool that lets you manipulate the HTML DOM.

This is related to Quality Assurance/Test Automation, and Java has fantastic tools for it.

Here is some of my code:

  • Jsoup example: get the title and image of each video from a YouTube channel

https://github.com/jrichardsz/api-java-rest-service-youtube/blob/master/code/src/test/java/org/jrichardsz/youtubeapi/rest/test/TestJSoup.java

In this example I select all the video divs (by a specific class) from a YouTube channel and read the title and image from the tags inside each one.

  • HtmlUnit example: automate the Google translator

https://github.com/jrichardsz/appdesktop-super-translator/blob/master/code/src/main/java/com/rnasystems/projects/translator/core/impl/HtmlUnitGoogleUITranslator.java

In this example, I go to the Google web translator, put a word in the left box, press the translate button, and read the response from the right box. All with Java.

Finally, you could wrap one of these tools in a Java component and invoke it from Mule:

<flow name="testFlowHtmlParser">
    <http:listener config-ref="local_HTTP_Request_Configuration" path="/testReseller" allowedMethods="GET" doc:name="HTTP"/>
    <component doc:name="Java" class="com.mycompany.HtmlParserComponent"/>
</flow>
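The source of com.mycompany.HtmlParserComponent is not shown, so the following is only a minimal sketch of what such a class might contain. It makes two assumptions: that the page HTML has already been fetched and converted to a String before the component runs (for example by the http:request and object-to-string-transformer steps from the first flow), and that a plain method taking a String is what gets invoked. To stay runnable without extra dependencies it uses a standard-library regular expression in place of Jsoup. For real pages a DOM parser such as Jsoup is the more robust choice. The method name extractTitle is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the class referenced by the flow.
// In a Mule project it would be declared in package com.mycompany
// so that it matches the class attribute of the <component> element.
class HtmlParserComponent {

    // Matches the first <title> element; DOTALL lets the title span several lines.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the trimmed text of the first title element, or "" if there is none. */
    public String extractTitle(String html) {
        if (html == null) {
            return "";
        }
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        // Made-up sample markup, not the real page content.
        String html = "<html><head><title> Sample Store Reviews </title></head><body></body></html>";
        System.out.println(new HtmlParserComponent().extractTitle(html));
    }
}
```

A regular expression is only adequate for a single, well-delimited element like the title. Anything that depends on nesting or CSS classes is where Jsoup or HtmlUnit earn their place.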

If you need help with HTML parsing, contact me:

http://jrichardsz.weebly.com/

Upvotes: 0

afelisatti

Reputation: 2835

Given your configuration, it probably fails because of the host attribute: it should contain only the host name, not the protocol. Try this instead:

<http:request-config name="HTTP_Request_Configuration" host="www.resellerratings.com" port="80" doc:name="HTTP Request Configuration" />
<flow name="testFlow">
    <http:listener config-ref="HTTP_Listener_Configuration" path="/testReseller" allowedMethods="GET" doc:name="HTTP"/>
    <http:request config-ref="HTTP_Request_Configuration" path="/store/best_buy" method="GET" doc:name="HTTP" sendBodyMode="NEVER"/>
    <logger message="#[message]" level="INFO" doc:name="Logger"/>
</flow>
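To see which part of an address belongs in which attribute, here is a minimal Java sketch that splits the URL from the question into host, port, and path using the standard java.net.URI class. The class and method names are illustrative and not part of Mule, and falling back to port 80 for http (443 for https) when the URL names no port is an assumption of this sketch.

```java
import java.net.URI;

// Illustrative helper, not part of Mule: splits a full URL into the pieces
// that would go into the host, port, and path attributes.
class RequestConfigParts {

    /** Returns the bare host name, without protocol or trailing slash. */
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    /** Returns the explicit port, or a default based on the scheme (assumption of this sketch). */
    static int portOf(String url) {
        URI uri = URI.create(url);
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    /** Returns the path portion of the URL. */
    static String pathOf(String url) {
        return URI.create(url).getPath();
    }

    public static void main(String[] args) {
        String url = "http://www.resellerratings.com/store/best_buy";
        System.out.println("host=" + hostOf(url));
        System.out.println("port=" + portOf(url));
        System.out.println("path=" + pathOf(url));
    }
}
```

For the URL in the question this yields www.resellerratings.com as the host, 80 as the port, and /store/best_buy as the path, which matches the values used in the configuration above.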

Upvotes: 1
