Spring-XD doesn't write a Twitter stream to HDFS

I'm working on my final thesis and I have a problem with Spring-XD:

I run this from my xd-shell:

stream create --name cyrustweets --definition "twitterstream --track='mileycyrus, miley cyrus' | log" --deploy

And it works. My xd-singlenode shows me the tweets.

But when I try to write to HDFS:

stream create --name cyrustweets --definition "twitterstream --track='mileycyrus, miley cyrus' | hdfs" --deploy

The xd-singlenode shows the following:

08:28:05,763 1.0.3.RELEASE WARN twitterSource-1-1 twitter.TwitterStreamChannelAdapter - Exception while reading stream. org.springframework.messaging.MessageHandlingException: failed to write Message payload to HDFS.

Any help? I followed this tutorial: http://hortonworks.com/hadoop-tutorial/using-spring-xd-to-stream-tweets-to-hadoop-for-sentiment-analysis/

Thanks so much

Upvotes: 0

Views: 484

Answers (2)

akhand17

Reputation: 16

This error comes from the source side. The Twitter API places some restrictions on streaming time, and these are typically determined by your IP address. You will not be able to collect 100 MB of tweets in barely 30 minutes. In my experience, you have to stream them daily over a period of weeks to get log files of significant size.

Upvotes: 0

Gary Russell

Reputation: 174719

Caused by: java.net.ConnectException: Conexión rehusada

"Conexión rehusada" is Spanish for "Connection refused". It means the HDFS host name and/or port is incorrect.

If you are using a newer version of Spring-XD, the Hadoop connection properties are configured in servers.yml, with these defaults:

# Hadoop properties 
  hadoop:
    fsUri: hdfs://localhost:8020
    resourceManagerHost: localhost
    resourceManagerPort: 8032
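
Before changing the configuration, it can help to confirm whether anything is actually accepting connections at the address in fsUri. The following is a minimal sketch in Python, not part of Spring-XD; the helper names hdfs_endpoint and can_connect are made up for illustration, and it assumes the default hdfs://localhost:8020 shown above, which may differ from your NameNode's real address.

```python
import socket
from urllib.parse import urlparse


def hdfs_endpoint(fs_uri):
    """Split an fsUri such as hdfs://localhost:8020 into (host, port)."""
    parsed = urlparse(fs_uri)
    # Assumption: fall back to 8020, the default shown in servers.yml above,
    # when the URI carries no explicit port.
    return parsed.hostname, parsed.port or 8020


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    host, port = hdfs_endpoint("hdfs://localhost:8020")
    state = "reachable" if can_connect(host, port) else "refused or unreachable"
    print(f"{host}:{port} is {state}")
```

If this reports the address as refused or unreachable, the stream will fail in the same way, so the value of fsUri (or the NameNode itself) is the thing to check first. A successful connection only shows that something is listening on that port, not that it is HDFS.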

Upvotes: 1
