Reputation: 51
I'm currently trying to get Spark Streaming over TCP working, but I constantly get a "[Errno 111] Connection refused" error...
import socket
TCP_IP = 'localhost'
TCP_PORT = 40123
MESSAGE = b"Test data Test data Test data"  # send() needs bytes in Python 3
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
s.send(MESSAGE)
s.close()
Spark part
import time
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
ssc = StreamingContext(sc,1)
lines = ssc.socketTextStream('localhost',40123)
counts = lines.flatMap(lambda line: line.split(" ")).map(lambda x: (x, 1)).reduceByKey(lambda a, b: a+b)
counts.pprint()
ssc.start()
ssc.awaitTermination()  # keep the driver alive so batches are processed
Upvotes: 2
Views: 3984
Reputation: 11
Try using the following TCP server code:
import socket

data = 'abcdefg'
to_be_sent = data.encode('utf-8')

# Create a socket with the socket() system call
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Bind the socket to an address using the bind() system call
s.bind(('localhost', 40123))
# Enable a server to accept connections.
# It specifies the number of unaccepted connections
# that the system will allow before refusing new connections
s.listen(1)
# Accept a connection with the accept() system call
conn, addr = s.accept()
# Send data with sendall(), which keeps sending until either
# all data has been sent or an error occurs. It returns None.
conn.sendall(to_be_sent)
conn.close()
You may also need to define sc in your PySpark job as follows:
sc = SparkContext(appName="TestApp")
Upvotes: 0
Reputation: 76
socketTextStream cannot host a server; it's just a client. You have to run a server yourself, and then Spark Streaming can connect to it.
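For illustration (not part of the original answer): connecting to a port with no listener is exactly what produces the question's error, as this snippet reproduces:

```python
import socket

# Ask the OS for a port that is currently free, then release it.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(('localhost', 0))        # port 0 = OS picks a free port
free_port = probe.getsockname()[1]
probe.close()

# With nothing listening there, connect() is refused -- the same
# "[Errno 111] Connection refused" the question reports (on Linux).
try:
    socket.create_connection(('localhost', free_port), timeout=2)
except ConnectionRefusedError as e:
    print('connect failed:', e)
```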
Upvotes: 1