Deepak102ind

Reputation: 51

Cassandra COPY command: Connection heartbeat failure

I am getting the following error in cqlsh. The COPY command runs for a few seconds and then stops.

Look forward to your help.

Thanks,

Connected to DRM at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.8 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> use myworld;
cqlsh:myworld> copy citizens (id, first_name, last_name, house_no, street, city, country,ssn,phone,bank_name,account_no) from '/home/rashmi/Documents/MyData/Road/PeopleData-18-Jun-1.txt';
Processed 110000 rows; Write: 47913.28 rows/s
Connection heartbeat failure
Aborting import at record #1196. Previously inserted records are still present, and some records after that may be present as well.

I have a three-node setup: 192.168.1.10, 192.168.1.11 and 192.168.1.12, with 192.168.1.11 as the seed.

CREATE KEYSPACE myworld WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

create ColumnFamily citizens (id uuid,
first_name varchar,
last_Name varchar, 
house_no varchar,
street varchar,
city varchar,
country varchar,
ssn varchar,
phone varchar,
bank_Name varchar,
account_no varchar,
PRIMARY KEY ((Country,city),ssn));

The following is from cassandra.yaml:

cluster_name: 'DRM'

# initial_token: 0
seeds: "192.168.1.11"
listen_address: 192.168.1.11
endpoint_snitch: GossipingPropertyFileSnitch

Upvotes: 2

Views: 1143

Answers (1)

Deepak102ind

Reputation: 51

An update to my own question, in case it helps anyone.

Environment

My setup is based on Cassandra 2.2 with Ubuntu 14 on three laptops:

  1. I7 MQ 4700/16gigs/1TB drive
  2. I7 MQ 4710/16 gigs/1TB Drive
  3. I7 670/4 Gig/500GB Drive (Old machine)

The keyspace has a replication factor of 3. Java Heap of 8 GB on the first two machines, with Max Heap 400 Megs.

I was using a wireless network through my internet router.
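
With the nodes talking over Wi-Fi, one way to see how much delay the network adds is to time a TCP connection to each node's native-protocol port. Below is a minimal sketch in Python, assuming port 9042 (the port shown in the cqlsh banner in the question). The function names are made up for illustration, and a TCP connect time is only a rough stand-in for ping.

```python
import socket
import time


def tcp_connect_ms(host, port=9042, timeout=2.0):
    """Return the time in milliseconds taken to open one TCP connection."""
    start = time.perf_counter()
    # create_connection raises OSError if the node is unreachable or times out.
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def median_connect_ms(host, port=9042, samples=5):
    """Take several samples and return the median to smooth out spikes."""
    timings = sorted(tcp_connect_ms(host, port) for _ in range(samples))
    return timings[len(timings) // 2]
```

Calling median_connect_ms("192.168.1.11") from another node, once over wireless and once over cable, would give a before/after comparison. The address is taken from the question and may not match your own nodes.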

Objective

Import multiple 70 GB CSV files containing 330+ million dummy financial transactions.

Issue

Heartbeat failures partway through the import: sometimes after a few million rows, sometimes after 230 million.

Findings

  1. Over wireless, pings to the router and to the other nodes were in the range of 200+ ms. Connecting the nodes with Cat 5e and Cat 6 cables reduced the ping to < 0.3 ms.

  2. Stopped running other disk-heavy tasks during the import, such as copying 70+ GB files, running heavy cqlsh commands like SELECT, and querying disk space and the 10K data files.

  3. Data ingestion was regulated to about 9K rows per second, probably using most of the disk throughput.

  4. The third node had disk issues and also went down partway through. There were a large number of hints.
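
The post does not say how the rate was held at about 9K rows per second. If you feed rows from your own loader script rather than cqlsh COPY, one way to cap the rate is to pace the row stream once per batch. Below is a rough sketch in Python. The throttle function and its parameters are hypothetical and not part of Cassandra or cqlsh.

```python
import time


def throttle(rows, rows_per_second, batch_size=1000,
             clock=time.monotonic, sleep=time.sleep):
    """Yield rows no faster than rows_per_second, pausing once per batch."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    start = clock()
    sent = 0
    for row in rows:
        yield row
        sent += 1
        if sent % batch_size == 0:
            # Earliest time at which `sent` rows are allowed to have gone out.
            due = start + sent / rows_per_second
            delay = due - clock()
            if delay > 0:
                sleep(delay)
```

For example, wrapping a csv.reader in throttle(reader, 9000) and passing each row to your own insert function would keep the average rate at or below 9K rows per second. The clock and sleep arguments exist only so the pacing can be tested without actually waiting.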

Present

I now import 700+ million rows each day, running the import from one machine at a time. Starting a second, simultaneous import process brings back the heartbeat error.

Next

Looking for ways to double the current ingestion rate without hardware changes.

Thanks,

Upvotes: 1
