Reputation: 51
I am getting the following error in cqlsh. The COPY command runs for a few seconds and then stops.
Looking forward to your help.
Thanks,
Connected to DRM at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.8 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh> use myworld;
cqlsh:myworld> copy citizens (id, first_name, last_name, house_no, street, city, country,ssn,phone,bank_name,account_no) from '/home/rashmi/Documents/MyData/Road/PeopleData-18-Jun-1.txt';
Processed 110000 rows; Write: 47913.28 rows/s
Connection heartbeat failure
Aborting import at record #1196. Previously inserted records are still present, and some records after that may be present as well.
I have a three-node setup: 192.168.1.10, .11, and .12, with .11 being the seed.
CREATE KEYSPACE myworld WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
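(The update below mentions a replication factor of 3. For reference, a minimal sketch of raising it on an existing keyspace; run nodetool repair on each node afterwards so existing data gets replicated:)
ALTER KEYSPACE myworld
WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };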
CREATE TABLE citizens (
    id uuid,
    first_name varchar,
    last_name varchar,
    house_no varchar,
    street varchar,
    city varchar,
    country varchar,
    ssn varchar,
    phone varchar,
    bank_name varchar,
    account_no varchar,
    PRIMARY KEY ((country, city), ssn)
);
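With this composite partition key, a SELECT has to restrict both country and city to target a single partition. A hypothetical query for illustration (the values are made up):
SELECT first_name, last_name, phone
FROM citizens
WHERE country = 'USA' AND city = 'Chicago';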
The following is from cassandra.yaml:
cluster_name: 'DRM'
# initial_token: 0
seeds: "192.168.1.11"
listen_address: 192.168.1.11
endpoint_snitch: GossipingPropertyFileSnitch
Upvotes: 2
Views: 1143
Reputation: 51
An update to my own question, in case it helps anyone.
My setup is Cassandra 2.2 on Ubuntu 14, running on three laptops.
The keyspace has a replication factor of 3. The Java heap is 8 GB on the first two machines, with a 400 MB new-generation heap.
I was using a wireless network via my internet router.
I was importing multiple 70 GB CSV files containing 330+ million dummy financial transactions.
Heartbeat failures occurred at varying points: sometimes after importing a few million rows, sometimes after 230 million.
Over wireless, pings to the router and to the other nodes were on the order of 200+ ms. I connected the nodes with Cat 5e and Cat 6 cables, which reduced pings to under 0.3 ms.
I stopped performing additional heavy disk-oriented tasks during the import, such as copying 70+ GB files, running heavy cqlsh commands like large SELECTs, and checking disk space across the ~10K data files.
Data ingestion was regulated to about 9K rows per second, which probably uses most of the disk bandwidth.
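In cqlsh builds that support COPY options (roughly 2.1.14+ / 2.2.6+; treat exact option availability as an assumption for older builds), this kind of throttling can be set directly on the COPY command. A sketch reusing the file and column list from the question:
COPY citizens (id, first_name, last_name, house_no, street, city, country, ssn, phone, bank_name, account_no)
FROM '/home/rashmi/Documents/MyData/Road/PeopleData-18-Jun-1.txt'
WITH INGESTRATE=9000   -- cap ingestion at roughly 9K rows/s
AND NUMPROCESSES=2     -- fewer worker processes means less concurrent load
AND MAXBATCHSIZE=20;   -- rows per insert batch; lower it if batches time out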
I now import 700+ million rows each day, using one machine at a time; starting a second simultaneous import process brings the heartbeat error back.
I am looking for ways to double the current ingestion rate without hardware changes.
Thanks,
Upvotes: 1