Reputation: 175
When I want to upload data to my "Test Cluster" in Apache Cassandra, I open the terminal and run:
export PATH=/home/mypc/dsbulk-1.7.0/bin:$PATH
source ~/.bashrc
dsbulk load -url /home/mypc/Desktop/test/file.csv -k keyspace_test -t table_test
But...
At least 1 record does not match the provided schema.mapping or schema.query. Please check that the connector configuration and the schema configuration are correct.
Operation LOAD_20201105-103000-577734 aborted: Too many errors, the maximum allowed is 100.
total | failed | rows/s | p50ms | p99ms | p999ms | batches
104 | 104 | 0 | 0,00 | 0,00 | 0,00 | 0,00
Rejected records can be found in the following file(s): mapping.bad
Errors are detailed in the following file(s): mapping-errors.log
Last processed positions can be found in positions.txt
What does it mean? Why can't I load?
Thank you!
Upvotes: 2
Views: 1780
Reputation: 16393
It means that the columns in the CSV input file do not match the columns in your table_test
table. You can get the details of the schema mismatch in mapping-errors.log
so you know which column(s) are problematic.
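For example (a quick sketch; DSBulk typically writes these files under a per-operation log directory, so use the exact paths printed at the end of your run):
# assumed default log location; adjust to the paths DSBulk reported
cat logs/LOAD_20201105-103000-577734/mapping-errors.log
head logs/LOAD_20201105-103000-577734/mapping.bad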
Since the CSV columns don't match the table schema, you will need to manually map them by specifying the --schema.mapping
flag.
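For example (a sketch only; the names on the left of each pair are CSV header fields and the names on the right are hypothetical table columns, so substitute your own):
dsbulk load -url /home/mypc/Desktop/test/file.csv -k keyspace_test -t table_test \
  --schema.mapping "csv_field_1 = table_col_1, csv_field_2 = table_col_2"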
For details, see the DSBulk Common options page. You can also have a look at schema mapping examples in this blog post. Cheers!
Upvotes: 2
Reputation: 87329
The error is that you're not providing the mapping between the CSV data and the table. It can be done in two ways:
-header true (DSBulk takes the column names from the CSV header row, which must match the table's column names)
-m
option (see docs): you map CSV columns to Cassandra columns explicitly. Both ways are sketched below.
There is a very good series of blog posts about different aspects of DSBulk usage:
the first two cover data loading in great detail.
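For instance (sketches only, reusing the file, keyspace, and table names from the question; "id" and "value" are hypothetical column names to replace with your own):
# way 1: take column names from the CSV header row (names must match the table)
dsbulk load -url /home/mypc/Desktop/test/file.csv -k keyspace_test -t table_test -header true
# way 2: no header row; map zero-based CSV field indices to table columns explicitly
dsbulk load -url /home/mypc/Desktop/test/file.csv -k keyspace_test -t table_test \
  -header false -m "0 = id, 1 = value"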
Upvotes: 4