bitxor

Reputation: 31

Put a million rows from flat file to blockchain

I have a flat file with each row containing records such as

AX101 12345 PQR 101
AX102 18390 XYZ 091
AX101 81249 PQR 871

My setup has a few machines connected in a network on Hyperledger with vagrant and docker.

Test 1
For this test, I am running just one validating peer that should read the flat file, which has over a million such rows, and put each row on the chain as a new block. The intention is to test how fast this can be done. What would be the best way to achieve this?

Approach 1: The code could be written in Go and sit inside a smart contract
Approach 2: A separate "reader" in another language that sends the data to the validating peer via APIs (would be slower, I think)


Test 2
Once (hopefully) all the data is on the blockchain, I need to retrieve all entries for, say, AX101. Speed is not a concern here, but picking up every entry is.

Any pointers would be helpful!

Upvotes: 3

Views: 636

Answers (2)

Sergey Balashevich

Reputation: 2101

Test 1
There are several reasons why Approach 1 is not the best solution. First, if you run the initial import in Go from inside the smart contract, the whole import is executed as a single transaction.
It doesn’t matter where the import is initiated (in the “Init” or the “Invoke” method of your chaincode): in both cases you will run into a timeout.
Second, this approach undermines the idea of a blockchain. A smart contract should not pull data from external sources (files), because anybody can modify them; as a result, the entire chain can end up in an inconsistent state.

Test 2
Hyperledger is not designed to be a database, and “parse all entries for say AX101” is not its primary goal. The description provided is quite limited; nevertheless, there are several ways to "emulate" this behavior:

Possible option 1: Use “RANGE_QUERY_STATE”. It works only if you search by the first part of the string (“AX101…”).

Possible option 2: Use “AX101” as the key and {“12345 PQR 101”, “81249 PQR 871”} as the value. Such a data structure can be built at import time. This works only if you are not going to query by other parts of the string.

Upvotes: 1

bcbrock

Reputation: 11

This answer assumes you are talking about the Hyperledger fabric:

There is no way to avoid writing a chaincode (smart contract) to add the data to the database. All data in the blockchain is owned by the chaincode that created it, and can only be accessed by the chaincode that created it. There is no concept of shared data, or simply writing data into the blockchain. So you need to do Approach 2, and send the data to a chaincode via its "add new record" method (which you will create).

To access the data you would create a query method for your chaincode. You can control the speed of parsing in Test 2 by the way you store the data. There is Godoc documentation for the database APIs available to the chaincode here: https://godoc.org/github.com/hyperledger/fabric/core/chaincode/shim
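As a rough sketch of what such an "add new record" method and query method might look like, the code below works against a small hypothetical StateStore interface rather than the real shim. The interface is an assumption modelled on the get/put style of state access, and MemStore, AddRecord and QueryByID are illustrative names, so the logic can be run without a peer; adapting it to the actual stub type is left to the shim documentation linked above.

```go
package main

import (
	"encoding/json"
	"errors"
	"strings"
)

// StateStore is a hypothetical, minimal stand-in for the state access a
// chaincode gets from the shim. It is NOT the real shim interface.
type StateStore interface {
	GetState(key string) ([]byte, error)
	PutState(key string, value []byte) error
}

// MemStore is an in-memory StateStore for trying the logic outside a peer.
type MemStore map[string][]byte

func (m MemStore) GetState(key string) ([]byte, error) { return m[key], nil }

func (m MemStore) PutState(key string, value []byte) error {
	m[key] = value
	return nil
}

// AddRecord appends one row (e.g. "AX101 12345 PQR 101") to the JSON list
// stored under the row's ID.
func AddRecord(store StateStore, row string) error {
	parts := strings.Fields(row)
	if len(parts) < 2 {
		return errors.New("malformed row: " + row)
	}
	entries, err := QueryByID(store, parts[0])
	if err != nil {
		return err
	}
	entries = append(entries, strings.Join(parts[1:], " "))
	raw, err := json.Marshal(entries)
	if err != nil {
		return err
	}
	return store.PutState(parts[0], raw)
}

// QueryByID returns every entry stored for id, or a nil slice if there is none.
func QueryByID(store StateStore, id string) ([]string, error) {
	raw, err := store.GetState(id)
	if err != nil {
		return nil, err
	}
	if len(raw) == 0 {
		return nil, nil
	}
	var entries []string
	if err := json.Unmarshal(raw, &entries); err != nil {
		return nil, err
	}
	return entries, nil
}
```

Storing the entries as a list under the ID makes the Test 2 lookup a single read, at the cost of a read-modify-write on every insert for that ID.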

Upvotes: 1
