Askannz

Reputation: 149

TensorFlow: how to share data across inputs?

Apologies if this is a trivial question, or if I am approaching this problem from the wrong end.

Say I have a dataset that looks like this:

[A, [a,b,c,d]], [B, [e,f,g]], [C, [i,j,k,l,m]], ...

Capital letters represent large data chunks, and lowercase letters smaller chunks. Each large chunk is associated with a variable number of small chunks.

Now, I need to train my network as follows: each input datapoint is a (big chunk, small chunk) pair, associated with a target label.

(A,a) ----> label 1
(A,b) ----> label 2
(A,c) ----> label 3
(A,d) ----> label 4

(B,e) ----> label 5
(B,f) ----> label 6
...

and so on...

As you can see, the big data chunks are re-used across multiple inputs.

I would like to know the best way to feed my initial dataset into TensorFlow.


Idea 1: Obviously, I could simply rearrange the dataset into a flat sequence of datapoints

 (A,a),(A,b),(A,c),(A,d),(B,e),(B,f),...

But that would mean duplicating the large chunks, which wastes memory.
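One way to keep the flat sequence of Idea 1 without copying the large chunks is to store each big chunk once and let every datapoint refer to it by key. The following is only a minimal plain-Python sketch of that idea; `big_chunks`, `groups`, `make_pairs` and `resolve` are hypothetical names, and the lists stand in for real data. In TensorFlow the same lookup could presumably be done with integer indices and `tf.gather`, but that is an assumption, not something tested here.

```python
# Hypothetical stand-ins: real big chunks would be large arrays.
big_chunks = {"A": [0.0] * 1000, "B": [1.0] * 1000}

# The grouped dataset from the question: one big chunk, many small chunks.
groups = [("A", ["a", "b", "c", "d"]), ("B", ["e", "f", "g"])]


def make_pairs(groups):
    """Flatten the grouped dataset into (big_key, small_chunk) pairs.

    Only the key of the big chunk is stored, so nothing large is copied.
    """
    return [(big_key, small) for big_key, smalls in groups for small in smalls]


def resolve(pair):
    """Look up the big chunk only when the datapoint is actually needed."""
    big_key, small = pair
    return big_chunks[big_key], small


pairs = make_pairs(groups)
big, small = resolve(pairs[0])
```

Every pair that mentions "A" resolves to the same object in memory, so the flat list costs one key per datapoint rather than one large chunk per datapoint.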


Idea 2: I could divide the neural network into two sub-networks like this:

Big chunk ----> Network 1
                     \
                      \
Small chunk -----------\-----> Network 2 ----> Output

This seems more efficient, and I guess there would be a way to factor out the computation shared by datapoints with the same big chunk. But how do I tell TensorFlow to iterate over two dependent input datasets?
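The factoring in Idea 2 can be illustrated without any framework: run the first sub-network once per big chunk and reuse its output for every small chunk in that group. This is a sketch under stated assumptions, not a TensorFlow implementation; `network_1` and `network_2` are hypothetical placeholders (a mean and an addition) standing in for real sub-networks.

```python
def network_1(big):
    """Placeholder for the expensive sub-network: here, just a mean."""
    return sum(big) / len(big)


def network_2(big_features, small):
    """Placeholder for the cheap sub-network that combines both inputs."""
    return big_features + small


def forward_grouped(groups):
    """Run network_1 once per big chunk, network_2 once per small chunk."""
    outputs = []
    network_1_calls = 0
    for big, smalls in groups:
        features = network_1(big)  # computed once, shared by the whole group
        network_1_calls += 1
        outputs.extend(network_2(features, s) for s in smalls)
    return outputs, network_1_calls


# Toy data: two big chunks, with three and one small chunks respectively.
toy_groups = [([1.0, 3.0], [0.1, 0.2, 0.3]), ([10.0, 20.0], [0.5])]
outputs, network_1_calls = forward_grouped(toy_groups)
```

With this grouping, the number of `network_1` evaluations equals the number of big chunks rather than the number of datapoints.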

Upvotes: 1

Views: 72

Answers (1)

Karthik Tsaliki

Reputation: 196

You should split your data into batches and feed each batch to your neural network. This not only solves your problem, it also scales to larger datasets.

(A,a) ----> label 1
(A,b) ----> label 2
(A,c) ----> label 3
(A,d) ----> label 4

(B,e) ----> label 5
(B,f) ----> label 6

(C,e) ----> label 5
(C,f) ----> label 6

into

Batch 1: (A,a),(A,b),(B,e),(C,f),...
Batch 2: (A,c),(A,d),(C,e),(B,f),...

Apply your cost function. Choose an optimizer and start training your network.
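As a rough illustration of the batching described above, the sketch below shuffles labelled pairs and cuts them into fixed-size batches. It is plain Python rather than TensorFlow code; `make_batches`, `labelled_pairs` and the `seed` parameter are hypothetical names introduced only for this example.

```python
import random


def make_batches(pairs, batch_size, seed=0):
    """Shuffle a copy of the pairs and split it into batches.

    The last batch may be smaller if the data does not divide evenly.
    """
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + batch_size]
            for i in range(0, len(shuffled), batch_size)]


# Each entry is ((big_key, small_chunk), label); keys avoid copying big chunks.
labelled_pairs = [
    (("A", "a"), 1), (("A", "b"), 2), (("A", "c"), 3), (("A", "d"), 4),
    (("B", "e"), 5), (("B", "f"), 6),
]
batches = make_batches(labelled_pairs, batch_size=4)
```

Each batch mixes pairs from different big chunks, as in the batch listing above, and every datapoint appears in exactly one batch per pass.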

Upvotes: 1
