the_cheff

Reputation: 5040

TensorFlowSharp results GetValue() is very slow

I am using TensorflowSharp to run evaluations using a neural network on an Android phone. I am building the project with Unity.

I am using the TensorFlowSharp Unity plugin listed under the requirements here: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Using-TensorFlow-Sharp-in-Unity.md.

Everything works, but extracting the result is very slow.

The network I am running is an autoencoder, and the output is an image with dimensions 128x128x16 (yes, there are a lot of output channels).

The evaluation is done in ~0.2 seconds, which is acceptable. However, when I need to extract the result data using results[0].GetValue(), it is VERY slow.

This is the code where I run the neural network:

var runner = session.GetRunner();
runner.AddInput(graph[INPUT_NAME][0], tensor).Fetch(graph[OUTPUT_NAME][0]);
var results = runner.Run();

float[,,,] heatmaps = results[0].GetValue() as float[,,,]; // <- this is SLOW

The problem: the last line, where I convert the result to floats, takes ~1.2 seconds.
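For reference, per-step timings like these can be collected with System.Diagnostics.Stopwatch. This is only a minimal sketch: the StepTimer class and its TimeMs method are hypothetical helper names, not part of TensorFlowSharp or Unity.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of TensorFlowSharp or Unity.
static class StepTimer
{
    // Runs the given step once and returns the elapsed wall-clock time in milliseconds.
    public static double TimeMs(Action step)
    {
        var sw = Stopwatch.StartNew();
        step();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }
}
```

Wrapping runner.Run() in one call and results[0].GetValue() in another, for example StepTimer.TimeMs(() => { var v = results[0].GetValue(); }), gives the two durations separately.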

Can it really be true that reading the result data into a float array takes more than 5 times as long as the actual evaluation of the network?

Is there another way to extract the result values?

Upvotes: 2

Views: 650

Answers (1)

the_cheff

Reputation: 5040

So I have found a solution to this. I still do not know why the GetValue() call is so slow, but I found another way to retrieve the data.

I chose to manually read the raw tensor data available at results[0].Data.

I created a small function to handle this as a drop-in replacement for GetValue() (here with the dimensions I am expecting hardcoded):

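    // Drop-in replacement for GetValue(): copies the raw tensor bytes and
    // converts them to floats by hand. The output shape is hardcoded.
    private float[,,,] TensorToFLoats(TFTensor tensor)
    {
        // Pointer to the native tensor buffer and its size in bytes.
        IntPtr resData = tensor.Data;
        UIntPtr dataSize = tensor.TensorByteSize;

        // Copy the native buffer into managed memory in a single call.
        byte[] s_ImageBuffer = new byte[(int)dataSize];
        System.Runtime.InteropServices.Marshal.Copy(resData, s_ImageBuffer, 0, (int)dataSize);

        // Reinterpret each group of 4 bytes as one 32-bit float.
        int floatsLength = s_ImageBuffer.Length / 4;
        float[] floats = new float[floatsLength];
        for (int n = 0; n < s_ImageBuffer.Length; n += 4)
        {
            floats[n / 4] = BitConverter.ToSingle(s_ImageBuffer, n);
        }

        // Reshape the flat data into [1, y, x, channel].
        float[,,,] result = new float[1, 128, 128, 16];
        int i = 0;
        for (int y = 0; y < 128; y++)
        {
            for (int x = 0; x < 128; x++)
            {
                for (int p = 0; p < 16; p++)
                {
                    result[0, y, x, p] = floats[i++];
                }
            }
        }
        return result;
    }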

Given this, I can replace the code in my question with the following:

var runner = session.GetRunner();
runner.AddInput(graph[INPUT_NAME][0], tensor).Fetch(graph[OUTPUT_NAME][0]);
var results = runner.Run();

float[,,,] heatmaps = TensorToFLoats(results[0]);

This is dramatically faster. Where GetValue() took ~1 second, the replacement function returns the same data in ~0.02 seconds, roughly 50 times faster.
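A possible further simplification, offered as a sketch that I have not benchmarked on a device: Marshal.Copy has an overload that writes straight into a float[], and Buffer.BlockCopy can fill a multidimensional array of primitives, so the intermediate byte[] and the nested loops may not be needed. The TensorCopy class and ToFloat4D method are hypothetical names. The code assumes the tensor holds float32 data in row-major order and that the caller passes the correct shape.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of TensorFlowSharp.
static class TensorCopy
{
    // Copies a native float32 buffer into a managed [d0, d1, d2, d3] array.
    // Assumes row-major float32 data whose byte size matches the given shape.
    public static float[,,,] ToFloat4D(IntPtr data, int byteSize, int d0, int d1, int d2, int d3)
    {
        int count = d0 * d1 * d2 * d3;
        if (byteSize != count * sizeof(float))
            throw new ArgumentException("Buffer size does not match the requested shape.");

        // Marshal.Copy has a float[] overload, so no byte[] or BitConverter is needed.
        float[] flat = new float[count];
        Marshal.Copy(data, flat, 0, count);

        // .NET multidimensional arrays are also stored row-major,
        // so a raw block copy preserves the element order.
        var result = new float[d0, d1, d2, d3];
        Buffer.BlockCopy(flat, 0, result, 0, byteSize);
        return result;
    }
}
```

With the tensor from above, the call would look like TensorCopy.ToFloat4D(results[0].Data, (int)results[0].TensorByteSize, 1, 128, 128, 16). The size check throws early if the shape is wrong instead of silently reading past the buffer.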

Upvotes: 1
