kwotsin

Reputation: 2923

TensorFlow: Is there a way to measure FLOPS for a model?

The closest example I can find is in this issue: https://github.com/tensorflow/tensorflow/issues/899

It comes with this minimal reproducible example:

import tensorflow as tf
import tensorflow.python.framework.ops as ops

g = tf.Graph()
with g.as_default():
    A = tf.Variable(tf.random_normal([25, 16]))
    B = tf.Variable(tf.random_normal([16, 9]))
    C = tf.matmul(A, B)  # shape=[25,9]

for op in g.get_operations():
    flops = ops.get_stats_for_node_def(g, op.node_def, 'flops').value
    if flops is not None:
        print('Flops should be ~', 2*25*16*9)
        print('25 x 25 x 9 would be', 2*25*25*9)  # ignores internal dim, repeats first
        print('TF stats gives', flops)

However, the FLOPs value returned is always None. Is there a way to concretely measure the FLOPs, especially for a pb file?

Upvotes: 27

Views: 33640

Answers (4)

mafu

Reputation: 32700

Another user posted an answer that was deleted by a moderator and cannot be restored. Since it solves the problem better than the other answers do, I repeat it here.


You can use the following pip package to get basic information such as the model's memory requirement, number of parameters, FLOPs, etc.:

https://pypi.org/project/model-profiler
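
Install it from PyPI first:

pip install model-profiler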

It will output something like:

Model Profile                        Value                  Unit
----------------------------------------------------------------
Selected GPUs                        ['0', '1']             GPU IDs
No. of FLOPs                         0.30932349055999997    BFLOPs
GPU Memory Requirement               7.4066760912537575     GB
Model Parameters                     138.357544             Million
Memory Required by Model Weights     527.7921447753906      MB

Usage

[Adapted from the library's documentation]

from tensorflow.keras.applications import VGG16
from model_profiler import model_profiler

model = VGG16(include_top=True)

batch_size = 128
profile = model_profiler(model, batch_size)

print(profile)

Upvotes: 3

Madhavan Seshadri

Reputation: 321

The above approaches no longer work in TF 2.x, since the profiler methods have been deprecated and moved under compat.v1. It seems this feature still has to be reimplemented; a sketch of the usual workaround is given below.

The corresponding GitHub issue: https://github.com/tensorflow/tensorflow/issues/32809
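
A minimal sketch of that workaround, assuming a tf.keras model with a single input: trace the model into a frozen graph, then run the legacy profiler on it. The helper name get_flops is mine, and convert_variables_to_constants_v2_as_graph is an internal TensorFlow utility rather than a stable public API.

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2_as_graph,
)

def get_flops(model, batch_size=1):
    # Trace the Keras model into a concrete function with a fixed batch size.
    input_spec = tf.TensorSpec([batch_size] + list(model.inputs[0].shape[1:]),
                               model.inputs[0].dtype)
    concrete = tf.function(model).get_concrete_function(input_spec)

    # Freeze the variables into constants so that initialisation ops
    # do not inflate the FLOP count (see BiBi's answer below).
    _, graph_def = convert_variables_to_constants_v2_as_graph(concrete)

    # Profile the frozen graph with the legacy profiler kept under compat.v1.
    with tf.Graph().as_default() as graph:
        tf.compat.v1.import_graph_def(graph_def, name='')
        opts = tf.compat.v1.profiler.ProfileOptionBuilder.float_operation()
        flops = tf.compat.v1.profiler.profile(graph, cmd='op', options=opts)
    return flops.total_float_ops

# Hypothetical usage:
# print(get_flops(tf.keras.applications.MobileNet(alpha=0.75)))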

Upvotes: 5

BiBi

Reputation: 7908

I would like to build on Tobias Scheck's answer as well as answer the original question: how to get the FLOPs from a pb file.

Running the first snippet of code from Tobias' answer with TensorFlow 1.6.0:

import tensorflow as tf

g = tf.Graph()
run_meta = tf.RunMetadata()
with g.as_default():
    A = tf.Variable(tf.random_normal([25,16]))
    B = tf.Variable(tf.random_normal([16,9]))
    C = tf.matmul(A,B)

    opts = tf.profiler.ProfileOptionBuilder.float_operation()    
    flops = tf.profiler.profile(g, run_meta=run_meta, cmd='op', options=opts)
    if flops is not None:
        print('Flops should be ~',2*25*16*9)
        print('TF stats gives',flops.total_float_ops)

We get the following output:

Flops should be ~ 7200
TF stats gives 8288

So, why do we get 8288 instead of the expected result 7200 = 2*25*16*9 [a]? The answer lies in the way the tensors A and B are initialised. Initialising with a Gaussian distribution costs some FLOPs. Changing the definitions of A and B to

    A = tf.Variable(initial_value=tf.zeros([25, 16]))
    B = tf.Variable(initial_value=tf.zeros([16, 9]))

gives the expected output 7200.

Usually, a network's variables are initialised with Gaussian distributions, among other schemes. Most of the time we are not interested in the initialisation FLOPs, since they are performed once during initialisation and happen during neither training nor inference. So, how could one get the exact number of FLOPs while disregarding the initialisation FLOPs?

Freeze the graph into a pb file. Calculating the FLOPs from a pb file was, in fact, the OP's use case.

The following snippet illustrates this:

import tensorflow as tf
from tensorflow.python.framework import graph_util

def load_pb(pb):
    with tf.gfile.GFile(pb, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
        return graph

# ***** (1) Create Graph *****
g = tf.Graph()
sess = tf.Session(graph=g)
with g.as_default():
    A = tf.Variable(initial_value=tf.random_normal([25, 16]))
    B = tf.Variable(initial_value=tf.random_normal([16, 9]))
    C = tf.matmul(A, B, name='output')
    sess.run(tf.global_variables_initializer())
    flops = tf.profiler.profile(g, options = tf.profiler.ProfileOptionBuilder.float_operation())
    print('FLOP before freezing', flops.total_float_ops)
# *****************************        

# ***** (2) freeze graph *****
output_graph_def = graph_util.convert_variables_to_constants(sess, g.as_graph_def(), ['output'])

with tf.gfile.GFile('graph.pb', "wb") as f:
    f.write(output_graph_def.SerializeToString())
# *****************************


# ***** (3) Load frozen graph *****
g2 = load_pb('./graph.pb')
with g2.as_default():
    flops = tf.profiler.profile(g2, options = tf.profiler.ProfileOptionBuilder.float_operation())
    print('FLOP after freezing', flops.total_float_ops)

This outputs:

FLOP before freezing 8288
FLOP after freezing 7200

[a] Usually, the FLOP count of a matrix multiplication is mq(2p − 1) for the product AB, where A is [m, p] and B is [p, q], but TensorFlow returns 2mpq for some reason. An issue has been opened to understand why.
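
For the example above (m = 25, p = 16, q = 9), the textbook formula gives 25 · 9 · (2 · 16 − 1) = 6975 FLOPs, while TensorFlow's 2mpq convention gives 2 · 25 · 16 · 9 = 7200, which matches the profiler output after freezing.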

Upvotes: 29

Tobias Scheck

Reputation: 633

A little bit late, but maybe it helps some future visitors. For your example, I successfully tested the following snippet:

import tensorflow as tf

g = tf.Graph()
run_meta = tf.RunMetadata()
with g.as_default():
    A = tf.Variable(tf.random_normal( [25,16] ))
    B = tf.Variable(tf.random_normal( [16,9] ))
    C = tf.matmul(A,B) # shape=[25,9]

    opts = tf.profiler.ProfileOptionBuilder.float_operation()    
    flops = tf.profiler.profile(g, run_meta=run_meta, cmd='op', options=opts)
    if flops is not None:
        print('Flops should be ~',2*25*16*9)
        print('25 x 25 x 9 would be',2*25*25*9) # ignores internal dim, repeats first
        print('TF stats gives',flops.total_float_ops)

It's also possible to use the profiler in combination with Keras, as in the following snippet:

import tensorflow as tf
import keras.backend as K
from keras.applications.mobilenet import MobileNet

run_meta = tf.RunMetadata()
with tf.Session(graph=tf.Graph()) as sess:
    K.set_session(sess)
    net = MobileNet(alpha=.75, input_tensor=tf.placeholder('float32', shape=(1,32,32,3)))

    opts = tf.profiler.ProfileOptionBuilder.float_operation()    
    flops = tf.profiler.profile(sess.graph, run_meta=run_meta, cmd='op', options=opts)

    opts = tf.profiler.ProfileOptionBuilder.trainable_variables_parameter()    
    params = tf.profiler.profile(sess.graph, run_meta=run_meta, cmd='op', options=opts)

    print("{:,} --- {:,}".format(flops.total_float_ops, params.total_parameters))

I hope this helps!

Upvotes: 23
