Model Input Name: unique_ids_raw_output___9:0, Shape: [0]
Model Input Name: segment_ids:0, Shape: [0, 256]
Model Input Name: input_mask:0, Shape: [0, 256]
Model Input Name: input_ids:0, Shape: [0, 256]
Starting model execution...

Inputs Details:
Input Name: input_ids:0
Shape: (1, 256)
Data (first 10 values): [20201 26146 26630  6768 27993 13863 27651 11274  8810 27497]...
--------------------------------------------------
Input Name: segment_ids:0
Shape: (1, 256)
Data (first 10 values): [0 0 1 0 1 1 0 1 1 0]...
--------------------------------------------------
Input Name: input_mask:0
Shape: (1, 256)
Data (first 10 values): [1 1 0 1 1 1 0 1 0 1]...
--------------------------------------------------
Input Name: unique_ids_raw_output___9:0
Shape: (0,)
Data (first 10 values): []...
--------------------------------------------------
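The feed printed above could be assembled along the following lines. This is a minimal sketch assuming a plain NumPy front end; the vocabulary size, the int64 dtypes, and the random placeholder values are assumptions, since the preprocessing that produced these inputs is not part of this log.

import numpy as np

SEQ_LEN = 256          # matches the (1, 256) shapes printed above
VOCAB_SIZE = 30522     # assumption: standard BERT-base uncased vocabulary

# Placeholder feed; real input_ids/segment_ids/input_mask would come from a tokenizer.
# dtype is assumed int64 here; the engine may expect int32 instead.
rng = np.random.default_rng(0)
feed = {
    "input_ids:0":   rng.integers(0, VOCAB_SIZE, size=(1, SEQ_LEN), dtype=np.int64),
    "segment_ids:0": rng.integers(0, 2, size=(1, SEQ_LEN), dtype=np.int64),
    "input_mask:0":  rng.integers(0, 2, size=(1, SEQ_LEN), dtype=np.int64),
    "unique_ids_raw_output___9:0": np.empty((0,), dtype=np.int64),  # empty, shape (0,)
}

# Reproduces the "Inputs Details" formatting seen above.
for name, arr in feed.items():
    print(f"Input Name: {name}")
    print(f"Shape: {arr.shape}")
    print(f"Data (first 10 values): {arr.reshape(-1)[:10]}...")
    print("-" * 50)
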
No Add node related to MatMul output: bert/embeddings/MatMul. Executing regular MatMul.
Fusing MatMul with Add for Node: bert/encoder/layer_0/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_0/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_0/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_0/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_0/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_0/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_0/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_0/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_0/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_0/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_1/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_1/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_1/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_1/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_1/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_1/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_1/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_1/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_1/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_1/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_2/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_2/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_2/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_2/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_2/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_2/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_2/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_2/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_2/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_2/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_3/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_3/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_3/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_3/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_3/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_3/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_3/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_3/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_3/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_3/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_4/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_4/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_4/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_4/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_4/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_4/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_4/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_4/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_4/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_4/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_5/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_5/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_5/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_5/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_5/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_5/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_5/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_5/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_5/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_5/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_6/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_6/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_6/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_6/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_6/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_6/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_6/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_6/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_6/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_6/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_7/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_7/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_7/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_7/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_7/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_7/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_7/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_7/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_7/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_7/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_8/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_8/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_8/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_8/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_8/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_8/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_8/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_8/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_8/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_8/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_9/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_9/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_9/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_9/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_9/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_9/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_9/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_9/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_9/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_9/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_10/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_10/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_10/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_10/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_10/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_10/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_10/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_10/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_10/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_10/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_11/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_11/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_11/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_11/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_11/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_11/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_11/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_11/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_11/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_11/output/add
Fusing MatMul with Add for Node: MatMul
Skipping already processed Add Node: BiasAdd
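
The fusion messages above suggest that, for each MatMul, the engine looks ahead for a single trailing Add (the bias) and, for the dense output projections, a second Add (the residual connection), fuses whatever it finds, and then skips those Add nodes when the execution loop reaches them. The sketch below reproduces that decision logic over an ONNX graph; the helper names and the exact fusion criteria (single-consumer Add chains) are assumptions rather than the engine's actual implementation, and the kernel execution itself is omitted.

import onnx

def consumers(graph, tensor_name):
    """All nodes that take tensor_name as an input."""
    return [n for n in graph.node if tensor_name in n.input]

def plan_matmul_add_fusion(model_path):
    """Sketch of the MatMul + Add / 2Add fusion decisions reflected in the log above."""
    graph = onnx.load(model_path).graph
    fused_adds = set()  # names of Add nodes absorbed into a fused MatMul

    for node in graph.node:  # graph.node is topologically sorted in a valid ONNX model
        if node.op_type == "Add" and node.name in fused_adds:
            print(f"Skipping already processed Add Node: {node.name}")
            continue
        if node.op_type != "MatMul":
            continue  # other op types would be dispatched normally here

        adds, out = [], node.output[0]
        for _ in range(2):  # bias Add, then (for output projections) the residual Add
            nxt = consumers(graph, out)
            if len(nxt) != 1 or nxt[0].op_type != "Add":
                break  # assumed criterion: follow only single-consumer Add chains
            adds.append(nxt[0])
            out = nxt[0].output[0]

        if not adds:
            print(f"No Add node related to MatMul output: {node.name}. "
                  "Executing regular MatMul.")
        else:
            label = "Add" if len(adds) == 1 else "2Add"
            print(f"Fusing MatMul with {label} for Node: {node.name}")
            fused_adds.update(a.name for a in adds)

    return fused_adds

The single-consumer rule also explains why the intermediate (GELU) dense layers fuse only one Add: their BiasAdd output fans out into the Pow/mul/add chain of the activation, so the look-ahead stops after the bias.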

Node Execution Times:
Node: unique_ids_graph_outputs_Identity__10, Execution Time: 0.000004 seconds
Node: bert/encoder/Shape, Execution Time: 0.000004 seconds
Node: bert/encoder/Shape__12, Execution Time: 0.000009 seconds
Node: bert/encoder/strided_slice, Execution Time: 0.000068 seconds
Node: bert/encoder/strided_slice__16, Execution Time: 0.000005 seconds
Node: bert/encoder/strided_slice__17, Execution Time: 0.000004 seconds
Node: bert/encoder/ones/packed_Unsqueeze__18, Execution Time: 0.000015 seconds
Node: bert/encoder/ones/packed_Concat__21, Execution Time: 0.000011 seconds
Node: bert/encoder/ones__22, Execution Time: 0.000004 seconds
Node: bert/encoder/ones, Execution Time: 0.000010 seconds
Node: bert/encoder/Reshape, Execution Time: 0.000006 seconds
Node: bert/encoder/Cast, Execution Time: 0.000004 seconds
Node: bert/encoder/mul, Execution Time: 0.029473 seconds
Node: bert/encoder/layer_9/attention/self/ExpandDims, Execution Time: 0.000034 seconds
Node: bert/encoder/layer_9/attention/self/sub, Execution Time: 0.006731 seconds
Node: bert/encoder/layer_9/attention/self/mul_1, Execution Time: 0.000359 seconds
Node: bert/embeddings/Reshape_2, Execution Time: 0.000050 seconds
Node: bert/embeddings/Reshape, Execution Time: 0.000007 seconds
Node: bert/embeddings/GatherV2, Execution Time: 0.000377 seconds
Node: bert/embeddings/Reshape_1, Execution Time: 0.000007 seconds
Node: bert/embeddings/one_hot, Execution Time: 0.000068 seconds
Node: bert/embeddings/MatMul, Execution Time: 0.060497 seconds
Node: bert/embeddings/Reshape_3, Execution Time: 0.000037 seconds
Node: bert/embeddings/add, Execution Time: 0.002181 seconds
Node: bert/embeddings/add_1, Execution Time: 0.001040 seconds
Node: bert/embeddings/LayerNorm/moments/mean, Execution Time: 0.005559 seconds
Node: bert/embeddings/LayerNorm/moments/SquaredDifference, Execution Time: 0.000734 seconds
Node: bert/embeddings/LayerNorm/moments/SquaredDifference__72, Execution Time: 0.000928 seconds
Node: bert/embeddings/LayerNorm/moments/variance, Execution Time: 0.000274 seconds
Node: bert/embeddings/LayerNorm/batchnorm/add, Execution Time: 0.000090 seconds
Node: bert/embeddings/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.010432 seconds
Node: bert/embeddings/LayerNorm/batchnorm/Rsqrt__74, Execution Time: 0.005194 seconds
Node: bert/embeddings/LayerNorm/batchnorm/mul, Execution Time: 0.000105 seconds
Node: bert/embeddings/LayerNorm/batchnorm/mul_2, Execution Time: 0.000120 seconds
Node: bert/embeddings/LayerNorm/batchnorm/sub, Execution Time: 0.000089 seconds
Node: bert/embeddings/LayerNorm/batchnorm/mul_1, Execution Time: 0.000674 seconds
Node: bert/embeddings/LayerNorm/batchnorm/add_1, Execution Time: 0.000725 seconds
Node: bert/encoder/Reshape_1, Execution Time: 0.000037 seconds
Matmul Fuse Node: bert/encoder/layer_0/attention/self/value/MatMul, Execution Time: 0.031023 seconds
Node: bert/encoder/layer_0/attention/self/Reshape_2, Execution Time: 0.000024 seconds
Node: bert/encoder/layer_0/attention/self/transpose_2, Execution Time: 0.000432 seconds
Matmul Fuse Node: bert/encoder/layer_0/attention/self/query/MatMul, Execution Time: 0.002901 seconds
Node: bert/encoder/layer_0/attention/self/Reshape, Execution Time: 0.000016 seconds
Node: bert/encoder/layer_0/attention/self/transpose, Execution Time: 0.000254 seconds
Matmul Fuse Node: bert/encoder/layer_0/attention/self/key/MatMul, Execution Time: 0.002810 seconds
Node: bert/encoder/layer_0/attention/self/Reshape_1, Execution Time: 0.000016 seconds
Node: bert/encoder/layer_0/attention/self/MatMul__306, Execution Time: 0.000322 seconds
Node: bert/encoder/layer_0/attention/self/MatMul, Execution Time: 0.005185 seconds
Node: bert/encoder/layer_0/attention/self/Mul, Execution Time: 0.001247 seconds
Node: bert/encoder/layer_0/attention/self/add, Execution Time: 0.001908 seconds
Node: bert/encoder/layer_0/attention/self/Softmax, Execution Time: 0.013543 seconds
Node: bert/encoder/layer_0/attention/self/MatMul_1, Execution Time: 0.001892 seconds
Node: bert/encoder/layer_0/attention/self/transpose_3, Execution Time: 0.000263 seconds
Node: bert/encoder/layer_0/attention/self/Reshape_3, Execution Time: 0.000043 seconds
Matmul Fuse Node: bert/encoder/layer_0/attention/output/dense/MatMul, Execution Time: 0.001390 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/moments/mean, Execution Time: 0.000208 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000246 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/moments/SquaredDifference__309, Execution Time: 0.000502 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/moments/variance, Execution Time: 0.000160 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000123 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000085 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/Rsqrt__311, Execution Time: 0.000123 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000113 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000104 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000557 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000683 seconds
Matmul Fuse Node: bert/encoder/layer_0/intermediate/dense/MatMul, Execution Time: 0.005011 seconds
Node: bert/encoder/layer_0/intermediate/dense/Pow, Execution Time: 0.017370 seconds
Node: bert/encoder/layer_0/intermediate/dense/mul, Execution Time: 0.001233 seconds
Node: bert/encoder/layer_0/intermediate/dense/add, Execution Time: 0.001573 seconds
Node: bert/encoder/layer_0/intermediate/dense/mul_1, Execution Time: 0.001162 seconds
Node: bert/encoder/layer_0/intermediate/dense/Tanh, Execution Time: 0.003605 seconds
Node: bert/encoder/layer_0/intermediate/dense/add_1, Execution Time: 0.001278 seconds
Node: bert/encoder/layer_0/intermediate/dense/mul_2, Execution Time: 0.001427 seconds
Node: bert/encoder/layer_0/intermediate/dense/mul_3, Execution Time: 0.001574 seconds
Matmul Fuse Node: bert/encoder/layer_0/output/dense/MatMul, Execution Time: 0.002862 seconds
Node: bert/encoder/layer_0/output/LayerNorm/moments/mean, Execution Time: 0.000178 seconds
Node: bert/encoder/layer_0/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000243 seconds
Node: bert/encoder/layer_0/output/LayerNorm/moments/SquaredDifference__313, Execution Time: 0.000337 seconds
Node: bert/encoder/layer_0/output/LayerNorm/moments/variance, Execution Time: 0.000158 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/add, Execution Time: 0.000083 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/Rsqrt__315, Execution Time: 0.000074 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/mul, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/sub, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000343 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000426 seconds
Matmul Fuse Node: bert/encoder/layer_1/attention/self/value/MatMul, Execution Time: 0.002846 seconds
Node: bert/encoder/layer_1/attention/self/Reshape_2, Execution Time: 0.000021 seconds
Node: bert/encoder/layer_1/attention/self/transpose_2, Execution Time: 0.000252 seconds
Matmul Fuse Node: bert/encoder/layer_1/attention/self/query/MatMul, Execution Time: 0.002622 seconds
Node: bert/encoder/layer_1/attention/self/Reshape, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_1/attention/self/transpose, Execution Time: 0.000248 seconds
Matmul Fuse Node: bert/encoder/layer_1/attention/self/key/MatMul, Execution Time: 0.002566 seconds
Node: bert/encoder/layer_1/attention/self/Reshape_1, Execution Time: 0.000012 seconds
Node: bert/encoder/layer_1/attention/self/MatMul__320, Execution Time: 0.000247 seconds
Node: bert/encoder/layer_1/attention/self/MatMul, Execution Time: 0.001395 seconds
Node: bert/encoder/layer_1/attention/self/Mul, Execution Time: 0.001206 seconds
Node: bert/encoder/layer_1/attention/self/add, Execution Time: 0.001890 seconds
Node: bert/encoder/layer_1/attention/self/Softmax, Execution Time: 0.002451 seconds
Node: bert/encoder/layer_1/attention/self/MatMul_1, Execution Time: 0.001359 seconds
Node: bert/encoder/layer_1/attention/self/transpose_3, Execution Time: 0.000303 seconds
Node: bert/encoder/layer_1/attention/self/Reshape_3, Execution Time: 0.000055 seconds
Matmul Fuse Node: bert/encoder/layer_1/attention/output/dense/MatMul, Execution Time: 0.001400 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/moments/mean, Execution Time: 0.000198 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000259 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/moments/SquaredDifference__323, Execution Time: 0.000511 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/moments/variance, Execution Time: 0.000170 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000080 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000062 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/Rsqrt__325, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000089 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000093 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000684 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000756 seconds
Matmul Fuse Node: bert/encoder/layer_1/intermediate/dense/MatMul, Execution Time: 0.004641 seconds
Node: bert/encoder/layer_1/intermediate/dense/Pow, Execution Time: 0.000746 seconds
Node: bert/encoder/layer_1/intermediate/dense/mul, Execution Time: 0.001337 seconds
Node: bert/encoder/layer_1/intermediate/dense/add, Execution Time: 0.001703 seconds
Node: bert/encoder/layer_1/intermediate/dense/mul_1, Execution Time: 0.001359 seconds
Node: bert/encoder/layer_1/intermediate/dense/Tanh, Execution Time: 0.001474 seconds
Node: bert/encoder/layer_1/intermediate/dense/add_1, Execution Time: 0.001539 seconds
Node: bert/encoder/layer_1/intermediate/dense/mul_2, Execution Time: 0.001379 seconds
Node: bert/encoder/layer_1/intermediate/dense/mul_3, Execution Time: 0.001708 seconds
Matmul Fuse Node: bert/encoder/layer_1/output/dense/MatMul, Execution Time: 0.002904 seconds
Node: bert/encoder/layer_1/output/LayerNorm/moments/mean, Execution Time: 0.000198 seconds
Node: bert/encoder/layer_1/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000254 seconds
Node: bert/encoder/layer_1/output/LayerNorm/moments/SquaredDifference__327, Execution Time: 0.000345 seconds
Node: bert/encoder/layer_1/output/LayerNorm/moments/variance, Execution Time: 0.000169 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/add, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000065 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/Rsqrt__329, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/mul, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/sub, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000346 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000455 seconds
Matmul Fuse Node: bert/encoder/layer_2/attention/self/value/MatMul, Execution Time: 0.003318 seconds
Node: bert/encoder/layer_2/attention/self/Reshape_2, Execution Time: 0.000024 seconds
Node: bert/encoder/layer_2/attention/self/transpose_2, Execution Time: 0.000264 seconds
Matmul Fuse Node: bert/encoder/layer_2/attention/self/query/MatMul, Execution Time: 0.002859 seconds
Node: bert/encoder/layer_2/attention/self/Reshape, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_2/attention/self/transpose, Execution Time: 0.000252 seconds
Matmul Fuse Node: bert/encoder/layer_2/attention/self/key/MatMul, Execution Time: 0.002846 seconds
Node: bert/encoder/layer_2/attention/self/Reshape_1, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_2/attention/self/MatMul__334, Execution Time: 0.000257 seconds
Node: bert/encoder/layer_2/attention/self/MatMul, Execution Time: 0.001547 seconds
Node: bert/encoder/layer_2/attention/self/Mul, Execution Time: 0.001303 seconds
Node: bert/encoder/layer_2/attention/self/add, Execution Time: 0.002226 seconds
Node: bert/encoder/layer_2/attention/self/Softmax, Execution Time: 0.002406 seconds
Node: bert/encoder/layer_2/attention/self/MatMul_1, Execution Time: 0.001246 seconds
Node: bert/encoder/layer_2/attention/self/transpose_3, Execution Time: 0.000253 seconds
Node: bert/encoder/layer_2/attention/self/Reshape_3, Execution Time: 0.000060 seconds
Matmul Fuse Node: bert/encoder/layer_2/attention/output/dense/MatMul, Execution Time: 0.001356 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/moments/mean, Execution Time: 0.000186 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000257 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/moments/SquaredDifference__337, Execution Time: 0.000349 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/moments/variance, Execution Time: 0.000164 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000065 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/Rsqrt__339, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000092 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000088 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000089 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000334 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000934 seconds
Matmul Fuse Node: bert/encoder/layer_2/intermediate/dense/MatMul, Execution Time: 0.004746 seconds
Node: bert/encoder/layer_2/intermediate/dense/Pow, Execution Time: 0.000761 seconds
Node: bert/encoder/layer_2/intermediate/dense/mul, Execution Time: 0.001376 seconds
Node: bert/encoder/layer_2/intermediate/dense/add, Execution Time: 0.001893 seconds
Node: bert/encoder/layer_2/intermediate/dense/mul_1, Execution Time: 0.001427 seconds
Node: bert/encoder/layer_2/intermediate/dense/Tanh, Execution Time: 0.001340 seconds
Node: bert/encoder/layer_2/intermediate/dense/add_1, Execution Time: 0.001351 seconds
Node: bert/encoder/layer_2/intermediate/dense/mul_2, Execution Time: 0.001364 seconds
Node: bert/encoder/layer_2/intermediate/dense/mul_3, Execution Time: 0.001718 seconds
Matmul Fuse Node: bert/encoder/layer_2/output/dense/MatMul, Execution Time: 0.002814 seconds
Node: bert/encoder/layer_2/output/LayerNorm/moments/mean, Execution Time: 0.000201 seconds
Node: bert/encoder/layer_2/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000253 seconds
Node: bert/encoder/layer_2/output/LayerNorm/moments/SquaredDifference__341, Execution Time: 0.000345 seconds
Node: bert/encoder/layer_2/output/LayerNorm/moments/variance, Execution Time: 0.000173 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/add, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000066 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/Rsqrt__343, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/mul, Execution Time: 0.000088 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/sub, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000329 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000446 seconds
Matmul Fuse Node: bert/encoder/layer_3/attention/self/value/MatMul, Execution Time: 0.003196 seconds
Node: bert/encoder/layer_3/attention/self/Reshape_2, Execution Time: 0.000024 seconds
Node: bert/encoder/layer_3/attention/self/transpose_2, Execution Time: 0.000274 seconds
Matmul Fuse Node: bert/encoder/layer_3/attention/self/query/MatMul, Execution Time: 0.002961 seconds
Node: bert/encoder/layer_3/attention/self/Reshape, Execution Time: 0.000016 seconds
Node: bert/encoder/layer_3/attention/self/transpose, Execution Time: 0.000265 seconds
Matmul Fuse Node: bert/encoder/layer_3/attention/self/key/MatMul, Execution Time: 0.002869 seconds
Node: bert/encoder/layer_3/attention/self/Reshape_1, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_3/attention/self/MatMul__348, Execution Time: 0.000258 seconds
Node: bert/encoder/layer_3/attention/self/MatMul, Execution Time: 0.001526 seconds
Node: bert/encoder/layer_3/attention/self/Mul, Execution Time: 0.001327 seconds
Node: bert/encoder/layer_3/attention/self/add, Execution Time: 0.002196 seconds
Node: bert/encoder/layer_3/attention/self/Softmax, Execution Time: 0.002330 seconds
Node: bert/encoder/layer_3/attention/self/MatMul_1, Execution Time: 0.001241 seconds
Node: bert/encoder/layer_3/attention/self/transpose_3, Execution Time: 0.000261 seconds
Node: bert/encoder/layer_3/attention/self/Reshape_3, Execution Time: 0.000058 seconds
Matmul Fuse Node: bert/encoder/layer_3/attention/output/dense/MatMul, Execution Time: 0.001283 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/moments/mean, Execution Time: 0.000198 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000278 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/moments/SquaredDifference__351, Execution Time: 0.000351 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/moments/variance, Execution Time: 0.000166 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000065 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/Rsqrt__353, Execution Time: 0.000088 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000088 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000121 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000647 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000765 seconds
Matmul Fuse Node: bert/encoder/layer_3/intermediate/dense/MatMul, Execution Time: 0.005178 seconds
Node: bert/encoder/layer_3/intermediate/dense/Pow, Execution Time: 0.000777 seconds
Node: bert/encoder/layer_3/intermediate/dense/mul, Execution Time: 0.001322 seconds
Node: bert/encoder/layer_3/intermediate/dense/add, Execution Time: 0.001694 seconds
Node: bert/encoder/layer_3/intermediate/dense/mul_1, Execution Time: 0.001363 seconds
Node: bert/encoder/layer_3/intermediate/dense/Tanh, Execution Time: 0.001367 seconds
Node: bert/encoder/layer_3/intermediate/dense/add_1, Execution Time: 0.001364 seconds
Node: bert/encoder/layer_3/intermediate/dense/mul_2, Execution Time: 0.001386 seconds
Node: bert/encoder/layer_3/intermediate/dense/mul_3, Execution Time: 0.001735 seconds
Matmul Fuse Node: bert/encoder/layer_3/output/dense/MatMul, Execution Time: 0.002793 seconds
Node: bert/encoder/layer_3/output/LayerNorm/moments/mean, Execution Time: 0.000198 seconds
Node: bert/encoder/layer_3/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000254 seconds
Node: bert/encoder/layer_3/output/LayerNorm/moments/SquaredDifference__355, Execution Time: 0.000350 seconds
Node: bert/encoder/layer_3/output/LayerNorm/moments/variance, Execution Time: 0.000165 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/add, Execution Time: 0.000089 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000067 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/Rsqrt__357, Execution Time: 0.000093 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/mul, Execution Time: 0.000102 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/sub, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000331 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000437 seconds
Matmul Fuse Node: bert/encoder/layer_4/attention/self/value/MatMul, Execution Time: 0.003309 seconds
Node: bert/encoder/layer_4/attention/self/Reshape_2, Execution Time: 0.000024 seconds
Node: bert/encoder/layer_4/attention/self/transpose_2, Execution Time: 0.000269 seconds
Matmul Fuse Node: bert/encoder/layer_4/attention/self/query/MatMul, Execution Time: 0.002913 seconds
Node: bert/encoder/layer_4/attention/self/Reshape, Execution Time: 0.000016 seconds
Node: bert/encoder/layer_4/attention/self/transpose, Execution Time: 0.000258 seconds
Matmul Fuse Node: bert/encoder/layer_4/attention/self/key/MatMul, Execution Time: 0.002810 seconds
Node: bert/encoder/layer_4/attention/self/Reshape_1, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_4/attention/self/MatMul__362, Execution Time: 0.000257 seconds
Node: bert/encoder/layer_4/attention/self/MatMul, Execution Time: 0.001524 seconds
Node: bert/encoder/layer_4/attention/self/Mul, Execution Time: 0.001288 seconds
Node: bert/encoder/layer_4/attention/self/add, Execution Time: 0.002067 seconds
Node: bert/encoder/layer_4/attention/self/Softmax, Execution Time: 0.002591 seconds
Node: bert/encoder/layer_4/attention/self/MatMul_1, Execution Time: 0.001322 seconds
Node: bert/encoder/layer_4/attention/self/transpose_3, Execution Time: 0.000259 seconds
Node: bert/encoder/layer_4/attention/self/Reshape_3, Execution Time: 0.000069 seconds
Matmul Fuse Node: bert/encoder/layer_4/attention/output/dense/MatMul, Execution Time: 0.001811 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/moments/mean, Execution Time: 0.000319 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000281 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/moments/SquaredDifference__365, Execution Time: 0.000349 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/moments/variance, Execution Time: 0.000166 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000101 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000066 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/Rsqrt__367, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000092 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000656 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000790 seconds
Matmul Fuse Node: bert/encoder/layer_4/intermediate/dense/MatMul, Execution Time: 0.004804 seconds
Node: bert/encoder/layer_4/intermediate/dense/Pow, Execution Time: 0.000735 seconds
Node: bert/encoder/layer_4/intermediate/dense/mul, Execution Time: 0.001317 seconds
Node: bert/encoder/layer_4/intermediate/dense/add, Execution Time: 0.001773 seconds
Node: bert/encoder/layer_4/intermediate/dense/mul_1, Execution Time: 0.001177 seconds
Node: bert/encoder/layer_4/intermediate/dense/Tanh, Execution Time: 0.001105 seconds
Node: bert/encoder/layer_4/intermediate/dense/add_1, Execution Time: 0.001135 seconds
Node: bert/encoder/layer_4/intermediate/dense/mul_2, Execution Time: 0.001167 seconds
Node: bert/encoder/layer_4/intermediate/dense/mul_3, Execution Time: 0.001374 seconds
Matmul Fuse Node: bert/encoder/layer_4/output/dense/MatMul, Execution Time: 0.002819 seconds
Node: bert/encoder/layer_4/output/LayerNorm/moments/mean, Execution Time: 0.000196 seconds
Node: bert/encoder/layer_4/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000250 seconds
Node: bert/encoder/layer_4/output/LayerNorm/moments/SquaredDifference__369, Execution Time: 0.000369 seconds
Node: bert/encoder/layer_4/output/LayerNorm/moments/variance, Execution Time: 0.000168 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/add, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000066 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/Rsqrt__371, Execution Time: 0.000086 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/mul, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000083 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/sub, Execution Time: 0.000093 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000336 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000502 seconds
Matmul Fuse Node: bert/encoder/layer_5/attention/self/value/MatMul, Execution Time: 0.004371 seconds
Node: bert/encoder/layer_5/attention/self/Reshape_2, Execution Time: 0.000025 seconds
Node: bert/encoder/layer_5/attention/self/transpose_2, Execution Time: 0.000261 seconds
Matmul Fuse Node: bert/encoder/layer_5/attention/self/query/MatMul, Execution Time: 0.002672 seconds
Node: bert/encoder/layer_5/attention/self/Reshape, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_5/attention/self/transpose, Execution Time: 0.000262 seconds
Matmul Fuse Node: bert/encoder/layer_5/attention/self/key/MatMul, Execution Time: 0.002717 seconds
Node: bert/encoder/layer_5/attention/self/Reshape_1, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_5/attention/self/MatMul__376, Execution Time: 0.000262 seconds
Node: bert/encoder/layer_5/attention/self/MatMul, Execution Time: 0.001562 seconds
Node: bert/encoder/layer_5/attention/self/Mul, Execution Time: 0.001664 seconds
Node: bert/encoder/layer_5/attention/self/add, Execution Time: 0.002286 seconds
Node: bert/encoder/layer_5/attention/self/Softmax, Execution Time: 0.003442 seconds
Node: bert/encoder/layer_5/attention/self/MatMul_1, Execution Time: 0.002026 seconds
Node: bert/encoder/layer_5/attention/self/transpose_3, Execution Time: 0.000359 seconds
Node: bert/encoder/layer_5/attention/self/Reshape_3, Execution Time: 0.000121 seconds
Matmul Fuse Node: bert/encoder/layer_5/attention/output/dense/MatMul, Execution Time: 0.001645 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/moments/mean, Execution Time: 0.000262 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000337 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/moments/SquaredDifference__379, Execution Time: 0.000455 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/moments/variance, Execution Time: 0.000208 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000117 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/Rsqrt__381, Execution Time: 0.000113 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000124 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000107 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000108 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000670 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000759 seconds
Matmul Fuse Node: bert/encoder/layer_5/intermediate/dense/MatMul, Execution Time: 0.004532 seconds
Node: bert/encoder/layer_5/intermediate/dense/Pow, Execution Time: 0.000731 seconds
Node: bert/encoder/layer_5/intermediate/dense/mul, Execution Time: 0.001324 seconds
Node: bert/encoder/layer_5/intermediate/dense/add, Execution Time: 0.001900 seconds
Node: bert/encoder/layer_5/intermediate/dense/mul_1, Execution Time: 0.001656 seconds
Node: bert/encoder/layer_5/intermediate/dense/Tanh, Execution Time: 0.001851 seconds
Node: bert/encoder/layer_5/intermediate/dense/add_1, Execution Time: 0.001699 seconds
Node: bert/encoder/layer_5/intermediate/dense/mul_2, Execution Time: 0.001656 seconds
Node: bert/encoder/layer_5/intermediate/dense/mul_3, Execution Time: 0.002165 seconds
Matmul Fuse Node: bert/encoder/layer_5/output/dense/MatMul, Execution Time: 0.003160 seconds
Node: bert/encoder/layer_5/output/LayerNorm/moments/mean, Execution Time: 0.000202 seconds
Node: bert/encoder/layer_5/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000256 seconds
Node: bert/encoder/layer_5/output/LayerNorm/moments/SquaredDifference__383, Execution Time: 0.000363 seconds
Node: bert/encoder/layer_5/output/LayerNorm/moments/variance, Execution Time: 0.000166 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/add, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000065 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/Rsqrt__385, Execution Time: 0.000094 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/mul, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/sub, Execution Time: 0.000092 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000328 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000473 seconds
Matmul Fuse Node: bert/encoder/layer_6/attention/self/value/MatMul, Execution Time: 0.003479 seconds
Node: bert/encoder/layer_6/attention/self/Reshape_2, Execution Time: 0.000024 seconds
Node: bert/encoder/layer_6/attention/self/transpose_2, Execution Time: 0.000279 seconds
Matmul Fuse Node: bert/encoder/layer_6/attention/self/query/MatMul, Execution Time: 0.003534 seconds
Node: bert/encoder/layer_6/attention/self/Reshape, Execution Time: 0.000068 seconds
Node: bert/encoder/layer_6/attention/self/transpose, Execution Time: 0.000273 seconds
Matmul Fuse Node: bert/encoder/layer_6/attention/self/key/MatMul, Execution Time: 0.003088 seconds
Node: bert/encoder/layer_6/attention/self/Reshape_1, Execution Time: 0.000022 seconds
Node: bert/encoder/layer_6/attention/self/MatMul__390, Execution Time: 0.000259 seconds
Node: bert/encoder/layer_6/attention/self/MatMul, Execution Time: 0.001803 seconds
Node: bert/encoder/layer_6/attention/self/Mul, Execution Time: 0.001834 seconds
Node: bert/encoder/layer_6/attention/self/add, Execution Time: 0.002524 seconds
Node: bert/encoder/layer_6/attention/self/Softmax, Execution Time: 0.002284 seconds
Node: bert/encoder/layer_6/attention/self/MatMul_1, Execution Time: 0.001416 seconds
Node: bert/encoder/layer_6/attention/self/transpose_3, Execution Time: 0.000263 seconds
Node: bert/encoder/layer_6/attention/self/Reshape_3, Execution Time: 0.000065 seconds
Matmul Fuse Node: bert/encoder/layer_6/attention/output/dense/MatMul, Execution Time: 0.001497 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/moments/mean, Execution Time: 0.000253 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000266 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/moments/SquaredDifference__393, Execution Time: 0.000355 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/moments/variance, Execution Time: 0.000189 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000104 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000074 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/Rsqrt__395, Execution Time: 0.000100 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000105 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000097 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000097 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000538 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000814 seconds
Matmul Fuse Node: bert/encoder/layer_6/intermediate/dense/MatMul, Execution Time: 0.005059 seconds
Node: bert/encoder/layer_6/intermediate/dense/Pow, Execution Time: 0.000759 seconds
Node: bert/encoder/layer_6/intermediate/dense/mul, Execution Time: 0.001392 seconds
Node: bert/encoder/layer_6/intermediate/dense/add, Execution Time: 0.001679 seconds
Node: bert/encoder/layer_6/intermediate/dense/mul_1, Execution Time: 0.001218 seconds
Node: bert/encoder/layer_6/intermediate/dense/Tanh, Execution Time: 0.001140 seconds
Node: bert/encoder/layer_6/intermediate/dense/add_1, Execution Time: 0.001200 seconds
Node: bert/encoder/layer_6/intermediate/dense/mul_2, Execution Time: 0.001525 seconds
Node: bert/encoder/layer_6/intermediate/dense/mul_3, Execution Time: 0.001487 seconds
Matmul Fuse Node: bert/encoder/layer_6/output/dense/MatMul, Execution Time: 0.002849 seconds
Node: bert/encoder/layer_6/output/LayerNorm/moments/mean, Execution Time: 0.000196 seconds
Node: bert/encoder/layer_6/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000270 seconds
Node: bert/encoder/layer_6/output/LayerNorm/moments/SquaredDifference__397, Execution Time: 0.000353 seconds
Node: bert/encoder/layer_6/output/LayerNorm/moments/variance, Execution Time: 0.000168 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/add, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000064 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/Rsqrt__399, Execution Time: 0.000089 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/mul, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/sub, Execution Time: 0.000103 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000399 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000506 seconds
Matmul Fuse Node: bert/encoder/layer_7/attention/self/value/MatMul, Execution Time: 0.003528 seconds
Node: bert/encoder/layer_7/attention/self/Reshape_2, Execution Time: 0.000024 seconds
Node: bert/encoder/layer_7/attention/self/transpose_2, Execution Time: 0.000266 seconds
Matmul Fuse Node: bert/encoder/layer_7/attention/self/query/MatMul, Execution Time: 0.002841 seconds
Node: bert/encoder/layer_7/attention/self/Reshape, Execution Time: 0.000015 seconds
Node: bert/encoder/layer_7/attention/self/transpose, Execution Time: 0.000277 seconds
Matmul Fuse Node: bert/encoder/layer_7/attention/self/key/MatMul, Execution Time: 0.002693 seconds
Node: bert/encoder/layer_7/attention/self/Reshape_1, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_7/attention/self/MatMul__404, Execution Time: 0.000254 seconds
Node: bert/encoder/layer_7/attention/self/MatMul, Execution Time: 0.001493 seconds
Node: bert/encoder/layer_7/attention/self/Mul, Execution Time: 0.001176 seconds
Node: bert/encoder/layer_7/attention/self/add, Execution Time: 0.001767 seconds
Node: bert/encoder/layer_7/attention/self/Softmax, Execution Time: 0.001926 seconds
Node: bert/encoder/layer_7/attention/self/MatMul_1, Execution Time: 0.001131 seconds
Node: bert/encoder/layer_7/attention/self/transpose_3, Execution Time: 0.000245 seconds
Node: bert/encoder/layer_7/attention/self/Reshape_3, Execution Time: 0.000052 seconds
Matmul Fuse Node: bert/encoder/layer_7/attention/output/dense/MatMul, Execution Time: 0.001315 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/moments/mean, Execution Time: 0.000198 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000283 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/moments/SquaredDifference__407, Execution Time: 0.000343 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/moments/variance, Execution Time: 0.000163 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000094 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000066 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/Rsqrt__409, Execution Time: 0.000086 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000086 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000088 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000602 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000737 seconds
Matmul Fuse Node: bert/encoder/layer_7/intermediate/dense/MatMul, Execution Time: 0.004372 seconds
Node: bert/encoder/layer_7/intermediate/dense/Pow, Execution Time: 0.000726 seconds
Node: bert/encoder/layer_7/intermediate/dense/mul, Execution Time: 0.001186 seconds
Node: bert/encoder/layer_7/intermediate/dense/add, Execution Time: 0.001428 seconds
Node: bert/encoder/layer_7/intermediate/dense/mul_1, Execution Time: 0.001146 seconds
Node: bert/encoder/layer_7/intermediate/dense/Tanh, Execution Time: 0.001076 seconds
Node: bert/encoder/layer_7/intermediate/dense/add_1, Execution Time: 0.001104 seconds
Node: bert/encoder/layer_7/intermediate/dense/mul_2, Execution Time: 0.001114 seconds
Node: bert/encoder/layer_7/intermediate/dense/mul_3, Execution Time: 0.001438 seconds
Matmul Fuse Node: bert/encoder/layer_7/output/dense/MatMul, Execution Time: 0.002922 seconds
Node: bert/encoder/layer_7/output/LayerNorm/moments/mean, Execution Time: 0.000199 seconds
Node: bert/encoder/layer_7/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000262 seconds
Node: bert/encoder/layer_7/output/LayerNorm/moments/SquaredDifference__411, Execution Time: 0.000351 seconds
Node: bert/encoder/layer_7/output/LayerNorm/moments/variance, Execution Time: 0.000172 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/add, Execution Time: 0.000102 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000077 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/Rsqrt__413, Execution Time: 0.000107 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/mul, Execution Time: 0.000113 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000107 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/sub, Execution Time: 0.000110 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000367 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000520 seconds
Matmul Fuse Node: bert/encoder/layer_8/attention/self/value/MatMul, Execution Time: 0.003624 seconds
Node: bert/encoder/layer_8/attention/self/Reshape_2, Execution Time: 0.000025 seconds
Node: bert/encoder/layer_8/attention/self/transpose_2, Execution Time: 0.000271 seconds
Matmul Fuse Node: bert/encoder/layer_8/attention/self/query/MatMul, Execution Time: 0.002746 seconds
Node: bert/encoder/layer_8/attention/self/Reshape, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_8/attention/self/transpose, Execution Time: 0.000253 seconds
Matmul Fuse Node: bert/encoder/layer_8/attention/self/key/MatMul, Execution Time: 0.002611 seconds
Node: bert/encoder/layer_8/attention/self/Reshape_1, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_8/attention/self/MatMul__418, Execution Time: 0.000258 seconds
Node: bert/encoder/layer_8/attention/self/MatMul, Execution Time: 0.001474 seconds
Node: bert/encoder/layer_8/attention/self/Mul, Execution Time: 0.001234 seconds
Node: bert/encoder/layer_8/attention/self/add, Execution Time: 0.001760 seconds
Node: bert/encoder/layer_8/attention/self/Softmax, Execution Time: 0.001936 seconds
Node: bert/encoder/layer_8/attention/self/MatMul_1, Execution Time: 0.001112 seconds
Node: bert/encoder/layer_8/attention/self/transpose_3, Execution Time: 0.000244 seconds
Node: bert/encoder/layer_8/attention/self/Reshape_3, Execution Time: 0.000051 seconds
Matmul Fuse Node: bert/encoder/layer_8/attention/output/dense/MatMul, Execution Time: 0.001265 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/moments/mean, Execution Time: 0.000198 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000256 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/moments/SquaredDifference__421, Execution Time: 0.000479 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/moments/variance, Execution Time: 0.000169 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000092 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000064 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/Rsqrt__423, Execution Time: 0.000086 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000096 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000086 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000088 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000602 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000705 seconds
Matmul Fuse Node: bert/encoder/layer_8/intermediate/dense/MatMul, Execution Time: 0.004451 seconds
Node: bert/encoder/layer_8/intermediate/dense/Pow, Execution Time: 0.000748 seconds
Node: bert/encoder/layer_8/intermediate/dense/mul, Execution Time: 0.001153 seconds
Node: bert/encoder/layer_8/intermediate/dense/add, Execution Time: 0.001473 seconds
Node: bert/encoder/layer_8/intermediate/dense/mul_1, Execution Time: 0.001107 seconds
Node: bert/encoder/layer_8/intermediate/dense/Tanh, Execution Time: 0.001070 seconds
Node: bert/encoder/layer_8/intermediate/dense/add_1, Execution Time: 0.001121 seconds
Node: bert/encoder/layer_8/intermediate/dense/mul_2, Execution Time: 0.001117 seconds
Node: bert/encoder/layer_8/intermediate/dense/mul_3, Execution Time: 0.001413 seconds
Matmul Fuse Node: bert/encoder/layer_8/output/dense/MatMul, Execution Time: 0.003117 seconds
Node: bert/encoder/layer_8/output/LayerNorm/moments/mean, Execution Time: 0.000231 seconds
Node: bert/encoder/layer_8/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000328 seconds
Node: bert/encoder/layer_8/output/LayerNorm/moments/SquaredDifference__425, Execution Time: 0.000444 seconds
Node: bert/encoder/layer_8/output/LayerNorm/moments/variance, Execution Time: 0.000173 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/add, Execution Time: 0.000098 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000066 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/Rsqrt__427, Execution Time: 0.000089 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/mul, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/sub, Execution Time: 0.000093 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000345 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000465 seconds
Matmul Fuse Node: bert/encoder/layer_9/attention/self/value/MatMul, Execution Time: 0.003488 seconds
Node: bert/encoder/layer_9/attention/self/Reshape_2, Execution Time: 0.000024 seconds
Node: bert/encoder/layer_9/attention/self/transpose_2, Execution Time: 0.000309 seconds
Matmul Fuse Node: bert/encoder/layer_9/attention/self/query/MatMul, Execution Time: 0.002822 seconds
Node: bert/encoder/layer_9/attention/self/Reshape, Execution Time: 0.000015 seconds
Node: bert/encoder/layer_9/attention/self/transpose, Execution Time: 0.000254 seconds
Matmul Fuse Node: bert/encoder/layer_9/attention/self/key/MatMul, Execution Time: 0.002754 seconds
Node: bert/encoder/layer_9/attention/self/Reshape_1, Execution Time: 0.000015 seconds
Node: bert/encoder/layer_9/attention/self/MatMul__432, Execution Time: 0.000256 seconds
Node: bert/encoder/layer_9/attention/self/MatMul, Execution Time: 0.001503 seconds
Node: bert/encoder/layer_9/attention/self/Mul, Execution Time: 0.001231 seconds
Node: bert/encoder/layer_9/attention/self/add, Execution Time: 0.001827 seconds
Node: bert/encoder/layer_9/attention/self/Softmax, Execution Time: 0.001956 seconds
Node: bert/encoder/layer_9/attention/self/MatMul_1, Execution Time: 0.001155 seconds
Node: bert/encoder/layer_9/attention/self/transpose_3, Execution Time: 0.000246 seconds
Node: bert/encoder/layer_9/attention/self/Reshape_3, Execution Time: 0.000056 seconds
Matmul Fuse Node: bert/encoder/layer_9/attention/output/dense/MatMul, Execution Time: 0.001328 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/moments/mean, Execution Time: 0.000201 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000271 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/moments/SquaredDifference__435, Execution Time: 0.000350 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/moments/variance, Execution Time: 0.000163 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000107 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000065 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/Rsqrt__437, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000099 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000088 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000598 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000726 seconds
Matmul Fuse Node: bert/encoder/layer_9/intermediate/dense/MatMul, Execution Time: 0.004899 seconds
Node: bert/encoder/layer_9/intermediate/dense/Pow, Execution Time: 0.000729 seconds
Node: bert/encoder/layer_9/intermediate/dense/mul, Execution Time: 0.001312 seconds
Node: bert/encoder/layer_9/intermediate/dense/add, Execution Time: 0.001624 seconds
Node: bert/encoder/layer_9/intermediate/dense/mul_1, Execution Time: 0.001606 seconds
Node: bert/encoder/layer_9/intermediate/dense/Tanh, Execution Time: 0.001328 seconds
Node: bert/encoder/layer_9/intermediate/dense/add_1, Execution Time: 0.001363 seconds
Node: bert/encoder/layer_9/intermediate/dense/mul_2, Execution Time: 0.001334 seconds
Node: bert/encoder/layer_9/intermediate/dense/mul_3, Execution Time: 0.001719 seconds
Matmul Fuse Node: bert/encoder/layer_9/output/dense/MatMul, Execution Time: 0.003271 seconds
Node: bert/encoder/layer_9/output/LayerNorm/moments/mean, Execution Time: 0.000205 seconds
Node: bert/encoder/layer_9/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000263 seconds
Node: bert/encoder/layer_9/output/LayerNorm/moments/SquaredDifference__439, Execution Time: 0.000350 seconds
Node: bert/encoder/layer_9/output/LayerNorm/moments/variance, Execution Time: 0.000167 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/add, Execution Time: 0.000093 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000072 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/Rsqrt__441, Execution Time: 0.000094 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/mul, Execution Time: 0.000098 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000089 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/sub, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000326 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000458 seconds
Matmul Fuse Node: bert/encoder/layer_10/attention/self/value/MatMul, Execution Time: 0.004776 seconds
Node: bert/encoder/layer_10/attention/self/Reshape_2, Execution Time: 0.000025 seconds
Node: bert/encoder/layer_10/attention/self/transpose_2, Execution Time: 0.000267 seconds
Matmul Fuse Node: bert/encoder/layer_10/attention/self/query/MatMul, Execution Time: 0.002937 seconds
Node: bert/encoder/layer_10/attention/self/Reshape, Execution Time: 0.000017 seconds
Node: bert/encoder/layer_10/attention/self/transpose, Execution Time: 0.000268 seconds
Matmul Fuse Node: bert/encoder/layer_10/attention/self/key/MatMul, Execution Time: 0.002784 seconds
Node: bert/encoder/layer_10/attention/self/Reshape_1, Execution Time: 0.000016 seconds
Node: bert/encoder/layer_10/attention/self/MatMul__446, Execution Time: 0.000262 seconds
Node: bert/encoder/layer_10/attention/self/MatMul, Execution Time: 0.001536 seconds
Node: bert/encoder/layer_10/attention/self/Mul, Execution Time: 0.001282 seconds
Node: bert/encoder/layer_10/attention/self/add, Execution Time: 0.002008 seconds
Node: bert/encoder/layer_10/attention/self/Softmax, Execution Time: 0.002279 seconds
Node: bert/encoder/layer_10/attention/self/MatMul_1, Execution Time: 0.001209 seconds
Node: bert/encoder/layer_10/attention/self/transpose_3, Execution Time: 0.000251 seconds
Node: bert/encoder/layer_10/attention/self/Reshape_3, Execution Time: 0.000054 seconds
Matmul Fuse Node: bert/encoder/layer_10/attention/output/dense/MatMul, Execution Time: 0.001207 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/moments/mean, Execution Time: 0.000205 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000254 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/moments/SquaredDifference__449, Execution Time: 0.000342 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/moments/variance, Execution Time: 0.000165 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000092 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000062 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/Rsqrt__451, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000097 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000335 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000435 seconds
Matmul Fuse Node: bert/encoder/layer_10/intermediate/dense/MatMul, Execution Time: 0.004598 seconds
Node: bert/encoder/layer_10/intermediate/dense/Pow, Execution Time: 0.000743 seconds
Node: bert/encoder/layer_10/intermediate/dense/mul, Execution Time: 0.001343 seconds
Node: bert/encoder/layer_10/intermediate/dense/add, Execution Time: 0.001630 seconds
Node: bert/encoder/layer_10/intermediate/dense/mul_1, Execution Time: 0.001354 seconds
Node: bert/encoder/layer_10/intermediate/dense/Tanh, Execution Time: 0.001309 seconds
Node: bert/encoder/layer_10/intermediate/dense/add_1, Execution Time: 0.001358 seconds
Node: bert/encoder/layer_10/intermediate/dense/mul_2, Execution Time: 0.001358 seconds
Node: bert/encoder/layer_10/intermediate/dense/mul_3, Execution Time: 0.001723 seconds
Matmul Fuse Node: bert/encoder/layer_10/output/dense/MatMul, Execution Time: 0.002816 seconds
Node: bert/encoder/layer_10/output/LayerNorm/moments/mean, Execution Time: 0.000207 seconds
Node: bert/encoder/layer_10/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000259 seconds
Node: bert/encoder/layer_10/output/LayerNorm/moments/SquaredDifference__453, Execution Time: 0.000344 seconds
Node: bert/encoder/layer_10/output/LayerNorm/moments/variance, Execution Time: 0.000167 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/add, Execution Time: 0.000093 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000072 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/Rsqrt__455, Execution Time: 0.000178 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/mul, Execution Time: 0.000103 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/sub, Execution Time: 0.000089 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000339 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000446 seconds
Matmul Fuse Node: bert/encoder/layer_11/attention/self/value/MatMul, Execution Time: 0.003723 seconds
Node: bert/encoder/layer_11/attention/self/Reshape_2, Execution Time: 0.000025 seconds
Node: bert/encoder/layer_11/attention/self/transpose_2, Execution Time: 0.000271 seconds
Matmul Fuse Node: bert/encoder/layer_11/attention/self/query/MatMul, Execution Time: 0.002970 seconds
Node: bert/encoder/layer_11/attention/self/Reshape, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_11/attention/self/transpose, Execution Time: 0.000265 seconds
Matmul Fuse Node: bert/encoder/layer_11/attention/self/key/MatMul, Execution Time: 0.002878 seconds
Node: bert/encoder/layer_11/attention/self/Reshape_1, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_11/attention/self/MatMul__460, Execution Time: 0.000257 seconds
Node: bert/encoder/layer_11/attention/self/MatMul, Execution Time: 0.001513 seconds
Node: bert/encoder/layer_11/attention/self/Mul, Execution Time: 0.001281 seconds
Node: bert/encoder/layer_11/attention/self/add, Execution Time: 0.002046 seconds
Node: bert/encoder/layer_11/attention/self/Softmax, Execution Time: 0.002472 seconds
Node: bert/encoder/layer_11/attention/self/MatMul_1, Execution Time: 0.001688 seconds
Node: bert/encoder/layer_11/attention/self/transpose_3, Execution Time: 0.000344 seconds
Node: bert/encoder/layer_11/attention/self/Reshape_3, Execution Time: 0.000085 seconds
Matmul Fuse Node: bert/encoder/layer_11/attention/output/dense/MatMul, Execution Time: 0.001657 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/moments/mean, Execution Time: 0.000261 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000317 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/moments/SquaredDifference__463, Execution Time: 0.000621 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/moments/variance, Execution Time: 0.000215 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000106 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000080 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/Rsqrt__465, Execution Time: 0.000113 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000109 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000114 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000122 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000745 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000932 seconds
Matmul Fuse Node: bert/encoder/layer_11/intermediate/dense/MatMul, Execution Time: 0.005398 seconds
Node: bert/encoder/layer_11/intermediate/dense/Pow, Execution Time: 0.000755 seconds
Node: bert/encoder/layer_11/intermediate/dense/mul, Execution Time: 0.001696 seconds
Node: bert/encoder/layer_11/intermediate/dense/add, Execution Time: 0.002457 seconds
Node: bert/encoder/layer_11/intermediate/dense/mul_1, Execution Time: 0.001978 seconds
Node: bert/encoder/layer_11/intermediate/dense/Tanh, Execution Time: 0.001770 seconds
Node: bert/encoder/layer_11/intermediate/dense/add_1, Execution Time: 0.001795 seconds
Node: bert/encoder/layer_11/intermediate/dense/mul_2, Execution Time: 0.001863 seconds
Node: bert/encoder/layer_11/intermediate/dense/mul_3, Execution Time: 0.001921 seconds
Matmul Fuse Node: bert/encoder/layer_11/output/dense/MatMul, Execution Time: 0.003049 seconds
Node: bert/encoder/layer_11/output/LayerNorm/moments/mean, Execution Time: 0.000231 seconds
Node: bert/encoder/layer_11/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000329 seconds
Node: bert/encoder/layer_11/output/LayerNorm/moments/SquaredDifference__467, Execution Time: 0.000438 seconds
Node: bert/encoder/layer_11/output/LayerNorm/moments/variance, Execution Time: 0.000189 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/add, Execution Time: 0.000154 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000168 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/Rsqrt__469, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/mul, Execution Time: 0.000094 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000083 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/sub, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000334 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000467 seconds
Matmul Fuse Node: MatMul, Execution Time: 0.004092 seconds
Node: Reshape_1, Execution Time: 0.000024 seconds
Node: transpose, Execution Time: 0.000088 seconds
Node: unstack, Execution Time: 0.000056 seconds
Node: unstack__490, Execution Time: 0.000006 seconds
Node: unstack__488, Execution Time: 0.000006 seconds
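
The "Matmul Fuse Node" entries in the timing log above are the projections whose bias Add was folded into the same MatMul call, so each of those timings covers one fused GEMM-plus-bias step. A minimal NumPy sketch of that pattern, assuming a hypothetical fused_matmul_add helper and BERT-base's 256-token, 768-wide shapes (this is not the runner's actual kernel):

import numpy as np

def fused_matmul_add(x, weight, bias):
    # One GEMM plus a broadcast bias add, standing in for a MatMul
    # node and its following BiasAdd executed as a single fused step.
    return x @ weight + bias

# Shape check mirroring a 256-token, 768-wide projection.
x = np.random.rand(256, 768).astype(np.float32)
w = np.random.rand(768, 768).astype(np.float32)
b = np.random.rand(768).astype(np.float32)
assert fused_matmul_add(x, w, b).shape == (256, 768)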

Total Execution Time: 0.727311 seconds

Total Matmul Fuse Execution Time: 0.253194 seconds
Execution complete.

Total execution time: 0.731414 seconds
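
One way the per-node lines and the two totals above can be produced is to wrap each node execution in a wall-clock timer and keep a separate subtotal for the fused-MatMul path; the small gap between the summed node time (0.727311 s) and the overall run time (0.731414 s) would then be overhead outside the node loop, such as input preparation and output collection. A minimal sketch, assuming hypothetical run_node callables and node objects with name / is_fused_matmul attributes (not the runner's actual API):

import time

def timed_run(nodes, run_node):
    # Execute nodes in order, timing each one and accumulating totals.
    total = 0.0
    fused_total = 0.0
    for node in nodes:
        start = time.perf_counter()
        run_node(node)                      # execute one graph node
        elapsed = time.perf_counter() - start
        total += elapsed
        if node.is_fused_matmul:
            fused_total += elapsed
            print(f"Matmul Fuse Node: {node.name}, Execution Time: {elapsed:.6f} seconds")
        else:
            print(f"Node: {node.name}, Execution Time: {elapsed:.6f} seconds")
    print(f"\nTotal Execution Time: {total:.6f} seconds")
    print(f"\nTotal Matmul Fuse Execution Time: {fused_total:.6f} seconds")
    return total, fused_total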
Model outputs: {'unstack:1': array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan], dtype=float16), 'unstack:0': array(None, dtype=object), 'unique_ids:0': array([], dtype=int64)}
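
The 'unstack:1' output comes back entirely NaN in float16, which usually points to an overflow or invalid intermediate somewhere upstream in the half-precision pipeline. A quick post-run check over the outputs dict printed above, a minimal sketch with a made-up helper name:

import numpy as np

def report_bad_outputs(outputs):
    # Flag any floating-point output containing NaN or Inf values.
    for name, value in outputs.items():
        arr = np.asarray(value)
        if arr.dtype.kind != "f" or arr.size == 0:
            continue
        nan_ratio = float(np.isnan(arr).mean())
        inf_ratio = float(np.isinf(arr).mean())
        if nan_ratio or inf_ratio:
            print(f"{name}: {nan_ratio:.0%} NaN, {inf_ratio:.0%} Inf, dtype={arr.dtype}")

# Example mirroring the all-NaN float16 result above (256 values).
report_bad_outputs({"unstack:1": np.full(256, np.nan, dtype=np.float16)})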
Execution order: ['unique_ids_graph_outputs_Identity__10', 'bert/encoder/Shape', 'bert/encoder/Shape__12', 'bert/encoder/strided_slice', 'bert/encoder/strided_slice__16', 'bert/encoder/strided_slice__17', 'bert/encoder/ones/packed_Unsqueeze__18', 'bert/encoder/ones/packed_Concat__21', 'bert/encoder/ones__22', 'bert/encoder/ones', 'bert/encoder/Reshape', 'bert/encoder/Cast', 'bert/encoder/mul', 'bert/encoder/layer_9/attention/self/ExpandDims', 'bert/encoder/layer_9/attention/self/sub', 'bert/encoder/layer_9/attention/self/mul_1', 'bert/embeddings/Reshape_2', 'bert/embeddings/Reshape', 'bert/embeddings/GatherV2', 'bert/embeddings/Reshape_1', 'bert/embeddings/one_hot', 'bert/embeddings/MatMul', 'bert/embeddings/Reshape_3', 'bert/embeddings/add', 'bert/embeddings/add_1', 'bert/embeddings/LayerNorm/moments/mean', 'bert/embeddings/LayerNorm/moments/SquaredDifference', 'bert/embeddings/LayerNorm/moments/SquaredDifference__72', 'bert/embeddings/LayerNorm/moments/variance', 'bert/embeddings/LayerNorm/batchnorm/add', 'bert/embeddings/LayerNorm/batchnorm/Rsqrt', 'bert/embeddings/LayerNorm/batchnorm/Rsqrt__74', 'bert/embeddings/LayerNorm/batchnorm/mul', 'bert/embeddings/LayerNorm/batchnorm/mul_2', 'bert/embeddings/LayerNorm/batchnorm/sub', 'bert/embeddings/LayerNorm/batchnorm/mul_1', 'bert/embeddings/LayerNorm/batchnorm/add_1', 'bert/encoder/Reshape_1', 'bert/encoder/layer_0/attention/self/value/MatMul', 'bert/encoder/layer_0/attention/self/value/BiasAdd', 'bert/encoder/layer_0/attention/self/Reshape_2', 'bert/encoder/layer_0/attention/self/transpose_2', 'bert/encoder/layer_0/attention/self/query/MatMul', 'bert/encoder/layer_0/attention/self/query/BiasAdd', 'bert/encoder/layer_0/attention/self/Reshape', 'bert/encoder/layer_0/attention/self/transpose', 'bert/encoder/layer_0/attention/self/key/MatMul', 'bert/encoder/layer_0/attention/self/key/BiasAdd', 'bert/encoder/layer_0/attention/self/Reshape_1', 'bert/encoder/layer_0/attention/self/MatMul__306', 'bert/encoder/layer_0/attention/self/MatMul', 'bert/encoder/layer_0/attention/self/Mul', 'bert/encoder/layer_0/attention/self/add', 'bert/encoder/layer_0/attention/self/Softmax', 'bert/encoder/layer_0/attention/self/MatMul_1', 'bert/encoder/layer_0/attention/self/transpose_3', 'bert/encoder/layer_0/attention/self/Reshape_3', 'bert/encoder/layer_0/attention/output/dense/MatMul', 'bert/encoder/layer_0/attention/output/dense/BiasAdd', 'bert/encoder/layer_0/attention/output/add', 'bert/encoder/layer_0/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_0/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_0/attention/output/LayerNorm/moments/SquaredDifference__309', 'bert/encoder/layer_0/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/Rsqrt__311', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_0/intermediate/dense/MatMul', 'bert/encoder/layer_0/intermediate/dense/BiasAdd', 'bert/encoder/layer_0/intermediate/dense/Pow', 'bert/encoder/layer_0/intermediate/dense/mul', 'bert/encoder/layer_0/intermediate/dense/add', 
'bert/encoder/layer_0/intermediate/dense/mul_1', 'bert/encoder/layer_0/intermediate/dense/Tanh', 'bert/encoder/layer_0/intermediate/dense/add_1', 'bert/encoder/layer_0/intermediate/dense/mul_2', 'bert/encoder/layer_0/intermediate/dense/mul_3', 'bert/encoder/layer_0/output/dense/MatMul', 'bert/encoder/layer_0/output/dense/BiasAdd', 'bert/encoder/layer_0/output/add', 'bert/encoder/layer_0/output/LayerNorm/moments/mean', 'bert/encoder/layer_0/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_0/output/LayerNorm/moments/SquaredDifference__313', 'bert/encoder/layer_0/output/LayerNorm/moments/variance', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/Rsqrt__315', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_1/attention/self/value/MatMul', 'bert/encoder/layer_1/attention/self/value/BiasAdd', 'bert/encoder/layer_1/attention/self/Reshape_2', 'bert/encoder/layer_1/attention/self/transpose_2', 'bert/encoder/layer_1/attention/self/query/MatMul', 'bert/encoder/layer_1/attention/self/query/BiasAdd', 'bert/encoder/layer_1/attention/self/Reshape', 'bert/encoder/layer_1/attention/self/transpose', 'bert/encoder/layer_1/attention/self/key/MatMul', 'bert/encoder/layer_1/attention/self/key/BiasAdd', 'bert/encoder/layer_1/attention/self/Reshape_1', 'bert/encoder/layer_1/attention/self/MatMul__320', 'bert/encoder/layer_1/attention/self/MatMul', 'bert/encoder/layer_1/attention/self/Mul', 'bert/encoder/layer_1/attention/self/add', 'bert/encoder/layer_1/attention/self/Softmax', 'bert/encoder/layer_1/attention/self/MatMul_1', 'bert/encoder/layer_1/attention/self/transpose_3', 'bert/encoder/layer_1/attention/self/Reshape_3', 'bert/encoder/layer_1/attention/output/dense/MatMul', 'bert/encoder/layer_1/attention/output/dense/BiasAdd', 'bert/encoder/layer_1/attention/output/add', 'bert/encoder/layer_1/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_1/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_1/attention/output/LayerNorm/moments/SquaredDifference__323', 'bert/encoder/layer_1/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/Rsqrt__325', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_1/intermediate/dense/MatMul', 'bert/encoder/layer_1/intermediate/dense/BiasAdd', 'bert/encoder/layer_1/intermediate/dense/Pow', 'bert/encoder/layer_1/intermediate/dense/mul', 'bert/encoder/layer_1/intermediate/dense/add', 'bert/encoder/layer_1/intermediate/dense/mul_1', 'bert/encoder/layer_1/intermediate/dense/Tanh', 'bert/encoder/layer_1/intermediate/dense/add_1', 'bert/encoder/layer_1/intermediate/dense/mul_2', 'bert/encoder/layer_1/intermediate/dense/mul_3', 'bert/encoder/layer_1/output/dense/MatMul', 
'bert/encoder/layer_1/output/dense/BiasAdd', 'bert/encoder/layer_1/output/add', 'bert/encoder/layer_1/output/LayerNorm/moments/mean', 'bert/encoder/layer_1/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_1/output/LayerNorm/moments/SquaredDifference__327', 'bert/encoder/layer_1/output/LayerNorm/moments/variance', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/Rsqrt__329', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_2/attention/self/value/MatMul', 'bert/encoder/layer_2/attention/self/value/BiasAdd', 'bert/encoder/layer_2/attention/self/Reshape_2', 'bert/encoder/layer_2/attention/self/transpose_2', 'bert/encoder/layer_2/attention/self/query/MatMul', 'bert/encoder/layer_2/attention/self/query/BiasAdd', 'bert/encoder/layer_2/attention/self/Reshape', 'bert/encoder/layer_2/attention/self/transpose', 'bert/encoder/layer_2/attention/self/key/MatMul', 'bert/encoder/layer_2/attention/self/key/BiasAdd', 'bert/encoder/layer_2/attention/self/Reshape_1', 'bert/encoder/layer_2/attention/self/MatMul__334', 'bert/encoder/layer_2/attention/self/MatMul', 'bert/encoder/layer_2/attention/self/Mul', 'bert/encoder/layer_2/attention/self/add', 'bert/encoder/layer_2/attention/self/Softmax', 'bert/encoder/layer_2/attention/self/MatMul_1', 'bert/encoder/layer_2/attention/self/transpose_3', 'bert/encoder/layer_2/attention/self/Reshape_3', 'bert/encoder/layer_2/attention/output/dense/MatMul', 'bert/encoder/layer_2/attention/output/dense/BiasAdd', 'bert/encoder/layer_2/attention/output/add', 'bert/encoder/layer_2/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_2/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_2/attention/output/LayerNorm/moments/SquaredDifference__337', 'bert/encoder/layer_2/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/Rsqrt__339', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_2/intermediate/dense/MatMul', 'bert/encoder/layer_2/intermediate/dense/BiasAdd', 'bert/encoder/layer_2/intermediate/dense/Pow', 'bert/encoder/layer_2/intermediate/dense/mul', 'bert/encoder/layer_2/intermediate/dense/add', 'bert/encoder/layer_2/intermediate/dense/mul_1', 'bert/encoder/layer_2/intermediate/dense/Tanh', 'bert/encoder/layer_2/intermediate/dense/add_1', 'bert/encoder/layer_2/intermediate/dense/mul_2', 'bert/encoder/layer_2/intermediate/dense/mul_3', 'bert/encoder/layer_2/output/dense/MatMul', 'bert/encoder/layer_2/output/dense/BiasAdd', 'bert/encoder/layer_2/output/add', 'bert/encoder/layer_2/output/LayerNorm/moments/mean', 'bert/encoder/layer_2/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_2/output/LayerNorm/moments/SquaredDifference__341', 
'bert/encoder/layer_2/output/LayerNorm/moments/variance', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/Rsqrt__343', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_3/attention/self/value/MatMul', 'bert/encoder/layer_3/attention/self/value/BiasAdd', 'bert/encoder/layer_3/attention/self/Reshape_2', 'bert/encoder/layer_3/attention/self/transpose_2', 'bert/encoder/layer_3/attention/self/query/MatMul', 'bert/encoder/layer_3/attention/self/query/BiasAdd', 'bert/encoder/layer_3/attention/self/Reshape', 'bert/encoder/layer_3/attention/self/transpose', 'bert/encoder/layer_3/attention/self/key/MatMul', 'bert/encoder/layer_3/attention/self/key/BiasAdd', 'bert/encoder/layer_3/attention/self/Reshape_1', 'bert/encoder/layer_3/attention/self/MatMul__348', 'bert/encoder/layer_3/attention/self/MatMul', 'bert/encoder/layer_3/attention/self/Mul', 'bert/encoder/layer_3/attention/self/add', 'bert/encoder/layer_3/attention/self/Softmax', 'bert/encoder/layer_3/attention/self/MatMul_1', 'bert/encoder/layer_3/attention/self/transpose_3', 'bert/encoder/layer_3/attention/self/Reshape_3', 'bert/encoder/layer_3/attention/output/dense/MatMul', 'bert/encoder/layer_3/attention/output/dense/BiasAdd', 'bert/encoder/layer_3/attention/output/add', 'bert/encoder/layer_3/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_3/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_3/attention/output/LayerNorm/moments/SquaredDifference__351', 'bert/encoder/layer_3/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/Rsqrt__353', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_3/intermediate/dense/MatMul', 'bert/encoder/layer_3/intermediate/dense/BiasAdd', 'bert/encoder/layer_3/intermediate/dense/Pow', 'bert/encoder/layer_3/intermediate/dense/mul', 'bert/encoder/layer_3/intermediate/dense/add', 'bert/encoder/layer_3/intermediate/dense/mul_1', 'bert/encoder/layer_3/intermediate/dense/Tanh', 'bert/encoder/layer_3/intermediate/dense/add_1', 'bert/encoder/layer_3/intermediate/dense/mul_2', 'bert/encoder/layer_3/intermediate/dense/mul_3', 'bert/encoder/layer_3/output/dense/MatMul', 'bert/encoder/layer_3/output/dense/BiasAdd', 'bert/encoder/layer_3/output/add', 'bert/encoder/layer_3/output/LayerNorm/moments/mean', 'bert/encoder/layer_3/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_3/output/LayerNorm/moments/SquaredDifference__355', 'bert/encoder/layer_3/output/LayerNorm/moments/variance', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/Rsqrt__357', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/mul', 
'bert/encoder/layer_3/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_4/attention/self/value/MatMul', 'bert/encoder/layer_4/attention/self/value/BiasAdd', 'bert/encoder/layer_4/attention/self/Reshape_2', 'bert/encoder/layer_4/attention/self/transpose_2', 'bert/encoder/layer_4/attention/self/query/MatMul', 'bert/encoder/layer_4/attention/self/query/BiasAdd', 'bert/encoder/layer_4/attention/self/Reshape', 'bert/encoder/layer_4/attention/self/transpose', 'bert/encoder/layer_4/attention/self/key/MatMul', 'bert/encoder/layer_4/attention/self/key/BiasAdd', 'bert/encoder/layer_4/attention/self/Reshape_1', 'bert/encoder/layer_4/attention/self/MatMul__362', 'bert/encoder/layer_4/attention/self/MatMul', 'bert/encoder/layer_4/attention/self/Mul', 'bert/encoder/layer_4/attention/self/add', 'bert/encoder/layer_4/attention/self/Softmax', 'bert/encoder/layer_4/attention/self/MatMul_1', 'bert/encoder/layer_4/attention/self/transpose_3', 'bert/encoder/layer_4/attention/self/Reshape_3', 'bert/encoder/layer_4/attention/output/dense/MatMul', 'bert/encoder/layer_4/attention/output/dense/BiasAdd', 'bert/encoder/layer_4/attention/output/add', 'bert/encoder/layer_4/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_4/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_4/attention/output/LayerNorm/moments/SquaredDifference__365', 'bert/encoder/layer_4/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/Rsqrt__367', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_4/intermediate/dense/MatMul', 'bert/encoder/layer_4/intermediate/dense/BiasAdd', 'bert/encoder/layer_4/intermediate/dense/Pow', 'bert/encoder/layer_4/intermediate/dense/mul', 'bert/encoder/layer_4/intermediate/dense/add', 'bert/encoder/layer_4/intermediate/dense/mul_1', 'bert/encoder/layer_4/intermediate/dense/Tanh', 'bert/encoder/layer_4/intermediate/dense/add_1', 'bert/encoder/layer_4/intermediate/dense/mul_2', 'bert/encoder/layer_4/intermediate/dense/mul_3', 'bert/encoder/layer_4/output/dense/MatMul', 'bert/encoder/layer_4/output/dense/BiasAdd', 'bert/encoder/layer_4/output/add', 'bert/encoder/layer_4/output/LayerNorm/moments/mean', 'bert/encoder/layer_4/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_4/output/LayerNorm/moments/SquaredDifference__369', 'bert/encoder/layer_4/output/LayerNorm/moments/variance', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/Rsqrt__371', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_5/attention/self/value/MatMul', 
'bert/encoder/layer_5/attention/self/value/BiasAdd', 'bert/encoder/layer_5/attention/self/Reshape_2', 'bert/encoder/layer_5/attention/self/transpose_2', 'bert/encoder/layer_5/attention/self/query/MatMul', 'bert/encoder/layer_5/attention/self/query/BiasAdd', 'bert/encoder/layer_5/attention/self/Reshape', 'bert/encoder/layer_5/attention/self/transpose', 'bert/encoder/layer_5/attention/self/key/MatMul', 'bert/encoder/layer_5/attention/self/key/BiasAdd', 'bert/encoder/layer_5/attention/self/Reshape_1', 'bert/encoder/layer_5/attention/self/MatMul__376', 'bert/encoder/layer_5/attention/self/MatMul', 'bert/encoder/layer_5/attention/self/Mul', 'bert/encoder/layer_5/attention/self/add', 'bert/encoder/layer_5/attention/self/Softmax', 'bert/encoder/layer_5/attention/self/MatMul_1', 'bert/encoder/layer_5/attention/self/transpose_3', 'bert/encoder/layer_5/attention/self/Reshape_3', 'bert/encoder/layer_5/attention/output/dense/MatMul', 'bert/encoder/layer_5/attention/output/dense/BiasAdd', 'bert/encoder/layer_5/attention/output/add', 'bert/encoder/layer_5/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_5/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_5/attention/output/LayerNorm/moments/SquaredDifference__379', 'bert/encoder/layer_5/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/Rsqrt__381', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_5/intermediate/dense/MatMul', 'bert/encoder/layer_5/intermediate/dense/BiasAdd', 'bert/encoder/layer_5/intermediate/dense/Pow', 'bert/encoder/layer_5/intermediate/dense/mul', 'bert/encoder/layer_5/intermediate/dense/add', 'bert/encoder/layer_5/intermediate/dense/mul_1', 'bert/encoder/layer_5/intermediate/dense/Tanh', 'bert/encoder/layer_5/intermediate/dense/add_1', 'bert/encoder/layer_5/intermediate/dense/mul_2', 'bert/encoder/layer_5/intermediate/dense/mul_3', 'bert/encoder/layer_5/output/dense/MatMul', 'bert/encoder/layer_5/output/dense/BiasAdd', 'bert/encoder/layer_5/output/add', 'bert/encoder/layer_5/output/LayerNorm/moments/mean', 'bert/encoder/layer_5/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_5/output/LayerNorm/moments/SquaredDifference__383', 'bert/encoder/layer_5/output/LayerNorm/moments/variance', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/Rsqrt__385', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_6/attention/self/value/MatMul', 'bert/encoder/layer_6/attention/self/value/BiasAdd', 'bert/encoder/layer_6/attention/self/Reshape_2', 'bert/encoder/layer_6/attention/self/transpose_2', 'bert/encoder/layer_6/attention/self/query/MatMul', 'bert/encoder/layer_6/attention/self/query/BiasAdd', 'bert/encoder/layer_6/attention/self/Reshape', 
'bert/encoder/layer_6/attention/self/transpose', 'bert/encoder/layer_6/attention/self/key/MatMul', 'bert/encoder/layer_6/attention/self/key/BiasAdd', 'bert/encoder/layer_6/attention/self/Reshape_1', 'bert/encoder/layer_6/attention/self/MatMul__390', 'bert/encoder/layer_6/attention/self/MatMul', 'bert/encoder/layer_6/attention/self/Mul', 'bert/encoder/layer_6/attention/self/add', 'bert/encoder/layer_6/attention/self/Softmax', 'bert/encoder/layer_6/attention/self/MatMul_1', 'bert/encoder/layer_6/attention/self/transpose_3', 'bert/encoder/layer_6/attention/self/Reshape_3', 'bert/encoder/layer_6/attention/output/dense/MatMul', 'bert/encoder/layer_6/attention/output/dense/BiasAdd', 'bert/encoder/layer_6/attention/output/add', 'bert/encoder/layer_6/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_6/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_6/attention/output/LayerNorm/moments/SquaredDifference__393', 'bert/encoder/layer_6/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/Rsqrt__395', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_6/intermediate/dense/MatMul', 'bert/encoder/layer_6/intermediate/dense/BiasAdd', 'bert/encoder/layer_6/intermediate/dense/Pow', 'bert/encoder/layer_6/intermediate/dense/mul', 'bert/encoder/layer_6/intermediate/dense/add', 'bert/encoder/layer_6/intermediate/dense/mul_1', 'bert/encoder/layer_6/intermediate/dense/Tanh', 'bert/encoder/layer_6/intermediate/dense/add_1', 'bert/encoder/layer_6/intermediate/dense/mul_2', 'bert/encoder/layer_6/intermediate/dense/mul_3', 'bert/encoder/layer_6/output/dense/MatMul', 'bert/encoder/layer_6/output/dense/BiasAdd', 'bert/encoder/layer_6/output/add', 'bert/encoder/layer_6/output/LayerNorm/moments/mean', 'bert/encoder/layer_6/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_6/output/LayerNorm/moments/SquaredDifference__397', 'bert/encoder/layer_6/output/LayerNorm/moments/variance', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/Rsqrt__399', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_7/attention/self/value/MatMul', 'bert/encoder/layer_7/attention/self/value/BiasAdd', 'bert/encoder/layer_7/attention/self/Reshape_2', 'bert/encoder/layer_7/attention/self/transpose_2', 'bert/encoder/layer_7/attention/self/query/MatMul', 'bert/encoder/layer_7/attention/self/query/BiasAdd', 'bert/encoder/layer_7/attention/self/Reshape', 'bert/encoder/layer_7/attention/self/transpose', 'bert/encoder/layer_7/attention/self/key/MatMul', 'bert/encoder/layer_7/attention/self/key/BiasAdd', 'bert/encoder/layer_7/attention/self/Reshape_1', 'bert/encoder/layer_7/attention/self/MatMul__404', 'bert/encoder/layer_7/attention/self/MatMul', 
'bert/encoder/layer_7/attention/self/Mul', 'bert/encoder/layer_7/attention/self/add', 'bert/encoder/layer_7/attention/self/Softmax', 'bert/encoder/layer_7/attention/self/MatMul_1', 'bert/encoder/layer_7/attention/self/transpose_3', 'bert/encoder/layer_7/attention/self/Reshape_3', 'bert/encoder/layer_7/attention/output/dense/MatMul', 'bert/encoder/layer_7/attention/output/dense/BiasAdd', 'bert/encoder/layer_7/attention/output/add', 'bert/encoder/layer_7/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_7/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_7/attention/output/LayerNorm/moments/SquaredDifference__407', 'bert/encoder/layer_7/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/Rsqrt__409', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_7/intermediate/dense/MatMul', 'bert/encoder/layer_7/intermediate/dense/BiasAdd', 'bert/encoder/layer_7/intermediate/dense/Pow', 'bert/encoder/layer_7/intermediate/dense/mul', 'bert/encoder/layer_7/intermediate/dense/add', 'bert/encoder/layer_7/intermediate/dense/mul_1', 'bert/encoder/layer_7/intermediate/dense/Tanh', 'bert/encoder/layer_7/intermediate/dense/add_1', 'bert/encoder/layer_7/intermediate/dense/mul_2', 'bert/encoder/layer_7/intermediate/dense/mul_3', 'bert/encoder/layer_7/output/dense/MatMul', 'bert/encoder/layer_7/output/dense/BiasAdd', 'bert/encoder/layer_7/output/add', 'bert/encoder/layer_7/output/LayerNorm/moments/mean', 'bert/encoder/layer_7/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_7/output/LayerNorm/moments/SquaredDifference__411', 'bert/encoder/layer_7/output/LayerNorm/moments/variance', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/Rsqrt__413', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_8/attention/self/value/MatMul', 'bert/encoder/layer_8/attention/self/value/BiasAdd', 'bert/encoder/layer_8/attention/self/Reshape_2', 'bert/encoder/layer_8/attention/self/transpose_2', 'bert/encoder/layer_8/attention/self/query/MatMul', 'bert/encoder/layer_8/attention/self/query/BiasAdd', 'bert/encoder/layer_8/attention/self/Reshape', 'bert/encoder/layer_8/attention/self/transpose', 'bert/encoder/layer_8/attention/self/key/MatMul', 'bert/encoder/layer_8/attention/self/key/BiasAdd', 'bert/encoder/layer_8/attention/self/Reshape_1', 'bert/encoder/layer_8/attention/self/MatMul__418', 'bert/encoder/layer_8/attention/self/MatMul', 'bert/encoder/layer_8/attention/self/Mul', 'bert/encoder/layer_8/attention/self/add', 'bert/encoder/layer_8/attention/self/Softmax', 'bert/encoder/layer_8/attention/self/MatMul_1', 'bert/encoder/layer_8/attention/self/transpose_3', 'bert/encoder/layer_8/attention/self/Reshape_3', 
'bert/encoder/layer_8/attention/output/dense/MatMul', 'bert/encoder/layer_8/attention/output/dense/BiasAdd', 'bert/encoder/layer_8/attention/output/add', 'bert/encoder/layer_8/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_8/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_8/attention/output/LayerNorm/moments/SquaredDifference__421', 'bert/encoder/layer_8/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/Rsqrt__423', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_8/intermediate/dense/MatMul', 'bert/encoder/layer_8/intermediate/dense/BiasAdd', 'bert/encoder/layer_8/intermediate/dense/Pow', 'bert/encoder/layer_8/intermediate/dense/mul', 'bert/encoder/layer_8/intermediate/dense/add', 'bert/encoder/layer_8/intermediate/dense/mul_1', 'bert/encoder/layer_8/intermediate/dense/Tanh', 'bert/encoder/layer_8/intermediate/dense/add_1', 'bert/encoder/layer_8/intermediate/dense/mul_2', 'bert/encoder/layer_8/intermediate/dense/mul_3', 'bert/encoder/layer_8/output/dense/MatMul', 'bert/encoder/layer_8/output/dense/BiasAdd', 'bert/encoder/layer_8/output/add', 'bert/encoder/layer_8/output/LayerNorm/moments/mean', 'bert/encoder/layer_8/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_8/output/LayerNorm/moments/SquaredDifference__425', 'bert/encoder/layer_8/output/LayerNorm/moments/variance', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/Rsqrt__427', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_9/attention/self/value/MatMul', 'bert/encoder/layer_9/attention/self/value/BiasAdd', 'bert/encoder/layer_9/attention/self/Reshape_2', 'bert/encoder/layer_9/attention/self/transpose_2', 'bert/encoder/layer_9/attention/self/query/MatMul', 'bert/encoder/layer_9/attention/self/query/BiasAdd', 'bert/encoder/layer_9/attention/self/Reshape', 'bert/encoder/layer_9/attention/self/transpose', 'bert/encoder/layer_9/attention/self/key/MatMul', 'bert/encoder/layer_9/attention/self/key/BiasAdd', 'bert/encoder/layer_9/attention/self/Reshape_1', 'bert/encoder/layer_9/attention/self/MatMul__432', 'bert/encoder/layer_9/attention/self/MatMul', 'bert/encoder/layer_9/attention/self/Mul', 'bert/encoder/layer_9/attention/self/add', 'bert/encoder/layer_9/attention/self/Softmax', 'bert/encoder/layer_9/attention/self/MatMul_1', 'bert/encoder/layer_9/attention/self/transpose_3', 'bert/encoder/layer_9/attention/self/Reshape_3', 'bert/encoder/layer_9/attention/output/dense/MatMul', 'bert/encoder/layer_9/attention/output/dense/BiasAdd', 'bert/encoder/layer_9/attention/output/add', 'bert/encoder/layer_9/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_9/attention/output/LayerNorm/moments/SquaredDifference', 
'bert/encoder/layer_9/attention/output/LayerNorm/moments/SquaredDifference__435', 'bert/encoder/layer_9/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/Rsqrt__437', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_9/intermediate/dense/MatMul', 'bert/encoder/layer_9/intermediate/dense/BiasAdd', 'bert/encoder/layer_9/intermediate/dense/Pow', 'bert/encoder/layer_9/intermediate/dense/mul', 'bert/encoder/layer_9/intermediate/dense/add', 'bert/encoder/layer_9/intermediate/dense/mul_1', 'bert/encoder/layer_9/intermediate/dense/Tanh', 'bert/encoder/layer_9/intermediate/dense/add_1', 'bert/encoder/layer_9/intermediate/dense/mul_2', 'bert/encoder/layer_9/intermediate/dense/mul_3', 'bert/encoder/layer_9/output/dense/MatMul', 'bert/encoder/layer_9/output/dense/BiasAdd', 'bert/encoder/layer_9/output/add', 'bert/encoder/layer_9/output/LayerNorm/moments/mean', 'bert/encoder/layer_9/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_9/output/LayerNorm/moments/SquaredDifference__439', 'bert/encoder/layer_9/output/LayerNorm/moments/variance', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/Rsqrt__441', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_10/attention/self/value/MatMul', 'bert/encoder/layer_10/attention/self/value/BiasAdd', 'bert/encoder/layer_10/attention/self/Reshape_2', 'bert/encoder/layer_10/attention/self/transpose_2', 'bert/encoder/layer_10/attention/self/query/MatMul', 'bert/encoder/layer_10/attention/self/query/BiasAdd', 'bert/encoder/layer_10/attention/self/Reshape', 'bert/encoder/layer_10/attention/self/transpose', 'bert/encoder/layer_10/attention/self/key/MatMul', 'bert/encoder/layer_10/attention/self/key/BiasAdd', 'bert/encoder/layer_10/attention/self/Reshape_1', 'bert/encoder/layer_10/attention/self/MatMul__446', 'bert/encoder/layer_10/attention/self/MatMul', 'bert/encoder/layer_10/attention/self/Mul', 'bert/encoder/layer_10/attention/self/add', 'bert/encoder/layer_10/attention/self/Softmax', 'bert/encoder/layer_10/attention/self/MatMul_1', 'bert/encoder/layer_10/attention/self/transpose_3', 'bert/encoder/layer_10/attention/self/Reshape_3', 'bert/encoder/layer_10/attention/output/dense/MatMul', 'bert/encoder/layer_10/attention/output/dense/BiasAdd', 'bert/encoder/layer_10/attention/output/add', 'bert/encoder/layer_10/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_10/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_10/attention/output/LayerNorm/moments/SquaredDifference__449', 'bert/encoder/layer_10/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/Rsqrt', 
'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/Rsqrt__451', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_10/intermediate/dense/MatMul', 'bert/encoder/layer_10/intermediate/dense/BiasAdd', 'bert/encoder/layer_10/intermediate/dense/Pow', 'bert/encoder/layer_10/intermediate/dense/mul', 'bert/encoder/layer_10/intermediate/dense/add', 'bert/encoder/layer_10/intermediate/dense/mul_1', 'bert/encoder/layer_10/intermediate/dense/Tanh', 'bert/encoder/layer_10/intermediate/dense/add_1', 'bert/encoder/layer_10/intermediate/dense/mul_2', 'bert/encoder/layer_10/intermediate/dense/mul_3', 'bert/encoder/layer_10/output/dense/MatMul', 'bert/encoder/layer_10/output/dense/BiasAdd', 'bert/encoder/layer_10/output/add', 'bert/encoder/layer_10/output/LayerNorm/moments/mean', 'bert/encoder/layer_10/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_10/output/LayerNorm/moments/SquaredDifference__453', 'bert/encoder/layer_10/output/LayerNorm/moments/variance', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/Rsqrt__455', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_11/attention/self/value/MatMul', 'bert/encoder/layer_11/attention/self/value/BiasAdd', 'bert/encoder/layer_11/attention/self/Reshape_2', 'bert/encoder/layer_11/attention/self/transpose_2', 'bert/encoder/layer_11/attention/self/query/MatMul', 'bert/encoder/layer_11/attention/self/query/BiasAdd', 'bert/encoder/layer_11/attention/self/Reshape', 'bert/encoder/layer_11/attention/self/transpose', 'bert/encoder/layer_11/attention/self/key/MatMul', 'bert/encoder/layer_11/attention/self/key/BiasAdd', 'bert/encoder/layer_11/attention/self/Reshape_1', 'bert/encoder/layer_11/attention/self/MatMul__460', 'bert/encoder/layer_11/attention/self/MatMul', 'bert/encoder/layer_11/attention/self/Mul', 'bert/encoder/layer_11/attention/self/add', 'bert/encoder/layer_11/attention/self/Softmax', 'bert/encoder/layer_11/attention/self/MatMul_1', 'bert/encoder/layer_11/attention/self/transpose_3', 'bert/encoder/layer_11/attention/self/Reshape_3', 'bert/encoder/layer_11/attention/output/dense/MatMul', 'bert/encoder/layer_11/attention/output/dense/BiasAdd', 'bert/encoder/layer_11/attention/output/add', 'bert/encoder/layer_11/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_11/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_11/attention/output/LayerNorm/moments/SquaredDifference__463', 'bert/encoder/layer_11/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/Rsqrt__465', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul_2', 
'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_11/intermediate/dense/MatMul', 'bert/encoder/layer_11/intermediate/dense/BiasAdd', 'bert/encoder/layer_11/intermediate/dense/Pow', 'bert/encoder/layer_11/intermediate/dense/mul', 'bert/encoder/layer_11/intermediate/dense/add', 'bert/encoder/layer_11/intermediate/dense/mul_1', 'bert/encoder/layer_11/intermediate/dense/Tanh', 'bert/encoder/layer_11/intermediate/dense/add_1', 'bert/encoder/layer_11/intermediate/dense/mul_2', 'bert/encoder/layer_11/intermediate/dense/mul_3', 'bert/encoder/layer_11/output/dense/MatMul', 'bert/encoder/layer_11/output/dense/BiasAdd', 'bert/encoder/layer_11/output/add', 'bert/encoder/layer_11/output/LayerNorm/moments/mean', 'bert/encoder/layer_11/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_11/output/LayerNorm/moments/SquaredDifference__467', 'bert/encoder/layer_11/output/LayerNorm/moments/variance', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/Rsqrt__469', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/add_1', 'MatMul', 'BiasAdd', 'Reshape_1', 'transpose', 'unstack', 'unstack__490', 'unstack__488']
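
For reference, a node-name dump like the one above can be reproduced straight from the ONNX graph itself. A minimal sketch in Python, assuming the model file is saved as "bert_squad.onnx" (a placeholder name, not taken from this log) and the standard onnx package is installed:

    import onnx

    # Load the graph and print every node name together with its operator type.
    # The names correspond to entries above, e.g. 'bert/encoder/layer_11/output/add'.
    model = onnx.load("bert_squad.onnx")
    for node in model.graph.node:
        print(node.name, node.op_type)
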
Generating '/tmp/nsys-report-f0a2.qdstrm'

[1/8] [========================100%] nsys-report-571e.nsys-rep

[2/8] [========================100%] nsys-report-61d4.sqlite
[3/8] Executing 'nvtx_sum' stats report
[4/8] Executing 'osrt_sum' stats report

 Time (%)  Total Time (ns)  Num Calls    Avg (ns)       Med (ns)      Min (ns)      Max (ns)     StdDev (ns)            Name         
 --------  ---------------  ---------  -------------  -------------  -----------  -------------  ------------  ----------------------
     45.7    6,432,425,876         75   85,765,678.3  100,148,510.0        1,131    541,513,176  66,972,862.5  poll                  
     39.1    5,502,111,747         11  500,191,977.0  500,091,666.0  500,064,476    500,420,812     149,209.3  pthread_cond_timedwait
     14.5    2,042,489,720      5,642      362,015.2        1,070.0          289  1,254,612,700  16,703,631.1  read                  
      0.4       58,720,560      1,956       30,020.7        8,750.0          209     10,802,610     321,096.5  ioctl                 
      0.1       10,278,541      3,183        3,229.2        3,030.0        1,140         38,491       1,489.3  open64                
      0.0        5,062,699          1    5,062,699.0    5,062,699.0    5,062,699      5,062,699           0.0  nanosleep             
      0.0        4,719,085         13      363,006.5       59,941.0       55,530      3,943,222   1,075,769.4  sleep                 
      0.0        3,766,705    131,629           28.6           30.0           20          7,130          42.7  pthread_cond_signal   
      0.0        3,163,169        138       22,921.5        6,209.5        2,680      1,600,635     136,671.4  mmap64                
      0.0          925,766         71       13,039.0          780.0          560        589,719      71,844.6  pread64               
      0.0          627,249        583        1,075.9           60.0           20         68,461       6,293.8  fgets                 
      0.0          519,503         28       18,553.7       10,090.0        1,950        100,281      23,969.3  mmap                  
      0.0          519,438         10       51,943.8       53,621.0       39,950         66,231       9,756.9  sem_timedwait         
      0.0          372,085          8       46,510.6       39,130.5       29,550         68,721      15,516.6  pthread_create        
      0.0          229,373         29        7,909.4        2,660.0          510         53,831      12,015.8  write                 
      0.0          212,594         44        4,831.7        2,660.0          960         29,991       5,224.3  fopen                 
      0.0          175,133         10       17,513.3        4,395.0        2,060         72,471      27,997.5  munmap                
      0.0          137,962          1      137,962.0      137,962.0      137,962        137,962           0.0  pthread_cond_wait     
      0.0           73,600         15        4,906.7        3,600.0        1,800         24,230       5,506.1  open                  
      0.0           70,141          1       70,141.0       70,141.0       70,141         70,141           0.0  waitpid               
      0.0           67,089         41        1,636.3        1,150.0          660          9,470       1,507.4  fclose                
      0.0           61,480      1,622           37.9           30.0           20          4,520         139.5  pthread_cond_broadcast
      0.0           47,891          2       23,945.5       23,945.5        9,740         38,151      20,089.6  connect               
      0.0           36,161          6        6,026.8        5,765.0        1,990         10,950       3,328.6  fopen64               
      0.0           31,760        133          238.8          230.0           20          1,720         177.7  sigaction             
      0.0           30,051          6        5,008.5        4,375.0        2,020          9,390       2,818.8  pipe2                 
      0.0           30,050          4        7,512.5        7,730.0        3,390         11,200       3,708.9  socket                
      0.0           23,246         68          341.9          300.0          170          1,200         197.7  fcntl                 
      0.0           21,022        541           38.9           40.0           29            930          38.8  flockfile             
      0.0           20,870        256           81.5           30.0           20            620          79.9  pthread_mutex_trylock 
      0.0           16,121          3        5,373.7        5,240.0        3,690          7,191       1,754.3  fread                 
      0.0            7,630          2        3,815.0        3,815.0        1,640          5,990       3,075.9  bind                  
      0.0            3,310          2        1,655.0        1,655.0          990          2,320         940.5  fwrite                
      0.0            3,300         10          330.0          270.0          200            550         125.2  dup                   
      0.0            2,540         30           84.7           30.0           20            690         143.0  fflush                
      0.0            2,020          1        2,020.0        2,020.0        2,020          2,020           0.0  getc                  
      0.0            1,010          2          505.0          505.0          240            770         374.8  dup2                  
      0.0              650          1          650.0          650.0          650            650           0.0  listen                

[5/8] Executing 'cuda_api_sum' stats report

 Time (%)  Total Time (ns)  Num Calls   Avg (ns)     Med (ns)    Min (ns)    Max (ns)   StdDev (ns)                      Name                     
 --------  ---------------  ---------  -----------  -----------  ---------  ----------  -----------  ---------------------------------------------
     57.9      242,878,568      1,676    144,915.6     29,675.5      2,300   2,906,016    301,718.2  cudaMemcpyAsync                              
     19.8       82,993,996      1,676     49,519.1     10,610.5        630     252,644     69,376.8  cudaStreamSynchronize                        
     17.7       74,149,494        644    115,139.0      7,840.5      3,620  15,954,210    972,290.5  cudaLaunchKernel                             
      1.7        7,210,163          2  3,605,081.5  3,605,081.5  1,087,907   6,122,256  3,559,822.3  cudaFree                                     
      1.7        6,974,859          9    774,984.3      1,760.0        310   6,961,489  2,319,939.3  cudaStreamIsCapturing_v10000                 
      0.6        2,317,668         49     47,299.3     47,240.0     40,531      50,931      2,786.7  cuCtxSynchronize                             
      0.2        1,045,127          9    116,125.2    141,092.0      1,980     204,283     67,750.9  cudaMalloc                                   
      0.2          732,474         49     14,948.4     14,840.0      8,910      26,710      2,883.1  cuLaunchKernel                               
      0.1          382,191         62      6,164.4      4,000.0      3,051      24,071      4,022.1  cudaMemsetAsync                              
      0.1          340,324      1,532        222.1        190.0         50       3,290        154.9  cuGetProcAddress_v2                          
      0.0          189,144          2     94,572.0     94,572.0     83,002     106,142     16,362.5  cuModuleLoadData                             
      0.0          163,282          1    163,282.0    163,282.0    163,282     163,282          0.0  cudaGetDeviceProperties_v2_v12000            
      0.0           92,013         26      3,539.0      3,480.0        361       7,220      1,657.8  cudaOccupancyMaxActiveBlocksPerMultiprocessor
      0.0           18,060         18      1,003.3        255.0        160      11,380      2,622.5  cudaEventCreateWithFlags                     
      0.0            4,420          4      1,105.0      1,110.0        480       1,720        507.9  cuInit                                       
      0.0            3,350          1      3,350.0      3,350.0      3,350       3,350          0.0  cuMemFree_v2                                 
      0.0            1,150          2        575.0        575.0        220         930        502.0  cudaGetDriverEntryPoint_v11030               
      0.0              990          1        990.0        990.0        990         990          0.0  cuCtxSetCurrent                              
      0.0              830          4        207.5        240.0         70         280         93.9  cuModuleGetLoadingMode                       
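
Note that cudaMemcpyAsync (57.9%) and cudaStreamSynchronize (19.8%) together account for nearly 78% of the CUDA API time, so host-device traffic rather than kernel launches dominates the API side. If the per-node executor stages tensors through PyTorch (the at::native kernel names in the next report suggest so), pinned host memory with non-blocking copies is one common mitigation; a minimal sketch under that assumption, with an illustrative tensor shape:

    import torch

    # Hypothetical helper: queue a host-to-device copy without forcing a
    # cudaStreamSynchronize after every single transfer.
    def to_device_async(host_tensor, device="cuda"):
        # pin_memory() allows cudaMemcpyAsync to run truly asynchronously;
        # non_blocking=True defers synchronization to a single point later.
        return host_tensor.pin_memory().to(device, non_blocking=True)

    x = torch.randn(1, 256, 768, dtype=torch.float16)  # shape chosen for illustration only
    x_gpu = to_device_async(x)
    torch.cuda.synchronize()  # one synchronize at the end instead of one per copy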

[6/8] Executing 'cuda_gpu_kern_sum' stats report

 Time (%)  Total Time (ns)  Instances  Avg (ns)  Med (ns)  Min (ns)  Max (ns)  StdDev (ns)                                                  Name                                                
 --------  ---------------  ---------  --------  --------  --------  --------  -----------  ----------------------------------------------------------------------------------------------------
     48.6        2,148,688         49  43,850.8  43,776.0    43,648    45,920        330.0  cutlass_tensorop_f16_s16816gemm_f16_256x128_32x3_tt_align8                                          
      7.4          325,953         48   6,790.7   6,720.0     6,528     7,104        180.6  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      6.8          299,073         12  24,922.8  24,896.0    24,832    25,248        110.7  ampere_fp16_s16816gemm_fp16_128x64_ldg8_f2f_stages_64x4_nn                                          
      4.6          204,196         72   2,836.1   2,880.0     2,336     3,392        382.1  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      4.0          175,554         48   3,657.4   3,728.0     3,424     3,904        228.3  void at::native::reduce_kernel<(int)512, (int)1, at::native::ReduceOp<c10::Half, at::native::MeanOp…
      3.3          143,904         12  11,992.0  11,936.0    11,903    12,512        168.9  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      2.9          126,590         24   5,274.6   5,344.0     4,768     5,664        285.1  void cutlass::Kernel<cutlass_80_wmma_tensorop_f16_s161616gemm_f16_32x32_64x1_nn_align8>(T1::Params) 
      2.3          103,104         48   2,148.0   2,176.0     1,984     2,368        136.9  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      2.3          100,705         36   2,797.4   2,240.0     2,208     3,968        807.3  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      2.2           96,416         12   8,034.7   8,032.0     7,968     8,096         42.0  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      2.2           95,072         12   7,922.7   7,936.0     7,776     8,032         64.7  ampere_fp16_s16816gemm_fp16_64x64_ldg8_f2f_stages_64x5_nn                                           
      1.9           84,607         12   7,050.6   7,072.0     6,816     7,104         77.6  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      1.5           65,825         36   1,828.5   1,504.0     1,472     2,560        498.2  void at::native::vectorized_elementwise_kernel<(int)4, at::native::CUDAFunctor_add<c10::Half>, at::…
      1.5           65,344         27   2,420.1   2,432.0     1,824     2,592        136.4  void at::native::elementwise_kernel<(int)128, (int)2, void at::native::gpu_kernel_impl<at::native::…
      1.5           65,282         36   1,813.4   1,472.0     1,440     2,560        505.3  void at::native::vectorized_elementwise_kernel<(int)4, at::native::BinaryFunctor<c10::Half, c10::Ha…
      1.5           64,640         12   5,386.7   5,328.0     5,216     6,113        239.5  void <unnamed>::softmax_warp_forward<float, float, float, (int)8, (bool)0, (bool)0>(T2 *, const T1 …
      1.4           61,664         49   1,258.4   1,088.0       800     1,888        339.4  void at::native::vectorized_elementwise_kernel<(int)4, at::native::FillFunctor<c10::Half>, at::deta…
      0.7           31,390         24   1,307.9   1,312.0     1,184     1,344         27.2  void at::native::unrolled_elementwise_kernel<at::native::CUDAFunctor_add<c10::Half>, at::detail::Ar…
      0.7           31,200         12   2,600.0   2,560.0     2,528     3,040        139.2  void at::native::vectorized_elementwise_kernel<(int)4, at::native::tanh_kernel_cuda(at::TensorItera…
      0.6           27,426         27   1,015.8     992.0       928     1,824        162.3  void at::native::vectorized_elementwise_kernel<(int)4, at::native::CUDAFunctor_add<float>, at::deta…
      0.6           27,102         24   1,129.3   1,120.0     1,056     1,152         22.1  void at::native::vectorized_elementwise_kernel<(int)4, at::native::reciprocal_kernel_cuda(at::Tenso…
      0.6           26,304         24   1,096.0   1,088.0     1,056     1,120         17.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::sqrt_kernel_cuda(at::TensorItera…
      0.5           22,240         24     926.7     928.0       896       929          6.5  void at::native::vectorized_elementwise_kernel<(int)4, at::native::AUnaryFunctor<c10::Half, c10::Ha…
      0.2            9,408          5   1,881.6   1,664.0     1,344     2,528        519.6  void at::native::elementwise_kernel<(int)128, (int)2, void at::native::gpu_kernel_impl<at::native::…
      0.2            7,775          2   3,887.5   3,887.5     3,744     4,031        202.9  void at::native::reduce_kernel<(int)512, (int)1, at::native::ReduceOp<float, at::native::MeanOps<fl…
      0.1            2,656          1   2,656.0   2,656.0     2,656     2,656          0.0  void at::native::unrolled_elementwise_kernel<at::native::CUDAFunctor_add<float>, at::detail::Array<…
      0.0            2,144          1   2,144.0   2,144.0     2,144     2,144          0.0  void cutlass::Kernel<cutlass_80_wmma_tensorop_f16_s161616gemm_f16_32x32_32x1_nn_align2>(T1::Params) 
      0.0            1,856          1   1,856.0   1,856.0     1,856     1,856          0.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::BinaryFunctor<float, float, floa…
      0.0            1,024          1   1,024.0   1,024.0     1,024     1,024          0.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::reciprocal_kernel_cuda(at::Tenso…
      0.0            1,024          1   1,024.0   1,024.0     1,024     1,024          0.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::sqrt_kernel_cuda(at::TensorItera…
      0.0              896          1     896.0     896.0       896       896          0.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::AUnaryFunctor<float, float, floa…

[7/8] Executing 'cuda_gpu_mem_time_sum' stats report

 Time (%)  Total Time (ns)  Count  Avg (ns)   Med (ns)  Min (ns)  Max (ns)   StdDev (ns)           Operation          
 --------  ---------------  -----  ---------  --------  --------  ---------  -----------  ----------------------------
     58.7       96,445,173  1,092   88,319.8  61,121.0       287    712,771    134,050.2  [CUDA memcpy Host-to-Device]
     41.3       67,739,236    584  115,991.8  59,264.0       928  1,231,205    172,934.3  [CUDA memcpy Device-to-Host]
      0.0           27,361     62      441.3     320.0       287      1,216        238.9  [CUDA memset]               

[8/8] Executing 'cuda_gpu_mem_size_sum' stats report

 Total (MB)  Count  Avg (MB)  Med (MB)  Min (MB)  Max (MB)  StdDev (MB)           Operation          
 ----------  -----  --------  --------  --------  --------  -----------  ----------------------------
    626.375  1,092     0.574     0.393     0.000     4.719        0.881  [CUDA memcpy Host-to-Device]
    393.055    584     0.673     0.393     0.000     3.146        0.785  [CUDA memcpy Device-to-Host]
      0.001     62     0.000     0.000     0.000     0.000        0.000  [CUDA memset]               

Generated:
    /tmp/nsys-report-571e.nsys-rep
    /tmp/nsys-report-61d4.sqlite
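
The exported .sqlite file can also be queried directly for the same aggregates the stats reports print. A minimal sketch, assuming this Nsight Systems version exposes the usual CUPTI_ACTIVITY_KIND_KERNEL table with start/end timestamps in nanoseconds (table and column names vary between nsys releases):

    import sqlite3

    # Sum total GPU kernel time from the exported report.
    # Table/column names are assumptions; adjust to the actual schema if needed.
    conn = sqlite3.connect("/tmp/nsys-report-61d4.sqlite")
    (total_ns,) = conn.execute(
        "SELECT SUM(end - start) FROM CUPTI_ACTIVITY_KIND_KERNEL"
    ).fetchone()
    print("total kernel time (ns):", total_ns)
    conn.close()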