Model Input Name: unique_ids_raw_output___9:0, Shape: [0]
Model Input Name: segment_ids:0, Shape: [0, 256]
Model Input Name: input_mask:0, Shape: [0, 256]
Model Input Name: input_ids:0, Shape: [0, 256]
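The leading dimension of 0 in these declared shapes denotes a dynamic (batch) dimension in the exported graph. As a point of reference only, a minimal sketch of listing the same metadata with onnxruntime, assuming the graph is available as an ONNX file (the file name and the use of onnxruntime are assumptions; the log itself comes from a custom executor):

import onnxruntime as ort

# Hypothetical file name; the actual model path is not shown in the log.
sess = ort.InferenceSession("model.onnx")

for inp in sess.get_inputs():
    # Dynamic dimensions are reported as None or a symbolic name;
    # the executor above prints them as 0.
    print(f"Model Input Name: {inp.name}, Shape: {inp.shape}")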
Starting model execution...

Inputs Details:
Input Name: input_ids:0
Shape: (1, 256)
Data (first 10 values): [28148 26736   988  2528  1043  3349  8281 12138  2763 27770]...
--------------------------------------------------
Input Name: segment_ids:0
Shape: (1, 256)
Data (first 10 values): [0 1 0 0 0 0 1 0 1 0]...
--------------------------------------------------
Input Name: input_mask:0
Shape: (1, 256)
Data (first 10 values): [1 1 0 0 1 0 1 0 1 1]...
--------------------------------------------------
Input Name: unique_ids_raw_output___9:0
Shape: (0,)
Data (first 10 values): []...
--------------------------------------------------
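The values above look like randomly generated test data rather than a tokenized sentence (segment_ids and input_mask are not contiguous). A hedged sketch of how such a (1, 256) feed could be built with NumPy; the input names mirror the log, while the vocabulary size (30522) and integer dtype are assumptions based on standard BERT:

import numpy as np

seq_len = 256
rng = np.random.default_rng()

# Hypothetical test feed; the original generation code is not shown in the log.
feed = {
    "input_ids:0":   rng.integers(0, 30522, size=(1, seq_len), dtype=np.int64),
    "segment_ids:0": rng.integers(0, 2, size=(1, seq_len), dtype=np.int64),
    "input_mask:0":  rng.integers(0, 2, size=(1, seq_len), dtype=np.int64),
    "unique_ids_raw_output___9:0": np.array([], dtype=np.int64),  # shape (0,)
}

for name, data in feed.items():
    print(f"Input Name: {name}")
    print(f"Shape: {data.shape}")
    print(f"Data (first 10 values): {data.flatten()[:10]}...")
    print("-" * 50)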
No Add node related to MatMul output: bert/embeddings/MatMul. Executing regular MatMul.
Fusing MatMul with Add for Node: bert/encoder/layer_0/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_0/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_0/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_0/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_0/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_0/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_0/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_0/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_0/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_0/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_0/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_1/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_1/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_1/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_1/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_1/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_1/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_1/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_1/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_1/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_1/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_1/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_2/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_2/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_2/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_2/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_2/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_2/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_2/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_2/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_2/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_2/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_2/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_3/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_3/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_3/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_3/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_3/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_3/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_3/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_3/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_3/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_3/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_3/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_4/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_4/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_4/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_4/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_4/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_4/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_4/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_4/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_4/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_4/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_4/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_5/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_5/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_5/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_5/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_5/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_5/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_5/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_5/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_5/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_5/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_5/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_6/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_6/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_6/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_6/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_6/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_6/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_6/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_6/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_6/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_6/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_6/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_7/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_7/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_7/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_7/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_7/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_7/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_7/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_7/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_7/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_7/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_7/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_8/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_8/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_8/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_8/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_8/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_8/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_8/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_8/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_8/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_8/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_8/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_9/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_9/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_9/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_9/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_9/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_9/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_9/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_9/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_9/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_9/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_9/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_10/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_10/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_10/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_10/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_10/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_10/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_10/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_10/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_10/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_10/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_10/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_11/attention/self/value/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/attention/self/value/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_11/attention/self/query/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/attention/self/query/BiasAdd
Fusing MatMul with Add for Node: bert/encoder/layer_11/attention/self/key/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/attention/self/key/BiasAdd
No Add node related to MatMul output: bert/encoder/layer_11/attention/self/MatMul. Executing regular MatMul.
No Add node related to MatMul output: bert/encoder/layer_11/attention/self/MatMul_1. Executing regular MatMul.
Fusing MatMul with 2Add for Node: bert/encoder/layer_11/attention/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/attention/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_11/attention/output/add
Fusing MatMul with Add for Node: bert/encoder/layer_11/intermediate/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/intermediate/dense/BiasAdd
Fusing MatMul with 2Add for Node: bert/encoder/layer_11/output/dense/MatMul
Skipping already processed Add Node: bert/encoder/layer_11/output/dense/BiasAdd
Skipping already processed Add Node: bert/encoder/layer_11/output/add
Fusing MatMul with Add for Node: MatMul
Skipping already processed Add Node: BiasAdd
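The messages above suggest a fusion pass that, for each MatMul, follows its output to at most two downstream Add nodes (the bias, and for the output projections also the residual add), fuses them into a single call, and marks the consumed Adds so they are skipped when the traversal reaches them. A rough sketch of that detection logic over an ONNX graph, under those assumptions (helper and file names are hypothetical; the engine's real implementation is not shown):

import onnx

def plan_matmul_fusions(graph):
    """Sketch: for each MatMul, find up to two downstream Adds (bias + residual) to fuse."""
    consumers = {}
    for node in graph.node:
        for name in node.input:
            consumers.setdefault(name, []).append(node)

    fused_adds = set()
    for node in graph.node:
        if node.op_type != "MatMul":
            continue
        adds, out = [], node.output[0]
        while len(adds) < 2:
            nxt = [n for n in consumers.get(out, []) if n.op_type == "Add"]
            if len(nxt) != 1:       # no unique Add consumer: stop following the chain
                break
            adds.append(nxt[0])
            out = nxt[0].output[0]
        if not adds:
            print(f"No Add node related to MatMul output: {node.name}. Executing regular MatMul.")
            continue
        label = "Add" if len(adds) == 1 else "2Add"
        print(f"Fusing MatMul with {label} for Node: {node.name}")
        for a in adds:
            fused_adds.add(a.name)  # the executor later skips these as already processed
    return fused_adds

# model = onnx.load("model.onnx")   # hypothetical path
# plan_matmul_fusions(model.graph)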

Node Execution Times:
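Each entry below was presumably produced by wrapping a node's compute call with a wall-clock timer. A minimal sketch of such instrumentation, assuming a hypothetical run_node callable (the engine's actual timing code is not shown):

import time

def timed_run(node_name, run_node, *args, **kwargs):
    """Run a single node and report its wall-clock execution time."""
    start = time.perf_counter()
    result = run_node(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"Node: {node_name}, Execution Time: {elapsed:.6f} seconds")
    return result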
Node: unique_ids_graph_outputs_Identity__10, Execution Time: 0.000005 seconds
Node: bert/encoder/Shape, Execution Time: 0.000004 seconds
Node: bert/encoder/Shape__12, Execution Time: 0.000009 seconds
Node: bert/encoder/strided_slice, Execution Time: 0.000062 seconds
Node: bert/encoder/strided_slice__16, Execution Time: 0.000005 seconds
Node: bert/encoder/strided_slice__17, Execution Time: 0.000005 seconds
Node: bert/encoder/ones/packed_Unsqueeze__18, Execution Time: 0.000010 seconds
Node: bert/encoder/ones/packed_Concat__21, Execution Time: 0.000010 seconds
Node: bert/encoder/ones__22, Execution Time: 0.000004 seconds
Node: bert/encoder/ones, Execution Time: 0.000011 seconds
Node: bert/encoder/Reshape, Execution Time: 0.000006 seconds
Node: bert/encoder/Cast, Execution Time: 0.000004 seconds
Node: bert/encoder/mul, Execution Time: 0.028335 seconds
Node: bert/encoder/layer_9/attention/self/ExpandDims, Execution Time: 0.000038 seconds
Node: bert/encoder/layer_9/attention/self/sub, Execution Time: 0.006912 seconds
Node: bert/encoder/layer_9/attention/self/mul_1, Execution Time: 0.000315 seconds
Node: bert/embeddings/Reshape_2, Execution Time: 0.000020 seconds
Node: bert/embeddings/Reshape, Execution Time: 0.000006 seconds
Node: bert/embeddings/GatherV2, Execution Time: 0.000304 seconds
Node: bert/embeddings/Reshape_1, Execution Time: 0.000008 seconds
Node: bert/embeddings/one_hot, Execution Time: 0.000057 seconds
Node: bert/embeddings/MatMul, Execution Time: 0.061418 seconds
Node: bert/embeddings/Reshape_3, Execution Time: 0.000040 seconds
Node: bert/embeddings/add, Execution Time: 0.002179 seconds
Node: bert/embeddings/add_1, Execution Time: 0.001005 seconds
Node: bert/embeddings/LayerNorm/moments/mean, Execution Time: 0.005431 seconds
Node: bert/embeddings/LayerNorm/moments/SquaredDifference, Execution Time: 0.000726 seconds
Node: bert/embeddings/LayerNorm/moments/SquaredDifference__72, Execution Time: 0.000863 seconds
Node: bert/embeddings/LayerNorm/moments/variance, Execution Time: 0.000273 seconds
Node: bert/embeddings/LayerNorm/batchnorm/add, Execution Time: 0.000082 seconds
Node: bert/embeddings/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.010061 seconds
Node: bert/embeddings/LayerNorm/batchnorm/Rsqrt__74, Execution Time: 0.005280 seconds
Node: bert/embeddings/LayerNorm/batchnorm/mul, Execution Time: 0.000104 seconds
Node: bert/embeddings/LayerNorm/batchnorm/mul_2, Execution Time: 0.000079 seconds
Node: bert/embeddings/LayerNorm/batchnorm/sub, Execution Time: 0.000088 seconds
Node: bert/embeddings/LayerNorm/batchnorm/mul_1, Execution Time: 0.000667 seconds
Node: bert/embeddings/LayerNorm/batchnorm/add_1, Execution Time: 0.000691 seconds
Node: bert/encoder/Reshape_1, Execution Time: 0.000034 seconds
Matmul Fuse Node: bert/encoder/layer_0/attention/self/value/MatMul, Execution Time: 0.035522 seconds
Node: bert/encoder/layer_0/attention/self/Reshape_2, Execution Time: 0.000023 seconds
Node: bert/encoder/layer_0/attention/self/transpose_2, Execution Time: 0.000437 seconds
Matmul Fuse Node: bert/encoder/layer_0/attention/self/query/MatMul, Execution Time: 0.003021 seconds
Node: bert/encoder/layer_0/attention/self/Reshape, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_0/attention/self/transpose, Execution Time: 0.000255 seconds
Matmul Fuse Node: bert/encoder/layer_0/attention/self/key/MatMul, Execution Time: 0.003390 seconds
Node: bert/encoder/layer_0/attention/self/Reshape_1, Execution Time: 0.000015 seconds
Node: bert/encoder/layer_0/attention/self/MatMul__306, Execution Time: 0.000262 seconds
Node: bert/encoder/layer_0/attention/self/MatMul, Execution Time: 0.005664 seconds
Node: bert/encoder/layer_0/attention/self/Mul, Execution Time: 0.001250 seconds
Node: bert/encoder/layer_0/attention/self/add, Execution Time: 0.001806 seconds
Node: bert/encoder/layer_0/attention/self/Softmax, Execution Time: 0.009676 seconds
Node: bert/encoder/layer_0/attention/self/MatMul_1, Execution Time: 0.001447 seconds
Node: bert/encoder/layer_0/attention/self/transpose_3, Execution Time: 0.000253 seconds
Node: bert/encoder/layer_0/attention/self/Reshape_3, Execution Time: 0.000055 seconds
Matmul Fuse Node: bert/encoder/layer_0/attention/output/dense/MatMul, Execution Time: 0.001492 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/moments/mean, Execution Time: 0.000218 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000390 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/moments/SquaredDifference__309, Execution Time: 0.000521 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/moments/variance, Execution Time: 0.000165 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000122 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/Rsqrt__311, Execution Time: 0.000131 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000120 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000568 seconds
Node: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000690 seconds
Matmul Fuse Node: bert/encoder/layer_0/intermediate/dense/MatMul, Execution Time: 0.006251 seconds
Node: bert/encoder/layer_0/intermediate/dense/Pow, Execution Time: 0.017836 seconds
Node: bert/encoder/layer_0/intermediate/dense/mul, Execution Time: 0.001201 seconds
Node: bert/encoder/layer_0/intermediate/dense/add, Execution Time: 0.001495 seconds
Node: bert/encoder/layer_0/intermediate/dense/mul_1, Execution Time: 0.001162 seconds
Node: bert/encoder/layer_0/intermediate/dense/Tanh, Execution Time: 0.003510 seconds
Node: bert/encoder/layer_0/intermediate/dense/add_1, Execution Time: 0.001209 seconds
Node: bert/encoder/layer_0/intermediate/dense/mul_2, Execution Time: 0.001196 seconds
Node: bert/encoder/layer_0/intermediate/dense/mul_3, Execution Time: 0.001443 seconds
Matmul Fuse Node: bert/encoder/layer_0/output/dense/MatMul, Execution Time: 0.002903 seconds
Node: bert/encoder/layer_0/output/LayerNorm/moments/mean, Execution Time: 0.000174 seconds
Node: bert/encoder/layer_0/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000250 seconds
Node: bert/encoder/layer_0/output/LayerNorm/moments/SquaredDifference__313, Execution Time: 0.000330 seconds
Node: bert/encoder/layer_0/output/LayerNorm/moments/variance, Execution Time: 0.000158 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/add, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/Rsqrt__315, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/mul, Execution Time: 0.000078 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000091 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/sub, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000340 seconds
Node: bert/encoder/layer_0/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000436 seconds
Matmul Fuse Node: bert/encoder/layer_1/attention/self/value/MatMul, Execution Time: 0.002877 seconds
Node: bert/encoder/layer_1/attention/self/Reshape_2, Execution Time: 0.000021 seconds
Node: bert/encoder/layer_1/attention/self/transpose_2, Execution Time: 0.000252 seconds
Matmul Fuse Node: bert/encoder/layer_1/attention/self/query/MatMul, Execution Time: 0.002599 seconds
Node: bert/encoder/layer_1/attention/self/Reshape, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_1/attention/self/transpose, Execution Time: 0.000255 seconds
Matmul Fuse Node: bert/encoder/layer_1/attention/self/key/MatMul, Execution Time: 0.002547 seconds
Node: bert/encoder/layer_1/attention/self/Reshape_1, Execution Time: 0.000012 seconds
Node: bert/encoder/layer_1/attention/self/MatMul__320, Execution Time: 0.000247 seconds
Node: bert/encoder/layer_1/attention/self/MatMul, Execution Time: 0.001364 seconds
Node: bert/encoder/layer_1/attention/self/Mul, Execution Time: 0.001176 seconds
Node: bert/encoder/layer_1/attention/self/add, Execution Time: 0.001914 seconds
Node: bert/encoder/layer_1/attention/self/Softmax, Execution Time: 0.001903 seconds
Node: bert/encoder/layer_1/attention/self/MatMul_1, Execution Time: 0.001078 seconds
Node: bert/encoder/layer_1/attention/self/transpose_3, Execution Time: 0.000241 seconds
Node: bert/encoder/layer_1/attention/self/Reshape_3, Execution Time: 0.000048 seconds
Matmul Fuse Node: bert/encoder/layer_1/attention/output/dense/MatMul, Execution Time: 0.001257 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/moments/mean, Execution Time: 0.000174 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000249 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/moments/SquaredDifference__323, Execution Time: 0.000343 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/moments/variance, Execution Time: 0.000155 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000060 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/Rsqrt__325, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000078 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000092 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000334 seconds
Node: bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000696 seconds
Matmul Fuse Node: bert/encoder/layer_1/intermediate/dense/MatMul, Execution Time: 0.004094 seconds
Node: bert/encoder/layer_1/intermediate/dense/Pow, Execution Time: 0.000729 seconds
Node: bert/encoder/layer_1/intermediate/dense/mul, Execution Time: 0.001168 seconds
Node: bert/encoder/layer_1/intermediate/dense/add, Execution Time: 0.001503 seconds
Node: bert/encoder/layer_1/intermediate/dense/mul_1, Execution Time: 0.001165 seconds
Node: bert/encoder/layer_1/intermediate/dense/Tanh, Execution Time: 0.001071 seconds
Node: bert/encoder/layer_1/intermediate/dense/add_1, Execution Time: 0.001120 seconds
Node: bert/encoder/layer_1/intermediate/dense/mul_2, Execution Time: 0.001135 seconds
Node: bert/encoder/layer_1/intermediate/dense/mul_3, Execution Time: 0.001441 seconds
Matmul Fuse Node: bert/encoder/layer_1/output/dense/MatMul, Execution Time: 0.002657 seconds
Node: bert/encoder/layer_1/output/LayerNorm/moments/mean, Execution Time: 0.000176 seconds
Node: bert/encoder/layer_1/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000243 seconds
Node: bert/encoder/layer_1/output/LayerNorm/moments/SquaredDifference__327, Execution Time: 0.000338 seconds
Node: bert/encoder/layer_1/output/LayerNorm/moments/variance, Execution Time: 0.000164 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/add, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000060 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/Rsqrt__329, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/mul, Execution Time: 0.000077 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000092 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/sub, Execution Time: 0.000083 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000336 seconds
Node: bert/encoder/layer_1/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000424 seconds
Matmul Fuse Node: bert/encoder/layer_2/attention/self/value/MatMul, Execution Time: 0.003581 seconds
Node: bert/encoder/layer_2/attention/self/Reshape_2, Execution Time: 0.000026 seconds
Node: bert/encoder/layer_2/attention/self/transpose_2, Execution Time: 0.000279 seconds
Matmul Fuse Node: bert/encoder/layer_2/attention/self/query/MatMul, Execution Time: 0.003191 seconds
Node: bert/encoder/layer_2/attention/self/Reshape, Execution Time: 0.000017 seconds
Node: bert/encoder/layer_2/attention/self/transpose, Execution Time: 0.000271 seconds
Matmul Fuse Node: bert/encoder/layer_2/attention/self/key/MatMul, Execution Time: 0.003162 seconds
Node: bert/encoder/layer_2/attention/self/Reshape_1, Execution Time: 0.000015 seconds
Node: bert/encoder/layer_2/attention/self/MatMul__334, Execution Time: 0.000279 seconds
Node: bert/encoder/layer_2/attention/self/MatMul, Execution Time: 0.001417 seconds
Node: bert/encoder/layer_2/attention/self/Mul, Execution Time: 0.001199 seconds
Node: bert/encoder/layer_2/attention/self/add, Execution Time: 0.001837 seconds
Node: bert/encoder/layer_2/attention/self/Softmax, Execution Time: 0.001921 seconds
Node: bert/encoder/layer_2/attention/self/MatMul_1, Execution Time: 0.001089 seconds
Node: bert/encoder/layer_2/attention/self/transpose_3, Execution Time: 0.000246 seconds
Node: bert/encoder/layer_2/attention/self/Reshape_3, Execution Time: 0.000047 seconds
Matmul Fuse Node: bert/encoder/layer_2/attention/output/dense/MatMul, Execution Time: 0.001225 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/moments/mean, Execution Time: 0.000175 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000243 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/moments/SquaredDifference__337, Execution Time: 0.000339 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/moments/variance, Execution Time: 0.000157 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/Rsqrt__339, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000096 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000080 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000582 seconds
Node: bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000878 seconds
Matmul Fuse Node: bert/encoder/layer_2/intermediate/dense/MatMul, Execution Time: 0.004214 seconds
Node: bert/encoder/layer_2/intermediate/dense/Pow, Execution Time: 0.000725 seconds
Node: bert/encoder/layer_2/intermediate/dense/mul, Execution Time: 0.001163 seconds
Node: bert/encoder/layer_2/intermediate/dense/add, Execution Time: 0.001479 seconds
Node: bert/encoder/layer_2/intermediate/dense/mul_1, Execution Time: 0.001115 seconds
Node: bert/encoder/layer_2/intermediate/dense/Tanh, Execution Time: 0.001115 seconds
Node: bert/encoder/layer_2/intermediate/dense/add_1, Execution Time: 0.001124 seconds
Node: bert/encoder/layer_2/intermediate/dense/mul_2, Execution Time: 0.001173 seconds
Node: bert/encoder/layer_2/intermediate/dense/mul_3, Execution Time: 0.001371 seconds
Matmul Fuse Node: bert/encoder/layer_2/output/dense/MatMul, Execution Time: 0.002838 seconds
Node: bert/encoder/layer_2/output/LayerNorm/moments/mean, Execution Time: 0.000176 seconds
Node: bert/encoder/layer_2/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000256 seconds
Node: bert/encoder/layer_2/output/LayerNorm/moments/SquaredDifference__341, Execution Time: 0.000376 seconds
Node: bert/encoder/layer_2/output/LayerNorm/moments/variance, Execution Time: 0.000160 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/add, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/Rsqrt__343, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/mul, Execution Time: 0.000089 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000085 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/sub, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000336 seconds
Node: bert/encoder/layer_2/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000438 seconds
Matmul Fuse Node: bert/encoder/layer_3/attention/self/value/MatMul, Execution Time: 0.002847 seconds
Node: bert/encoder/layer_3/attention/self/Reshape_2, Execution Time: 0.000021 seconds
Node: bert/encoder/layer_3/attention/self/transpose_2, Execution Time: 0.000258 seconds
Matmul Fuse Node: bert/encoder/layer_3/attention/self/query/MatMul, Execution Time: 0.002538 seconds
Node: bert/encoder/layer_3/attention/self/Reshape, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_3/attention/self/transpose, Execution Time: 0.000247 seconds
Matmul Fuse Node: bert/encoder/layer_3/attention/self/key/MatMul, Execution Time: 0.002475 seconds
Node: bert/encoder/layer_3/attention/self/Reshape_1, Execution Time: 0.000012 seconds
Node: bert/encoder/layer_3/attention/self/MatMul__348, Execution Time: 0.000247 seconds
Node: bert/encoder/layer_3/attention/self/MatMul, Execution Time: 0.001355 seconds
Node: bert/encoder/layer_3/attention/self/Mul, Execution Time: 0.001203 seconds
Node: bert/encoder/layer_3/attention/self/add, Execution Time: 0.001965 seconds
Node: bert/encoder/layer_3/attention/self/Softmax, Execution Time: 0.002022 seconds
Node: bert/encoder/layer_3/attention/self/MatMul_1, Execution Time: 0.001136 seconds
Node: bert/encoder/layer_3/attention/self/transpose_3, Execution Time: 0.000241 seconds
Node: bert/encoder/layer_3/attention/self/Reshape_3, Execution Time: 0.000049 seconds
Matmul Fuse Node: bert/encoder/layer_3/attention/output/dense/MatMul, Execution Time: 0.001128 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/moments/mean, Execution Time: 0.000180 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000386 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/moments/SquaredDifference__351, Execution Time: 0.000455 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/moments/variance, Execution Time: 0.000154 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000088 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/Rsqrt__353, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000570 seconds
Node: bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000667 seconds
Matmul Fuse Node: bert/encoder/layer_3/intermediate/dense/MatMul, Execution Time: 0.004129 seconds
Node: bert/encoder/layer_3/intermediate/dense/Pow, Execution Time: 0.000726 seconds
Node: bert/encoder/layer_3/intermediate/dense/mul, Execution Time: 0.001235 seconds
Node: bert/encoder/layer_3/intermediate/dense/add, Execution Time: 0.001406 seconds
Node: bert/encoder/layer_3/intermediate/dense/mul_1, Execution Time: 0.001123 seconds
Node: bert/encoder/layer_3/intermediate/dense/Tanh, Execution Time: 0.001110 seconds
Node: bert/encoder/layer_3/intermediate/dense/add_1, Execution Time: 0.001189 seconds
Node: bert/encoder/layer_3/intermediate/dense/mul_2, Execution Time: 0.001317 seconds
Node: bert/encoder/layer_3/intermediate/dense/mul_3, Execution Time: 0.001786 seconds
Matmul Fuse Node: bert/encoder/layer_3/output/dense/MatMul, Execution Time: 0.003806 seconds
Node: bert/encoder/layer_3/output/LayerNorm/moments/mean, Execution Time: 0.000238 seconds
Node: bert/encoder/layer_3/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000318 seconds
Node: bert/encoder/layer_3/output/LayerNorm/moments/SquaredDifference__355, Execution Time: 0.000446 seconds
Node: bert/encoder/layer_3/output/LayerNorm/moments/variance, Execution Time: 0.000205 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/add, Execution Time: 0.000099 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/Rsqrt__357, Execution Time: 0.000107 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/mul, Execution Time: 0.000096 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000088 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/sub, Execution Time: 0.000113 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000434 seconds
Node: bert/encoder/layer_3/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000534 seconds
Matmul Fuse Node: bert/encoder/layer_4/attention/self/value/MatMul, Execution Time: 0.002899 seconds
Node: bert/encoder/layer_4/attention/self/Reshape_2, Execution Time: 0.000021 seconds
Node: bert/encoder/layer_4/attention/self/transpose_2, Execution Time: 0.000262 seconds
Matmul Fuse Node: bert/encoder/layer_4/attention/self/query/MatMul, Execution Time: 0.002542 seconds
Node: bert/encoder/layer_4/attention/self/Reshape, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_4/attention/self/transpose, Execution Time: 0.000249 seconds
Matmul Fuse Node: bert/encoder/layer_4/attention/self/key/MatMul, Execution Time: 0.002438 seconds
Node: bert/encoder/layer_4/attention/self/Reshape_1, Execution Time: 0.000012 seconds
Node: bert/encoder/layer_4/attention/self/MatMul__362, Execution Time: 0.000249 seconds
Node: bert/encoder/layer_4/attention/self/MatMul, Execution Time: 0.001404 seconds
Node: bert/encoder/layer_4/attention/self/Mul, Execution Time: 0.001179 seconds
Node: bert/encoder/layer_4/attention/self/add, Execution Time: 0.001856 seconds
Node: bert/encoder/layer_4/attention/self/Softmax, Execution Time: 0.002107 seconds
Node: bert/encoder/layer_4/attention/self/MatMul_1, Execution Time: 0.001110 seconds
Node: bert/encoder/layer_4/attention/self/transpose_3, Execution Time: 0.000237 seconds
Node: bert/encoder/layer_4/attention/self/Reshape_3, Execution Time: 0.000047 seconds
Matmul Fuse Node: bert/encoder/layer_4/attention/output/dense/MatMul, Execution Time: 0.001309 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/moments/mean, Execution Time: 0.000175 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000243 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/moments/SquaredDifference__365, Execution Time: 0.000333 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/moments/variance, Execution Time: 0.000156 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000057 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/Rsqrt__367, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000078 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000587 seconds
Node: bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000684 seconds
Matmul Fuse Node: bert/encoder/layer_4/intermediate/dense/MatMul, Execution Time: 0.004243 seconds
Node: bert/encoder/layer_4/intermediate/dense/Pow, Execution Time: 0.000731 seconds
Node: bert/encoder/layer_4/intermediate/dense/mul, Execution Time: 0.001199 seconds
Node: bert/encoder/layer_4/intermediate/dense/add, Execution Time: 0.001486 seconds
Node: bert/encoder/layer_4/intermediate/dense/mul_1, Execution Time: 0.001201 seconds
Node: bert/encoder/layer_4/intermediate/dense/Tanh, Execution Time: 0.001122 seconds
Node: bert/encoder/layer_4/intermediate/dense/add_1, Execution Time: 0.001201 seconds
Node: bert/encoder/layer_4/intermediate/dense/mul_2, Execution Time: 0.001146 seconds
Node: bert/encoder/layer_4/intermediate/dense/mul_3, Execution Time: 0.001469 seconds
Matmul Fuse Node: bert/encoder/layer_4/output/dense/MatMul, Execution Time: 0.002865 seconds
Node: bert/encoder/layer_4/output/LayerNorm/moments/mean, Execution Time: 0.000179 seconds
Node: bert/encoder/layer_4/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000241 seconds
Node: bert/encoder/layer_4/output/LayerNorm/moments/SquaredDifference__369, Execution Time: 0.000339 seconds
Node: bert/encoder/layer_4/output/LayerNorm/moments/variance, Execution Time: 0.000156 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/add, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/Rsqrt__371, Execution Time: 0.000083 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/mul, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000092 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/sub, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000341 seconds
Node: bert/encoder/layer_4/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000435 seconds
Matmul Fuse Node: bert/encoder/layer_5/attention/self/value/MatMul, Execution Time: 0.003960 seconds
Node: bert/encoder/layer_5/attention/self/Reshape_2, Execution Time: 0.000022 seconds
Node: bert/encoder/layer_5/attention/self/transpose_2, Execution Time: 0.000257 seconds
Matmul Fuse Node: bert/encoder/layer_5/attention/self/query/MatMul, Execution Time: 0.002440 seconds
Node: bert/encoder/layer_5/attention/self/Reshape, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_5/attention/self/transpose, Execution Time: 0.000252 seconds
Matmul Fuse Node: bert/encoder/layer_5/attention/self/key/MatMul, Execution Time: 0.002446 seconds
Node: bert/encoder/layer_5/attention/self/Reshape_1, Execution Time: 0.000011 seconds
Node: bert/encoder/layer_5/attention/self/MatMul__376, Execution Time: 0.000251 seconds
Node: bert/encoder/layer_5/attention/self/MatMul, Execution Time: 0.001244 seconds
Node: bert/encoder/layer_5/attention/self/Mul, Execution Time: 0.001164 seconds
Node: bert/encoder/layer_5/attention/self/add, Execution Time: 0.001820 seconds
Node: bert/encoder/layer_5/attention/self/Softmax, Execution Time: 0.001953 seconds
Node: bert/encoder/layer_5/attention/self/MatMul_1, Execution Time: 0.001064 seconds
Node: bert/encoder/layer_5/attention/self/transpose_3, Execution Time: 0.000276 seconds
Node: bert/encoder/layer_5/attention/self/Reshape_3, Execution Time: 0.000045 seconds
Matmul Fuse Node: bert/encoder/layer_5/attention/output/dense/MatMul, Execution Time: 0.001237 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/moments/mean, Execution Time: 0.000190 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000249 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/moments/SquaredDifference__379, Execution Time: 0.000471 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/moments/variance, Execution Time: 0.000163 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/Rsqrt__381, Execution Time: 0.000080 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000098 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000080 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000573 seconds
Node: bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000682 seconds
Matmul Fuse Node: bert/encoder/layer_5/intermediate/dense/MatMul, Execution Time: 0.004223 seconds
Node: bert/encoder/layer_5/intermediate/dense/Pow, Execution Time: 0.000738 seconds
Node: bert/encoder/layer_5/intermediate/dense/mul, Execution Time: 0.001173 seconds
Node: bert/encoder/layer_5/intermediate/dense/add, Execution Time: 0.001558 seconds
Node: bert/encoder/layer_5/intermediate/dense/mul_1, Execution Time: 0.001120 seconds
Node: bert/encoder/layer_5/intermediate/dense/Tanh, Execution Time: 0.001077 seconds
Node: bert/encoder/layer_5/intermediate/dense/add_1, Execution Time: 0.001116 seconds
Node: bert/encoder/layer_5/intermediate/dense/mul_2, Execution Time: 0.001096 seconds
Node: bert/encoder/layer_5/intermediate/dense/mul_3, Execution Time: 0.001424 seconds
Matmul Fuse Node: bert/encoder/layer_5/output/dense/MatMul, Execution Time: 0.002838 seconds
Node: bert/encoder/layer_5/output/LayerNorm/moments/mean, Execution Time: 0.000184 seconds
Node: bert/encoder/layer_5/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000249 seconds
Node: bert/encoder/layer_5/output/LayerNorm/moments/SquaredDifference__383, Execution Time: 0.000327 seconds
Node: bert/encoder/layer_5/output/LayerNorm/moments/variance, Execution Time: 0.000155 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/add, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000060 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/Rsqrt__385, Execution Time: 0.000090 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/mul, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000075 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/sub, Execution Time: 0.000085 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000334 seconds
Node: bert/encoder/layer_5/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000436 seconds
Matmul Fuse Node: bert/encoder/layer_6/attention/self/value/MatMul, Execution Time: 0.002932 seconds
Node: bert/encoder/layer_6/attention/self/Reshape_2, Execution Time: 0.000021 seconds
Node: bert/encoder/layer_6/attention/self/transpose_2, Execution Time: 0.000256 seconds
Matmul Fuse Node: bert/encoder/layer_6/attention/self/query/MatMul, Execution Time: 0.002506 seconds
Node: bert/encoder/layer_6/attention/self/Reshape, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_6/attention/self/transpose, Execution Time: 0.000253 seconds
Matmul Fuse Node: bert/encoder/layer_6/attention/self/key/MatMul, Execution Time: 0.002383 seconds
Node: bert/encoder/layer_6/attention/self/Reshape_1, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_6/attention/self/MatMul__390, Execution Time: 0.000243 seconds
Node: bert/encoder/layer_6/attention/self/MatMul, Execution Time: 0.001354 seconds
Node: bert/encoder/layer_6/attention/self/Mul, Execution Time: 0.001219 seconds
Node: bert/encoder/layer_6/attention/self/add, Execution Time: 0.001827 seconds
Node: bert/encoder/layer_6/attention/self/Softmax, Execution Time: 0.002038 seconds
Node: bert/encoder/layer_6/attention/self/MatMul_1, Execution Time: 0.001150 seconds
Node: bert/encoder/layer_6/attention/self/transpose_3, Execution Time: 0.000246 seconds
Node: bert/encoder/layer_6/attention/self/Reshape_3, Execution Time: 0.000048 seconds
Matmul Fuse Node: bert/encoder/layer_6/attention/output/dense/MatMul, Execution Time: 0.001310 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/moments/mean, Execution Time: 0.000202 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000269 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/moments/SquaredDifference__393, Execution Time: 0.000342 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/moments/variance, Execution Time: 0.000169 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000093 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000066 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/Rsqrt__395, Execution Time: 0.000086 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000098 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000085 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000093 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000468 seconds
Node: bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000721 seconds
Matmul Fuse Node: bert/encoder/layer_6/intermediate/dense/MatMul, Execution Time: 0.004496 seconds
Node: bert/encoder/layer_6/intermediate/dense/Pow, Execution Time: 0.000743 seconds
Node: bert/encoder/layer_6/intermediate/dense/mul, Execution Time: 0.001149 seconds
Node: bert/encoder/layer_6/intermediate/dense/add, Execution Time: 0.001630 seconds
Node: bert/encoder/layer_6/intermediate/dense/mul_1, Execution Time: 0.001597 seconds
Node: bert/encoder/layer_6/intermediate/dense/Tanh, Execution Time: 0.001440 seconds
Node: bert/encoder/layer_6/intermediate/dense/add_1, Execution Time: 0.001472 seconds
Node: bert/encoder/layer_6/intermediate/dense/mul_2, Execution Time: 0.001459 seconds
Node: bert/encoder/layer_6/intermediate/dense/mul_3, Execution Time: 0.001819 seconds
Matmul Fuse Node: bert/encoder/layer_6/output/dense/MatMul, Execution Time: 0.002854 seconds
Node: bert/encoder/layer_6/output/LayerNorm/moments/mean, Execution Time: 0.000175 seconds
Node: bert/encoder/layer_6/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000244 seconds
Node: bert/encoder/layer_6/output/LayerNorm/moments/SquaredDifference__397, Execution Time: 0.000333 seconds
Node: bert/encoder/layer_6/output/LayerNorm/moments/variance, Execution Time: 0.000155 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/add, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000057 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/Rsqrt__399, Execution Time: 0.000099 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/mul, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000077 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/sub, Execution Time: 0.000078 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000335 seconds
Node: bert/encoder/layer_6/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000436 seconds
Matmul Fuse Node: bert/encoder/layer_7/attention/self/value/MatMul, Execution Time: 0.002808 seconds
Node: bert/encoder/layer_7/attention/self/Reshape_2, Execution Time: 0.000021 seconds
Node: bert/encoder/layer_7/attention/self/transpose_2, Execution Time: 0.000246 seconds
Matmul Fuse Node: bert/encoder/layer_7/attention/self/query/MatMul, Execution Time: 0.002673 seconds
Node: bert/encoder/layer_7/attention/self/Reshape, Execution Time: 0.000015 seconds
Node: bert/encoder/layer_7/attention/self/transpose, Execution Time: 0.000253 seconds
Matmul Fuse Node: bert/encoder/layer_7/attention/self/key/MatMul, Execution Time: 0.002464 seconds
Node: bert/encoder/layer_7/attention/self/Reshape_1, Execution Time: 0.000012 seconds
Node: bert/encoder/layer_7/attention/self/MatMul__404, Execution Time: 0.000244 seconds
Node: bert/encoder/layer_7/attention/self/MatMul, Execution Time: 0.001390 seconds
Node: bert/encoder/layer_7/attention/self/Mul, Execution Time: 0.001194 seconds
Node: bert/encoder/layer_7/attention/self/add, Execution Time: 0.001980 seconds
Node: bert/encoder/layer_7/attention/self/Softmax, Execution Time: 0.001891 seconds
Node: bert/encoder/layer_7/attention/self/MatMul_1, Execution Time: 0.001181 seconds
Node: bert/encoder/layer_7/attention/self/transpose_3, Execution Time: 0.000238 seconds
Node: bert/encoder/layer_7/attention/self/Reshape_3, Execution Time: 0.000046 seconds
Matmul Fuse Node: bert/encoder/layer_7/attention/output/dense/MatMul, Execution Time: 0.001251 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/moments/mean, Execution Time: 0.000179 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000251 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/moments/SquaredDifference__407, Execution Time: 0.000348 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/moments/variance, Execution Time: 0.000175 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000058 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/Rsqrt__409, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000098 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000087 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000552 seconds
Node: bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000713 seconds
Matmul Fuse Node: bert/encoder/layer_7/intermediate/dense/MatMul, Execution Time: 0.004113 seconds
Node: bert/encoder/layer_7/intermediate/dense/Pow, Execution Time: 0.000735 seconds
Node: bert/encoder/layer_7/intermediate/dense/mul, Execution Time: 0.001155 seconds
Node: bert/encoder/layer_7/intermediate/dense/add, Execution Time: 0.001512 seconds
Node: bert/encoder/layer_7/intermediate/dense/mul_1, Execution Time: 0.001128 seconds
Node: bert/encoder/layer_7/intermediate/dense/Tanh, Execution Time: 0.001093 seconds
Node: bert/encoder/layer_7/intermediate/dense/add_1, Execution Time: 0.001129 seconds
Node: bert/encoder/layer_7/intermediate/dense/mul_2, Execution Time: 0.001164 seconds
Node: bert/encoder/layer_7/intermediate/dense/mul_3, Execution Time: 0.001491 seconds
Matmul Fuse Node: bert/encoder/layer_7/output/dense/MatMul, Execution Time: 0.002885 seconds
Node: bert/encoder/layer_7/output/LayerNorm/moments/mean, Execution Time: 0.000183 seconds
Node: bert/encoder/layer_7/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000250 seconds
Node: bert/encoder/layer_7/output/LayerNorm/moments/SquaredDifference__411, Execution Time: 0.000327 seconds
Node: bert/encoder/layer_7/output/LayerNorm/moments/variance, Execution Time: 0.000154 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/add, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000058 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/Rsqrt__413, Execution Time: 0.000085 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/mul, Execution Time: 0.000085 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/sub, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000333 seconds
Node: bert/encoder/layer_7/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000437 seconds
Matmul Fuse Node: bert/encoder/layer_8/attention/self/value/MatMul, Execution Time: 0.002903 seconds
Node: bert/encoder/layer_8/attention/self/Reshape_2, Execution Time: 0.000022 seconds
Node: bert/encoder/layer_8/attention/self/transpose_2, Execution Time: 0.000252 seconds
Matmul Fuse Node: bert/encoder/layer_8/attention/self/query/MatMul, Execution Time: 0.002493 seconds
Node: bert/encoder/layer_8/attention/self/Reshape, Execution Time: 0.000014 seconds
Node: bert/encoder/layer_8/attention/self/transpose, Execution Time: 0.000250 seconds
Matmul Fuse Node: bert/encoder/layer_8/attention/self/key/MatMul, Execution Time: 0.002375 seconds
Node: bert/encoder/layer_8/attention/self/Reshape_1, Execution Time: 0.000012 seconds
Node: bert/encoder/layer_8/attention/self/MatMul__418, Execution Time: 0.000246 seconds
Node: bert/encoder/layer_8/attention/self/MatMul, Execution Time: 0.001337 seconds
Node: bert/encoder/layer_8/attention/self/Mul, Execution Time: 0.001168 seconds
Node: bert/encoder/layer_8/attention/self/add, Execution Time: 0.001858 seconds
Node: bert/encoder/layer_8/attention/self/Softmax, Execution Time: 0.002032 seconds
Node: bert/encoder/layer_8/attention/self/MatMul_1, Execution Time: 0.001093 seconds
Node: bert/encoder/layer_8/attention/self/transpose_3, Execution Time: 0.000242 seconds
Node: bert/encoder/layer_8/attention/self/Reshape_3, Execution Time: 0.000047 seconds
Matmul Fuse Node: bert/encoder/layer_8/attention/output/dense/MatMul, Execution Time: 0.001178 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/moments/mean, Execution Time: 0.000181 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000245 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/moments/SquaredDifference__421, Execution Time: 0.000330 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/moments/variance, Execution Time: 0.000159 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000083 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000058 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/Rsqrt__423, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000078 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000559 seconds
Node: bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000686 seconds
Matmul Fuse Node: bert/encoder/layer_8/intermediate/dense/MatMul, Execution Time: 0.004353 seconds
Node: bert/encoder/layer_8/intermediate/dense/Pow, Execution Time: 0.000730 seconds
Node: bert/encoder/layer_8/intermediate/dense/mul, Execution Time: 0.001183 seconds
Node: bert/encoder/layer_8/intermediate/dense/add, Execution Time: 0.001496 seconds
Node: bert/encoder/layer_8/intermediate/dense/mul_1, Execution Time: 0.001133 seconds
Node: bert/encoder/layer_8/intermediate/dense/Tanh, Execution Time: 0.001168 seconds
Node: bert/encoder/layer_8/intermediate/dense/add_1, Execution Time: 0.001154 seconds
Node: bert/encoder/layer_8/intermediate/dense/mul_2, Execution Time: 0.001138 seconds
Node: bert/encoder/layer_8/intermediate/dense/mul_3, Execution Time: 0.001387 seconds
Matmul Fuse Node: bert/encoder/layer_8/output/dense/MatMul, Execution Time: 0.002991 seconds
Node: bert/encoder/layer_8/output/LayerNorm/moments/mean, Execution Time: 0.000182 seconds
Node: bert/encoder/layer_8/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000258 seconds
Node: bert/encoder/layer_8/output/LayerNorm/moments/SquaredDifference__425, Execution Time: 0.000333 seconds
Node: bert/encoder/layer_8/output/LayerNorm/moments/variance, Execution Time: 0.000160 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/add, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/Rsqrt__427, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/mul, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/sub, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000342 seconds
Node: bert/encoder/layer_8/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000425 seconds
Matmul Fuse Node: bert/encoder/layer_9/attention/self/value/MatMul, Execution Time: 0.002894 seconds
Node: bert/encoder/layer_9/attention/self/Reshape_2, Execution Time: 0.000022 seconds
Node: bert/encoder/layer_9/attention/self/transpose_2, Execution Time: 0.000255 seconds
Matmul Fuse Node: bert/encoder/layer_9/attention/self/query/MatMul, Execution Time: 0.002460 seconds
Node: bert/encoder/layer_9/attention/self/Reshape, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_9/attention/self/transpose, Execution Time: 0.000249 seconds
Matmul Fuse Node: bert/encoder/layer_9/attention/self/key/MatMul, Execution Time: 0.002440 seconds
Node: bert/encoder/layer_9/attention/self/Reshape_1, Execution Time: 0.000012 seconds
Node: bert/encoder/layer_9/attention/self/MatMul__432, Execution Time: 0.000247 seconds
Node: bert/encoder/layer_9/attention/self/MatMul, Execution Time: 0.001341 seconds
Node: bert/encoder/layer_9/attention/self/Mul, Execution Time: 0.001163 seconds
Node: bert/encoder/layer_9/attention/self/add, Execution Time: 0.001809 seconds
Node: bert/encoder/layer_9/attention/self/Softmax, Execution Time: 0.001965 seconds
Node: bert/encoder/layer_9/attention/self/MatMul_1, Execution Time: 0.001070 seconds
Node: bert/encoder/layer_9/attention/self/transpose_3, Execution Time: 0.000237 seconds
Node: bert/encoder/layer_9/attention/self/Reshape_3, Execution Time: 0.000045 seconds
Matmul Fuse Node: bert/encoder/layer_9/attention/output/dense/MatMul, Execution Time: 0.001157 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/moments/mean, Execution Time: 0.000197 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000245 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/moments/SquaredDifference__435, Execution Time: 0.000330 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/moments/variance, Execution Time: 0.000159 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/Rsqrt__437, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000101 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000078 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000559 seconds
Node: bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000698 seconds
Matmul Fuse Node: bert/encoder/layer_9/intermediate/dense/MatMul, Execution Time: 0.004072 seconds
Node: bert/encoder/layer_9/intermediate/dense/Pow, Execution Time: 0.000727 seconds
Node: bert/encoder/layer_9/intermediate/dense/mul, Execution Time: 0.001199 seconds
Node: bert/encoder/layer_9/intermediate/dense/add, Execution Time: 0.001454 seconds
Node: bert/encoder/layer_9/intermediate/dense/mul_1, Execution Time: 0.001218 seconds
Node: bert/encoder/layer_9/intermediate/dense/Tanh, Execution Time: 0.001086 seconds
Node: bert/encoder/layer_9/intermediate/dense/add_1, Execution Time: 0.001137 seconds
Node: bert/encoder/layer_9/intermediate/dense/mul_2, Execution Time: 0.001107 seconds
Node: bert/encoder/layer_9/intermediate/dense/mul_3, Execution Time: 0.001379 seconds
Matmul Fuse Node: bert/encoder/layer_9/output/dense/MatMul, Execution Time: 0.002824 seconds
Node: bert/encoder/layer_9/output/LayerNorm/moments/mean, Execution Time: 0.000182 seconds
Node: bert/encoder/layer_9/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000251 seconds
Node: bert/encoder/layer_9/output/LayerNorm/moments/SquaredDifference__439, Execution Time: 0.000329 seconds
Node: bert/encoder/layer_9/output/LayerNorm/moments/variance, Execution Time: 0.000154 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/add, Execution Time: 0.000083 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/Rsqrt__441, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/mul, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000096 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/sub, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000331 seconds
Node: bert/encoder/layer_9/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000436 seconds
Matmul Fuse Node: bert/encoder/layer_10/attention/self/value/MatMul, Execution Time: 0.004091 seconds
Node: bert/encoder/layer_10/attention/self/Reshape_2, Execution Time: 0.000021 seconds
Node: bert/encoder/layer_10/attention/self/transpose_2, Execution Time: 0.000257 seconds
Matmul Fuse Node: bert/encoder/layer_10/attention/self/query/MatMul, Execution Time: 0.002522 seconds
Node: bert/encoder/layer_10/attention/self/Reshape, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_10/attention/self/transpose, Execution Time: 0.000260 seconds
Matmul Fuse Node: bert/encoder/layer_10/attention/self/key/MatMul, Execution Time: 0.002380 seconds
Node: bert/encoder/layer_10/attention/self/Reshape_1, Execution Time: 0.000012 seconds
Node: bert/encoder/layer_10/attention/self/MatMul__446, Execution Time: 0.000246 seconds
Node: bert/encoder/layer_10/attention/self/MatMul, Execution Time: 0.001328 seconds
Node: bert/encoder/layer_10/attention/self/Mul, Execution Time: 0.001195 seconds
Node: bert/encoder/layer_10/attention/self/add, Execution Time: 0.001870 seconds
Node: bert/encoder/layer_10/attention/self/Softmax, Execution Time: 0.001986 seconds
Node: bert/encoder/layer_10/attention/self/MatMul_1, Execution Time: 0.001072 seconds
Node: bert/encoder/layer_10/attention/self/transpose_3, Execution Time: 0.000238 seconds
Node: bert/encoder/layer_10/attention/self/Reshape_3, Execution Time: 0.000051 seconds
Matmul Fuse Node: bert/encoder/layer_10/attention/output/dense/MatMul, Execution Time: 0.001109 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/moments/mean, Execution Time: 0.000175 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000308 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/moments/SquaredDifference__449, Execution Time: 0.000433 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/moments/variance, Execution Time: 0.000211 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000099 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000074 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/Rsqrt__451, Execution Time: 0.000107 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000101 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000102 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000097 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000704 seconds
Node: bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000876 seconds
Matmul Fuse Node: bert/encoder/layer_10/intermediate/dense/MatMul, Execution Time: 0.005988 seconds
Node: bert/encoder/layer_10/intermediate/dense/Pow, Execution Time: 0.000705 seconds
Node: bert/encoder/layer_10/intermediate/dense/mul, Execution Time: 0.001194 seconds
Node: bert/encoder/layer_10/intermediate/dense/add, Execution Time: 0.001401 seconds
Node: bert/encoder/layer_10/intermediate/dense/mul_1, Execution Time: 0.001104 seconds
Node: bert/encoder/layer_10/intermediate/dense/Tanh, Execution Time: 0.001094 seconds
Node: bert/encoder/layer_10/intermediate/dense/add_1, Execution Time: 0.001137 seconds
Node: bert/encoder/layer_10/intermediate/dense/mul_2, Execution Time: 0.001155 seconds
Node: bert/encoder/layer_10/intermediate/dense/mul_3, Execution Time: 0.001385 seconds
Matmul Fuse Node: bert/encoder/layer_10/output/dense/MatMul, Execution Time: 0.002557 seconds
Node: bert/encoder/layer_10/output/LayerNorm/moments/mean, Execution Time: 0.000184 seconds
Node: bert/encoder/layer_10/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000247 seconds
Node: bert/encoder/layer_10/output/LayerNorm/moments/SquaredDifference__453, Execution Time: 0.000327 seconds
Node: bert/encoder/layer_10/output/LayerNorm/moments/variance, Execution Time: 0.000155 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/add, Execution Time: 0.000081 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000059 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/Rsqrt__455, Execution Time: 0.000105 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/mul, Execution Time: 0.000086 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000094 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/sub, Execution Time: 0.000080 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000334 seconds
Node: bert/encoder/layer_10/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000441 seconds
Matmul Fuse Node: bert/encoder/layer_11/attention/self/value/MatMul, Execution Time: 0.002807 seconds
Node: bert/encoder/layer_11/attention/self/Reshape_2, Execution Time: 0.000021 seconds
Node: bert/encoder/layer_11/attention/self/transpose_2, Execution Time: 0.000258 seconds
Matmul Fuse Node: bert/encoder/layer_11/attention/self/query/MatMul, Execution Time: 0.002520 seconds
Node: bert/encoder/layer_11/attention/self/Reshape, Execution Time: 0.000013 seconds
Node: bert/encoder/layer_11/attention/self/transpose, Execution Time: 0.000254 seconds
Matmul Fuse Node: bert/encoder/layer_11/attention/self/key/MatMul, Execution Time: 0.002469 seconds
Node: bert/encoder/layer_11/attention/self/Reshape_1, Execution Time: 0.000012 seconds
Node: bert/encoder/layer_11/attention/self/MatMul__460, Execution Time: 0.000247 seconds
Node: bert/encoder/layer_11/attention/self/MatMul, Execution Time: 0.001390 seconds
Node: bert/encoder/layer_11/attention/self/Mul, Execution Time: 0.001185 seconds
Node: bert/encoder/layer_11/attention/self/add, Execution Time: 0.001868 seconds
Node: bert/encoder/layer_11/attention/self/Softmax, Execution Time: 0.001983 seconds
Node: bert/encoder/layer_11/attention/self/MatMul_1, Execution Time: 0.001104 seconds
Node: bert/encoder/layer_11/attention/self/transpose_3, Execution Time: 0.000240 seconds
Node: bert/encoder/layer_11/attention/self/Reshape_3, Execution Time: 0.000048 seconds
Matmul Fuse Node: bert/encoder/layer_11/attention/output/dense/MatMul, Execution Time: 0.001165 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/moments/mean, Execution Time: 0.000172 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000242 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/moments/SquaredDifference__463, Execution Time: 0.000333 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/moments/variance, Execution Time: 0.000166 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/add, Execution Time: 0.000079 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000058 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/Rsqrt__465, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul, Execution Time: 0.000076 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000074 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/sub, Execution Time: 0.000076 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000562 seconds
Node: bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000669 seconds
Matmul Fuse Node: bert/encoder/layer_11/intermediate/dense/MatMul, Execution Time: 0.004179 seconds
Node: bert/encoder/layer_11/intermediate/dense/Pow, Execution Time: 0.000729 seconds
Node: bert/encoder/layer_11/intermediate/dense/mul, Execution Time: 0.001209 seconds
Node: bert/encoder/layer_11/intermediate/dense/add, Execution Time: 0.001439 seconds
Node: bert/encoder/layer_11/intermediate/dense/mul_1, Execution Time: 0.001104 seconds
Node: bert/encoder/layer_11/intermediate/dense/Tanh, Execution Time: 0.001094 seconds
Node: bert/encoder/layer_11/intermediate/dense/add_1, Execution Time: 0.001142 seconds
Node: bert/encoder/layer_11/intermediate/dense/mul_2, Execution Time: 0.001143 seconds
Node: bert/encoder/layer_11/intermediate/dense/mul_3, Execution Time: 0.001404 seconds
Matmul Fuse Node: bert/encoder/layer_11/output/dense/MatMul, Execution Time: 0.002955 seconds
Node: bert/encoder/layer_11/output/LayerNorm/moments/mean, Execution Time: 0.000190 seconds
Node: bert/encoder/layer_11/output/LayerNorm/moments/SquaredDifference, Execution Time: 0.000246 seconds
Node: bert/encoder/layer_11/output/LayerNorm/moments/SquaredDifference__467, Execution Time: 0.000330 seconds
Node: bert/encoder/layer_11/output/LayerNorm/moments/variance, Execution Time: 0.000155 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/add, Execution Time: 0.000082 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/Rsqrt, Execution Time: 0.000058 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/Rsqrt__469, Execution Time: 0.000086 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/mul, Execution Time: 0.000084 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/mul_2, Execution Time: 0.000095 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/sub, Execution Time: 0.000080 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/mul_1, Execution Time: 0.000335 seconds
Node: bert/encoder/layer_11/output/LayerNorm/batchnorm/add_1, Execution Time: 0.000436 seconds
Matmul Fuse Node: MatMul, Execution Time: 0.004485 seconds
Node: Reshape_1, Execution Time: 0.000021 seconds
Node: transpose, Execution Time: 0.000088 seconds
Node: unstack, Execution Time: 0.000050 seconds
Node: unstack__490, Execution Time: 0.000005 seconds
Node: unstack__488, Execution Time: 0.000006 seconds

Total Execution Time: 0.676711 seconds

Total Matmul Fuse Execution Time: 0.241226 seconds
Execution complete.

Total execution time: 0.680221 seconds
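
The two totals differ slightly (0.676711 s summed over the timed nodes vs. 0.680221 s wall clock), which is what you would expect when per-node durations are accumulated separately from an overall run timer that also includes dispatch overhead. A rough sketch of how such bookkeeping could be kept, with the fused-MatMul subtotal tracked on the side; the names (run_with_timing, execute_node, is_fused_matmul) are hypothetical stand-ins, not the runner's actual API:

import time

def run_with_timing(nodes, execute_node, is_fused_matmul):
    # Hypothetical: time each node and keep a separate subtotal for fused MatMul nodes.
    per_node, node_total, fused_total = {}, 0.0, 0.0
    wall_start = time.perf_counter()
    for node in nodes:
        t0 = time.perf_counter()
        execute_node(node)
        dt = time.perf_counter() - t0
        per_node[node] = dt
        node_total += dt
        if is_fused_matmul(node):
            fused_total += dt
    wall_total = time.perf_counter() - wall_start
    return per_node, node_total, fused_total, wall_total
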
Model outputs: {'unstack:1': array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
       nan, nan, nan, nan, nan, nan, nan, nan, nan], dtype=float16), 'unstack:0': array(None, dtype=object), 'unique_ids:0': array([], dtype=int64)}
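
Note that 'unstack:1' comes back entirely NaN in float16 and 'unstack:0' is a None object array, so the run produced no usable logits despite completing. With half-precision intermediates, non-finite values commonly originate from overflow earlier in the graph (for example exp over large fp16 attention scores, or the large negative mask bias added before Softmax), so a first step is to flag which outputs are non-finite before tracing upstream. A small diagnostic sketch, assuming the outputs dictionary printed above (name -> numpy array):

import numpy as np

def report_non_finite(outputs):
    # Flag outputs that are non-numeric or contain NaN/Inf.
    for name, value in outputs.items():
        arr = np.asarray(value)
        if arr.dtype == object or not np.issubdtype(arr.dtype, np.number):
            print(f"{name}: non-numeric output (dtype={arr.dtype})")
            continue
        finite = np.isfinite(arr.astype(np.float64))
        if finite.all():
            print(f"{name}: ok ({arr.size} values, dtype={arr.dtype})")
        else:
            bad = arr.size - int(finite.sum())
            print(f"{name}: {bad}/{arr.size} non-finite values (dtype={arr.dtype})")
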
Execution order: ['unique_ids_graph_outputs_Identity__10', 'bert/encoder/Shape', 'bert/encoder/Shape__12', 'bert/encoder/strided_slice', 'bert/encoder/strided_slice__16', 'bert/encoder/strided_slice__17', 'bert/encoder/ones/packed_Unsqueeze__18', 'bert/encoder/ones/packed_Concat__21', 'bert/encoder/ones__22', 'bert/encoder/ones', 'bert/encoder/Reshape', 'bert/encoder/Cast', 'bert/encoder/mul', 'bert/encoder/layer_9/attention/self/ExpandDims', 'bert/encoder/layer_9/attention/self/sub', 'bert/encoder/layer_9/attention/self/mul_1', 'bert/embeddings/Reshape_2', 'bert/embeddings/Reshape', 'bert/embeddings/GatherV2', 'bert/embeddings/Reshape_1', 'bert/embeddings/one_hot', 'bert/embeddings/MatMul', 'bert/embeddings/Reshape_3', 'bert/embeddings/add', 'bert/embeddings/add_1', 'bert/embeddings/LayerNorm/moments/mean', 'bert/embeddings/LayerNorm/moments/SquaredDifference', 'bert/embeddings/LayerNorm/moments/SquaredDifference__72', 'bert/embeddings/LayerNorm/moments/variance', 'bert/embeddings/LayerNorm/batchnorm/add', 'bert/embeddings/LayerNorm/batchnorm/Rsqrt', 'bert/embeddings/LayerNorm/batchnorm/Rsqrt__74', 'bert/embeddings/LayerNorm/batchnorm/mul', 'bert/embeddings/LayerNorm/batchnorm/mul_2', 'bert/embeddings/LayerNorm/batchnorm/sub', 'bert/embeddings/LayerNorm/batchnorm/mul_1', 'bert/embeddings/LayerNorm/batchnorm/add_1', 'bert/encoder/Reshape_1', 'bert/encoder/layer_0/attention/self/value/MatMul', 'bert/encoder/layer_0/attention/self/value/BiasAdd', 'bert/encoder/layer_0/attention/self/Reshape_2', 'bert/encoder/layer_0/attention/self/transpose_2', 'bert/encoder/layer_0/attention/self/query/MatMul', 'bert/encoder/layer_0/attention/self/query/BiasAdd', 'bert/encoder/layer_0/attention/self/Reshape', 'bert/encoder/layer_0/attention/self/transpose', 'bert/encoder/layer_0/attention/self/key/MatMul', 'bert/encoder/layer_0/attention/self/key/BiasAdd', 'bert/encoder/layer_0/attention/self/Reshape_1', 'bert/encoder/layer_0/attention/self/MatMul__306', 'bert/encoder/layer_0/attention/self/MatMul', 'bert/encoder/layer_0/attention/self/Mul', 'bert/encoder/layer_0/attention/self/add', 'bert/encoder/layer_0/attention/self/Softmax', 'bert/encoder/layer_0/attention/self/MatMul_1', 'bert/encoder/layer_0/attention/self/transpose_3', 'bert/encoder/layer_0/attention/self/Reshape_3', 'bert/encoder/layer_0/attention/output/dense/MatMul', 'bert/encoder/layer_0/attention/output/dense/BiasAdd', 'bert/encoder/layer_0/attention/output/add', 'bert/encoder/layer_0/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_0/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_0/attention/output/LayerNorm/moments/SquaredDifference__309', 'bert/encoder/layer_0/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/Rsqrt__311', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_0/intermediate/dense/MatMul', 'bert/encoder/layer_0/intermediate/dense/BiasAdd', 'bert/encoder/layer_0/intermediate/dense/Pow', 'bert/encoder/layer_0/intermediate/dense/mul', 'bert/encoder/layer_0/intermediate/dense/add', 
'bert/encoder/layer_0/intermediate/dense/mul_1', 'bert/encoder/layer_0/intermediate/dense/Tanh', 'bert/encoder/layer_0/intermediate/dense/add_1', 'bert/encoder/layer_0/intermediate/dense/mul_2', 'bert/encoder/layer_0/intermediate/dense/mul_3', 'bert/encoder/layer_0/output/dense/MatMul', 'bert/encoder/layer_0/output/dense/BiasAdd', 'bert/encoder/layer_0/output/add', 'bert/encoder/layer_0/output/LayerNorm/moments/mean', 'bert/encoder/layer_0/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_0/output/LayerNorm/moments/SquaredDifference__313', 'bert/encoder/layer_0/output/LayerNorm/moments/variance', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/Rsqrt__315', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_0/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_1/attention/self/value/MatMul', 'bert/encoder/layer_1/attention/self/value/BiasAdd', 'bert/encoder/layer_1/attention/self/Reshape_2', 'bert/encoder/layer_1/attention/self/transpose_2', 'bert/encoder/layer_1/attention/self/query/MatMul', 'bert/encoder/layer_1/attention/self/query/BiasAdd', 'bert/encoder/layer_1/attention/self/Reshape', 'bert/encoder/layer_1/attention/self/transpose', 'bert/encoder/layer_1/attention/self/key/MatMul', 'bert/encoder/layer_1/attention/self/key/BiasAdd', 'bert/encoder/layer_1/attention/self/Reshape_1', 'bert/encoder/layer_1/attention/self/MatMul__320', 'bert/encoder/layer_1/attention/self/MatMul', 'bert/encoder/layer_1/attention/self/Mul', 'bert/encoder/layer_1/attention/self/add', 'bert/encoder/layer_1/attention/self/Softmax', 'bert/encoder/layer_1/attention/self/MatMul_1', 'bert/encoder/layer_1/attention/self/transpose_3', 'bert/encoder/layer_1/attention/self/Reshape_3', 'bert/encoder/layer_1/attention/output/dense/MatMul', 'bert/encoder/layer_1/attention/output/dense/BiasAdd', 'bert/encoder/layer_1/attention/output/add', 'bert/encoder/layer_1/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_1/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_1/attention/output/LayerNorm/moments/SquaredDifference__323', 'bert/encoder/layer_1/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/Rsqrt__325', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_1/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_1/intermediate/dense/MatMul', 'bert/encoder/layer_1/intermediate/dense/BiasAdd', 'bert/encoder/layer_1/intermediate/dense/Pow', 'bert/encoder/layer_1/intermediate/dense/mul', 'bert/encoder/layer_1/intermediate/dense/add', 'bert/encoder/layer_1/intermediate/dense/mul_1', 'bert/encoder/layer_1/intermediate/dense/Tanh', 'bert/encoder/layer_1/intermediate/dense/add_1', 'bert/encoder/layer_1/intermediate/dense/mul_2', 'bert/encoder/layer_1/intermediate/dense/mul_3', 'bert/encoder/layer_1/output/dense/MatMul', 
'bert/encoder/layer_1/output/dense/BiasAdd', 'bert/encoder/layer_1/output/add', 'bert/encoder/layer_1/output/LayerNorm/moments/mean', 'bert/encoder/layer_1/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_1/output/LayerNorm/moments/SquaredDifference__327', 'bert/encoder/layer_1/output/LayerNorm/moments/variance', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/Rsqrt__329', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_1/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_2/attention/self/value/MatMul', 'bert/encoder/layer_2/attention/self/value/BiasAdd', 'bert/encoder/layer_2/attention/self/Reshape_2', 'bert/encoder/layer_2/attention/self/transpose_2', 'bert/encoder/layer_2/attention/self/query/MatMul', 'bert/encoder/layer_2/attention/self/query/BiasAdd', 'bert/encoder/layer_2/attention/self/Reshape', 'bert/encoder/layer_2/attention/self/transpose', 'bert/encoder/layer_2/attention/self/key/MatMul', 'bert/encoder/layer_2/attention/self/key/BiasAdd', 'bert/encoder/layer_2/attention/self/Reshape_1', 'bert/encoder/layer_2/attention/self/MatMul__334', 'bert/encoder/layer_2/attention/self/MatMul', 'bert/encoder/layer_2/attention/self/Mul', 'bert/encoder/layer_2/attention/self/add', 'bert/encoder/layer_2/attention/self/Softmax', 'bert/encoder/layer_2/attention/self/MatMul_1', 'bert/encoder/layer_2/attention/self/transpose_3', 'bert/encoder/layer_2/attention/self/Reshape_3', 'bert/encoder/layer_2/attention/output/dense/MatMul', 'bert/encoder/layer_2/attention/output/dense/BiasAdd', 'bert/encoder/layer_2/attention/output/add', 'bert/encoder/layer_2/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_2/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_2/attention/output/LayerNorm/moments/SquaredDifference__337', 'bert/encoder/layer_2/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/Rsqrt__339', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_2/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_2/intermediate/dense/MatMul', 'bert/encoder/layer_2/intermediate/dense/BiasAdd', 'bert/encoder/layer_2/intermediate/dense/Pow', 'bert/encoder/layer_2/intermediate/dense/mul', 'bert/encoder/layer_2/intermediate/dense/add', 'bert/encoder/layer_2/intermediate/dense/mul_1', 'bert/encoder/layer_2/intermediate/dense/Tanh', 'bert/encoder/layer_2/intermediate/dense/add_1', 'bert/encoder/layer_2/intermediate/dense/mul_2', 'bert/encoder/layer_2/intermediate/dense/mul_3', 'bert/encoder/layer_2/output/dense/MatMul', 'bert/encoder/layer_2/output/dense/BiasAdd', 'bert/encoder/layer_2/output/add', 'bert/encoder/layer_2/output/LayerNorm/moments/mean', 'bert/encoder/layer_2/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_2/output/LayerNorm/moments/SquaredDifference__341', 
'bert/encoder/layer_2/output/LayerNorm/moments/variance', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/Rsqrt__343', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_2/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_3/attention/self/value/MatMul', 'bert/encoder/layer_3/attention/self/value/BiasAdd', 'bert/encoder/layer_3/attention/self/Reshape_2', 'bert/encoder/layer_3/attention/self/transpose_2', 'bert/encoder/layer_3/attention/self/query/MatMul', 'bert/encoder/layer_3/attention/self/query/BiasAdd', 'bert/encoder/layer_3/attention/self/Reshape', 'bert/encoder/layer_3/attention/self/transpose', 'bert/encoder/layer_3/attention/self/key/MatMul', 'bert/encoder/layer_3/attention/self/key/BiasAdd', 'bert/encoder/layer_3/attention/self/Reshape_1', 'bert/encoder/layer_3/attention/self/MatMul__348', 'bert/encoder/layer_3/attention/self/MatMul', 'bert/encoder/layer_3/attention/self/Mul', 'bert/encoder/layer_3/attention/self/add', 'bert/encoder/layer_3/attention/self/Softmax', 'bert/encoder/layer_3/attention/self/MatMul_1', 'bert/encoder/layer_3/attention/self/transpose_3', 'bert/encoder/layer_3/attention/self/Reshape_3', 'bert/encoder/layer_3/attention/output/dense/MatMul', 'bert/encoder/layer_3/attention/output/dense/BiasAdd', 'bert/encoder/layer_3/attention/output/add', 'bert/encoder/layer_3/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_3/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_3/attention/output/LayerNorm/moments/SquaredDifference__351', 'bert/encoder/layer_3/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/Rsqrt__353', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_3/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_3/intermediate/dense/MatMul', 'bert/encoder/layer_3/intermediate/dense/BiasAdd', 'bert/encoder/layer_3/intermediate/dense/Pow', 'bert/encoder/layer_3/intermediate/dense/mul', 'bert/encoder/layer_3/intermediate/dense/add', 'bert/encoder/layer_3/intermediate/dense/mul_1', 'bert/encoder/layer_3/intermediate/dense/Tanh', 'bert/encoder/layer_3/intermediate/dense/add_1', 'bert/encoder/layer_3/intermediate/dense/mul_2', 'bert/encoder/layer_3/intermediate/dense/mul_3', 'bert/encoder/layer_3/output/dense/MatMul', 'bert/encoder/layer_3/output/dense/BiasAdd', 'bert/encoder/layer_3/output/add', 'bert/encoder/layer_3/output/LayerNorm/moments/mean', 'bert/encoder/layer_3/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_3/output/LayerNorm/moments/SquaredDifference__355', 'bert/encoder/layer_3/output/LayerNorm/moments/variance', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/Rsqrt__357', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/mul', 
'bert/encoder/layer_3/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_3/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_4/attention/self/value/MatMul', 'bert/encoder/layer_4/attention/self/value/BiasAdd', 'bert/encoder/layer_4/attention/self/Reshape_2', 'bert/encoder/layer_4/attention/self/transpose_2', 'bert/encoder/layer_4/attention/self/query/MatMul', 'bert/encoder/layer_4/attention/self/query/BiasAdd', 'bert/encoder/layer_4/attention/self/Reshape', 'bert/encoder/layer_4/attention/self/transpose', 'bert/encoder/layer_4/attention/self/key/MatMul', 'bert/encoder/layer_4/attention/self/key/BiasAdd', 'bert/encoder/layer_4/attention/self/Reshape_1', 'bert/encoder/layer_4/attention/self/MatMul__362', 'bert/encoder/layer_4/attention/self/MatMul', 'bert/encoder/layer_4/attention/self/Mul', 'bert/encoder/layer_4/attention/self/add', 'bert/encoder/layer_4/attention/self/Softmax', 'bert/encoder/layer_4/attention/self/MatMul_1', 'bert/encoder/layer_4/attention/self/transpose_3', 'bert/encoder/layer_4/attention/self/Reshape_3', 'bert/encoder/layer_4/attention/output/dense/MatMul', 'bert/encoder/layer_4/attention/output/dense/BiasAdd', 'bert/encoder/layer_4/attention/output/add', 'bert/encoder/layer_4/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_4/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_4/attention/output/LayerNorm/moments/SquaredDifference__365', 'bert/encoder/layer_4/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/Rsqrt__367', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_4/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_4/intermediate/dense/MatMul', 'bert/encoder/layer_4/intermediate/dense/BiasAdd', 'bert/encoder/layer_4/intermediate/dense/Pow', 'bert/encoder/layer_4/intermediate/dense/mul', 'bert/encoder/layer_4/intermediate/dense/add', 'bert/encoder/layer_4/intermediate/dense/mul_1', 'bert/encoder/layer_4/intermediate/dense/Tanh', 'bert/encoder/layer_4/intermediate/dense/add_1', 'bert/encoder/layer_4/intermediate/dense/mul_2', 'bert/encoder/layer_4/intermediate/dense/mul_3', 'bert/encoder/layer_4/output/dense/MatMul', 'bert/encoder/layer_4/output/dense/BiasAdd', 'bert/encoder/layer_4/output/add', 'bert/encoder/layer_4/output/LayerNorm/moments/mean', 'bert/encoder/layer_4/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_4/output/LayerNorm/moments/SquaredDifference__369', 'bert/encoder/layer_4/output/LayerNorm/moments/variance', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/Rsqrt__371', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_4/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_5/attention/self/value/MatMul', 
'bert/encoder/layer_5/attention/self/value/BiasAdd', 'bert/encoder/layer_5/attention/self/Reshape_2', 'bert/encoder/layer_5/attention/self/transpose_2', 'bert/encoder/layer_5/attention/self/query/MatMul', 'bert/encoder/layer_5/attention/self/query/BiasAdd', 'bert/encoder/layer_5/attention/self/Reshape', 'bert/encoder/layer_5/attention/self/transpose', 'bert/encoder/layer_5/attention/self/key/MatMul', 'bert/encoder/layer_5/attention/self/key/BiasAdd', 'bert/encoder/layer_5/attention/self/Reshape_1', 'bert/encoder/layer_5/attention/self/MatMul__376', 'bert/encoder/layer_5/attention/self/MatMul', 'bert/encoder/layer_5/attention/self/Mul', 'bert/encoder/layer_5/attention/self/add', 'bert/encoder/layer_5/attention/self/Softmax', 'bert/encoder/layer_5/attention/self/MatMul_1', 'bert/encoder/layer_5/attention/self/transpose_3', 'bert/encoder/layer_5/attention/self/Reshape_3', 'bert/encoder/layer_5/attention/output/dense/MatMul', 'bert/encoder/layer_5/attention/output/dense/BiasAdd', 'bert/encoder/layer_5/attention/output/add', 'bert/encoder/layer_5/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_5/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_5/attention/output/LayerNorm/moments/SquaredDifference__379', 'bert/encoder/layer_5/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/Rsqrt__381', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_5/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_5/intermediate/dense/MatMul', 'bert/encoder/layer_5/intermediate/dense/BiasAdd', 'bert/encoder/layer_5/intermediate/dense/Pow', 'bert/encoder/layer_5/intermediate/dense/mul', 'bert/encoder/layer_5/intermediate/dense/add', 'bert/encoder/layer_5/intermediate/dense/mul_1', 'bert/encoder/layer_5/intermediate/dense/Tanh', 'bert/encoder/layer_5/intermediate/dense/add_1', 'bert/encoder/layer_5/intermediate/dense/mul_2', 'bert/encoder/layer_5/intermediate/dense/mul_3', 'bert/encoder/layer_5/output/dense/MatMul', 'bert/encoder/layer_5/output/dense/BiasAdd', 'bert/encoder/layer_5/output/add', 'bert/encoder/layer_5/output/LayerNorm/moments/mean', 'bert/encoder/layer_5/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_5/output/LayerNorm/moments/SquaredDifference__383', 'bert/encoder/layer_5/output/LayerNorm/moments/variance', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/Rsqrt__385', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_5/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_6/attention/self/value/MatMul', 'bert/encoder/layer_6/attention/self/value/BiasAdd', 'bert/encoder/layer_6/attention/self/Reshape_2', 'bert/encoder/layer_6/attention/self/transpose_2', 'bert/encoder/layer_6/attention/self/query/MatMul', 'bert/encoder/layer_6/attention/self/query/BiasAdd', 'bert/encoder/layer_6/attention/self/Reshape', 
'bert/encoder/layer_6/attention/self/transpose', 'bert/encoder/layer_6/attention/self/key/MatMul', 'bert/encoder/layer_6/attention/self/key/BiasAdd', 'bert/encoder/layer_6/attention/self/Reshape_1', 'bert/encoder/layer_6/attention/self/MatMul__390', 'bert/encoder/layer_6/attention/self/MatMul', 'bert/encoder/layer_6/attention/self/Mul', 'bert/encoder/layer_6/attention/self/add', 'bert/encoder/layer_6/attention/self/Softmax', 'bert/encoder/layer_6/attention/self/MatMul_1', 'bert/encoder/layer_6/attention/self/transpose_3', 'bert/encoder/layer_6/attention/self/Reshape_3', 'bert/encoder/layer_6/attention/output/dense/MatMul', 'bert/encoder/layer_6/attention/output/dense/BiasAdd', 'bert/encoder/layer_6/attention/output/add', 'bert/encoder/layer_6/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_6/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_6/attention/output/LayerNorm/moments/SquaredDifference__393', 'bert/encoder/layer_6/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/Rsqrt__395', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_6/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_6/intermediate/dense/MatMul', 'bert/encoder/layer_6/intermediate/dense/BiasAdd', 'bert/encoder/layer_6/intermediate/dense/Pow', 'bert/encoder/layer_6/intermediate/dense/mul', 'bert/encoder/layer_6/intermediate/dense/add', 'bert/encoder/layer_6/intermediate/dense/mul_1', 'bert/encoder/layer_6/intermediate/dense/Tanh', 'bert/encoder/layer_6/intermediate/dense/add_1', 'bert/encoder/layer_6/intermediate/dense/mul_2', 'bert/encoder/layer_6/intermediate/dense/mul_3', 'bert/encoder/layer_6/output/dense/MatMul', 'bert/encoder/layer_6/output/dense/BiasAdd', 'bert/encoder/layer_6/output/add', 'bert/encoder/layer_6/output/LayerNorm/moments/mean', 'bert/encoder/layer_6/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_6/output/LayerNorm/moments/SquaredDifference__397', 'bert/encoder/layer_6/output/LayerNorm/moments/variance', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/Rsqrt__399', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_6/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_7/attention/self/value/MatMul', 'bert/encoder/layer_7/attention/self/value/BiasAdd', 'bert/encoder/layer_7/attention/self/Reshape_2', 'bert/encoder/layer_7/attention/self/transpose_2', 'bert/encoder/layer_7/attention/self/query/MatMul', 'bert/encoder/layer_7/attention/self/query/BiasAdd', 'bert/encoder/layer_7/attention/self/Reshape', 'bert/encoder/layer_7/attention/self/transpose', 'bert/encoder/layer_7/attention/self/key/MatMul', 'bert/encoder/layer_7/attention/self/key/BiasAdd', 'bert/encoder/layer_7/attention/self/Reshape_1', 'bert/encoder/layer_7/attention/self/MatMul__404', 'bert/encoder/layer_7/attention/self/MatMul', 
'bert/encoder/layer_7/attention/self/Mul', 'bert/encoder/layer_7/attention/self/add', 'bert/encoder/layer_7/attention/self/Softmax', 'bert/encoder/layer_7/attention/self/MatMul_1', 'bert/encoder/layer_7/attention/self/transpose_3', 'bert/encoder/layer_7/attention/self/Reshape_3', 'bert/encoder/layer_7/attention/output/dense/MatMul', 'bert/encoder/layer_7/attention/output/dense/BiasAdd', 'bert/encoder/layer_7/attention/output/add', 'bert/encoder/layer_7/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_7/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_7/attention/output/LayerNorm/moments/SquaredDifference__407', 'bert/encoder/layer_7/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/Rsqrt__409', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_7/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_7/intermediate/dense/MatMul', 'bert/encoder/layer_7/intermediate/dense/BiasAdd', 'bert/encoder/layer_7/intermediate/dense/Pow', 'bert/encoder/layer_7/intermediate/dense/mul', 'bert/encoder/layer_7/intermediate/dense/add', 'bert/encoder/layer_7/intermediate/dense/mul_1', 'bert/encoder/layer_7/intermediate/dense/Tanh', 'bert/encoder/layer_7/intermediate/dense/add_1', 'bert/encoder/layer_7/intermediate/dense/mul_2', 'bert/encoder/layer_7/intermediate/dense/mul_3', 'bert/encoder/layer_7/output/dense/MatMul', 'bert/encoder/layer_7/output/dense/BiasAdd', 'bert/encoder/layer_7/output/add', 'bert/encoder/layer_7/output/LayerNorm/moments/mean', 'bert/encoder/layer_7/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_7/output/LayerNorm/moments/SquaredDifference__411', 'bert/encoder/layer_7/output/LayerNorm/moments/variance', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/Rsqrt__413', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_7/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_8/attention/self/value/MatMul', 'bert/encoder/layer_8/attention/self/value/BiasAdd', 'bert/encoder/layer_8/attention/self/Reshape_2', 'bert/encoder/layer_8/attention/self/transpose_2', 'bert/encoder/layer_8/attention/self/query/MatMul', 'bert/encoder/layer_8/attention/self/query/BiasAdd', 'bert/encoder/layer_8/attention/self/Reshape', 'bert/encoder/layer_8/attention/self/transpose', 'bert/encoder/layer_8/attention/self/key/MatMul', 'bert/encoder/layer_8/attention/self/key/BiasAdd', 'bert/encoder/layer_8/attention/self/Reshape_1', 'bert/encoder/layer_8/attention/self/MatMul__418', 'bert/encoder/layer_8/attention/self/MatMul', 'bert/encoder/layer_8/attention/self/Mul', 'bert/encoder/layer_8/attention/self/add', 'bert/encoder/layer_8/attention/self/Softmax', 'bert/encoder/layer_8/attention/self/MatMul_1', 'bert/encoder/layer_8/attention/self/transpose_3', 'bert/encoder/layer_8/attention/self/Reshape_3', 
'bert/encoder/layer_8/attention/output/dense/MatMul', 'bert/encoder/layer_8/attention/output/dense/BiasAdd', 'bert/encoder/layer_8/attention/output/add', 'bert/encoder/layer_8/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_8/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_8/attention/output/LayerNorm/moments/SquaredDifference__421', 'bert/encoder/layer_8/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/Rsqrt__423', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_8/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_8/intermediate/dense/MatMul', 'bert/encoder/layer_8/intermediate/dense/BiasAdd', 'bert/encoder/layer_8/intermediate/dense/Pow', 'bert/encoder/layer_8/intermediate/dense/mul', 'bert/encoder/layer_8/intermediate/dense/add', 'bert/encoder/layer_8/intermediate/dense/mul_1', 'bert/encoder/layer_8/intermediate/dense/Tanh', 'bert/encoder/layer_8/intermediate/dense/add_1', 'bert/encoder/layer_8/intermediate/dense/mul_2', 'bert/encoder/layer_8/intermediate/dense/mul_3', 'bert/encoder/layer_8/output/dense/MatMul', 'bert/encoder/layer_8/output/dense/BiasAdd', 'bert/encoder/layer_8/output/add', 'bert/encoder/layer_8/output/LayerNorm/moments/mean', 'bert/encoder/layer_8/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_8/output/LayerNorm/moments/SquaredDifference__425', 'bert/encoder/layer_8/output/LayerNorm/moments/variance', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/Rsqrt__427', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_8/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_9/attention/self/value/MatMul', 'bert/encoder/layer_9/attention/self/value/BiasAdd', 'bert/encoder/layer_9/attention/self/Reshape_2', 'bert/encoder/layer_9/attention/self/transpose_2', 'bert/encoder/layer_9/attention/self/query/MatMul', 'bert/encoder/layer_9/attention/self/query/BiasAdd', 'bert/encoder/layer_9/attention/self/Reshape', 'bert/encoder/layer_9/attention/self/transpose', 'bert/encoder/layer_9/attention/self/key/MatMul', 'bert/encoder/layer_9/attention/self/key/BiasAdd', 'bert/encoder/layer_9/attention/self/Reshape_1', 'bert/encoder/layer_9/attention/self/MatMul__432', 'bert/encoder/layer_9/attention/self/MatMul', 'bert/encoder/layer_9/attention/self/Mul', 'bert/encoder/layer_9/attention/self/add', 'bert/encoder/layer_9/attention/self/Softmax', 'bert/encoder/layer_9/attention/self/MatMul_1', 'bert/encoder/layer_9/attention/self/transpose_3', 'bert/encoder/layer_9/attention/self/Reshape_3', 'bert/encoder/layer_9/attention/output/dense/MatMul', 'bert/encoder/layer_9/attention/output/dense/BiasAdd', 'bert/encoder/layer_9/attention/output/add', 'bert/encoder/layer_9/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_9/attention/output/LayerNorm/moments/SquaredDifference', 
'bert/encoder/layer_9/attention/output/LayerNorm/moments/SquaredDifference__435', 'bert/encoder/layer_9/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/Rsqrt__437', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_9/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_9/intermediate/dense/MatMul', 'bert/encoder/layer_9/intermediate/dense/BiasAdd', 'bert/encoder/layer_9/intermediate/dense/Pow', 'bert/encoder/layer_9/intermediate/dense/mul', 'bert/encoder/layer_9/intermediate/dense/add', 'bert/encoder/layer_9/intermediate/dense/mul_1', 'bert/encoder/layer_9/intermediate/dense/Tanh', 'bert/encoder/layer_9/intermediate/dense/add_1', 'bert/encoder/layer_9/intermediate/dense/mul_2', 'bert/encoder/layer_9/intermediate/dense/mul_3', 'bert/encoder/layer_9/output/dense/MatMul', 'bert/encoder/layer_9/output/dense/BiasAdd', 'bert/encoder/layer_9/output/add', 'bert/encoder/layer_9/output/LayerNorm/moments/mean', 'bert/encoder/layer_9/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_9/output/LayerNorm/moments/SquaredDifference__439', 'bert/encoder/layer_9/output/LayerNorm/moments/variance', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/Rsqrt__441', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_9/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_10/attention/self/value/MatMul', 'bert/encoder/layer_10/attention/self/value/BiasAdd', 'bert/encoder/layer_10/attention/self/Reshape_2', 'bert/encoder/layer_10/attention/self/transpose_2', 'bert/encoder/layer_10/attention/self/query/MatMul', 'bert/encoder/layer_10/attention/self/query/BiasAdd', 'bert/encoder/layer_10/attention/self/Reshape', 'bert/encoder/layer_10/attention/self/transpose', 'bert/encoder/layer_10/attention/self/key/MatMul', 'bert/encoder/layer_10/attention/self/key/BiasAdd', 'bert/encoder/layer_10/attention/self/Reshape_1', 'bert/encoder/layer_10/attention/self/MatMul__446', 'bert/encoder/layer_10/attention/self/MatMul', 'bert/encoder/layer_10/attention/self/Mul', 'bert/encoder/layer_10/attention/self/add', 'bert/encoder/layer_10/attention/self/Softmax', 'bert/encoder/layer_10/attention/self/MatMul_1', 'bert/encoder/layer_10/attention/self/transpose_3', 'bert/encoder/layer_10/attention/self/Reshape_3', 'bert/encoder/layer_10/attention/output/dense/MatMul', 'bert/encoder/layer_10/attention/output/dense/BiasAdd', 'bert/encoder/layer_10/attention/output/add', 'bert/encoder/layer_10/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_10/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_10/attention/output/LayerNorm/moments/SquaredDifference__449', 'bert/encoder/layer_10/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/Rsqrt', 
'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/Rsqrt__451', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_10/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_10/intermediate/dense/MatMul', 'bert/encoder/layer_10/intermediate/dense/BiasAdd', 'bert/encoder/layer_10/intermediate/dense/Pow', 'bert/encoder/layer_10/intermediate/dense/mul', 'bert/encoder/layer_10/intermediate/dense/add', 'bert/encoder/layer_10/intermediate/dense/mul_1', 'bert/encoder/layer_10/intermediate/dense/Tanh', 'bert/encoder/layer_10/intermediate/dense/add_1', 'bert/encoder/layer_10/intermediate/dense/mul_2', 'bert/encoder/layer_10/intermediate/dense/mul_3', 'bert/encoder/layer_10/output/dense/MatMul', 'bert/encoder/layer_10/output/dense/BiasAdd', 'bert/encoder/layer_10/output/add', 'bert/encoder/layer_10/output/LayerNorm/moments/mean', 'bert/encoder/layer_10/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_10/output/LayerNorm/moments/SquaredDifference__453', 'bert/encoder/layer_10/output/LayerNorm/moments/variance', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/Rsqrt__455', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_10/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_11/attention/self/value/MatMul', 'bert/encoder/layer_11/attention/self/value/BiasAdd', 'bert/encoder/layer_11/attention/self/Reshape_2', 'bert/encoder/layer_11/attention/self/transpose_2', 'bert/encoder/layer_11/attention/self/query/MatMul', 'bert/encoder/layer_11/attention/self/query/BiasAdd', 'bert/encoder/layer_11/attention/self/Reshape', 'bert/encoder/layer_11/attention/self/transpose', 'bert/encoder/layer_11/attention/self/key/MatMul', 'bert/encoder/layer_11/attention/self/key/BiasAdd', 'bert/encoder/layer_11/attention/self/Reshape_1', 'bert/encoder/layer_11/attention/self/MatMul__460', 'bert/encoder/layer_11/attention/self/MatMul', 'bert/encoder/layer_11/attention/self/Mul', 'bert/encoder/layer_11/attention/self/add', 'bert/encoder/layer_11/attention/self/Softmax', 'bert/encoder/layer_11/attention/self/MatMul_1', 'bert/encoder/layer_11/attention/self/transpose_3', 'bert/encoder/layer_11/attention/self/Reshape_3', 'bert/encoder/layer_11/attention/output/dense/MatMul', 'bert/encoder/layer_11/attention/output/dense/BiasAdd', 'bert/encoder/layer_11/attention/output/add', 'bert/encoder/layer_11/attention/output/LayerNorm/moments/mean', 'bert/encoder/layer_11/attention/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_11/attention/output/LayerNorm/moments/SquaredDifference__463', 'bert/encoder/layer_11/attention/output/LayerNorm/moments/variance', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/Rsqrt__465', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul_2', 
'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_11/attention/output/LayerNorm/batchnorm/add_1', 'bert/encoder/layer_11/intermediate/dense/MatMul', 'bert/encoder/layer_11/intermediate/dense/BiasAdd', 'bert/encoder/layer_11/intermediate/dense/Pow', 'bert/encoder/layer_11/intermediate/dense/mul', 'bert/encoder/layer_11/intermediate/dense/add', 'bert/encoder/layer_11/intermediate/dense/mul_1', 'bert/encoder/layer_11/intermediate/dense/Tanh', 'bert/encoder/layer_11/intermediate/dense/add_1', 'bert/encoder/layer_11/intermediate/dense/mul_2', 'bert/encoder/layer_11/intermediate/dense/mul_3', 'bert/encoder/layer_11/output/dense/MatMul', 'bert/encoder/layer_11/output/dense/BiasAdd', 'bert/encoder/layer_11/output/add', 'bert/encoder/layer_11/output/LayerNorm/moments/mean', 'bert/encoder/layer_11/output/LayerNorm/moments/SquaredDifference', 'bert/encoder/layer_11/output/LayerNorm/moments/SquaredDifference__467', 'bert/encoder/layer_11/output/LayerNorm/moments/variance', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/add', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/Rsqrt', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/Rsqrt__469', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/mul', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/mul_2', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/sub', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/mul_1', 'bert/encoder/layer_11/output/LayerNorm/batchnorm/add_1', 'MatMul', 'BiasAdd', 'Reshape_1', 'transpose', 'unstack', 'unstack__490', 'unstack__488']
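
The execution-order dump above is a flat Python list of graph node names. For reference, a similar list can be pulled straight from an exported graph with the onnx package; this is only a sketch, the filename is hypothetical, and the guess that the graph is a TensorFlow BERT-SQuAD model converted with tf2onnx is inferred from the bert/... names and __NNN suffixes, not stated anywhere in this log.

import onnx

# Minimal sketch: dump node names from an ONNX graph in topological order.
# "bert_squad.onnx" is a placeholder path, not a file referenced by this log.
model = onnx.load("bert_squad.onnx")
names = [node.name for node in model.graph.node]
print(len(names), "nodes")
print(names[:5])   # first few entries, e.g. embedding lookups
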
Generating '/tmp/nsys-report-ae10.qdstrm'

[1/8] [========================100%] nsys-report-0123.nsys-rep

[2/8] [========================100%] nsys-report-2c90.sqlite
[3/8] Executing 'nvtx_sum' stats report
[4/8] Executing 'osrt_sum' stats report

 Time (%)  Total Time (ns)  Num Calls    Avg (ns)       Med (ns)      Min (ns)     Max (ns)    StdDev (ns)            Name         
 --------  ---------------  ---------  -------------  -------------  -----------  -----------  ------------  ----------------------
     55.2    5,230,820,475         63   83,028,896.4  100,119,708.0        1,280  548,595,156  73,482,897.6  poll                  
     42.2    4,000,741,404          8  500,092,675.5  500,090,102.0  500,067,701  500,124,383      15,851.7  pthread_cond_timedwait
      1.7      158,295,782      5,642       28,056.7          740.0          290  146,028,133   1,944,088.5  read                  
      0.7       62,781,864      1,956       32,097.1        7,815.0          200   12,586,893     392,335.3  ioctl                 
      0.1        9,348,135      3,183        2,936.9        2,700.0        1,140       36,151       1,256.3  open64                
      0.1        5,061,988          1    5,061,988.0    5,061,988.0    5,061,988    5,061,988           0.0  nanosleep             
      0.0        3,496,378    131,629           26.6           20.0           20        4,590          38.7  pthread_cond_signal   
      0.0        3,011,948        138       21,825.7        5,005.0        2,410    1,589,005     135,795.1  mmap64                
      0.0        2,619,399         71       36,892.9          930.0          580      672,060     111,420.4  pread64               
      0.0          809,893         13       62,299.5       59,751.0       54,861       80,871       8,354.2  sleep                 
      0.0          515,469        583          884.2           50.0           20       60,211       5,485.3  fgets                 
      0.0          481,325         28       17,190.2        6,820.0        1,760      114,061      23,885.3  mmap                  
      0.0          377,505         10       37,750.5       35,975.5       15,230       78,841      20,343.3  sem_timedwait         
      0.0          348,995          8       43,624.4       31,290.0       25,720       72,221      20,452.4  pthread_create        
      0.0          221,624         29        7,642.2        2,700.0          490       52,390      12,795.7  write                 
      0.0          204,595         44        4,649.9        2,830.5          970       21,260       4,496.3  fopen                 
      0.0          174,132         10       17,413.2        4,120.0        1,940       79,041      29,324.5  munmap                
      0.0          130,762          1      130,762.0      130,762.0      130,762      130,762           0.0  pthread_cond_wait     
      0.0          101,411          1      101,411.0      101,411.0      101,411      101,411           0.0  waitpid               
      0.0           65,891         41        1,607.1        1,210.0          590        8,681       1,385.2  fclose                
      0.0           63,852         15        4,256.8        3,490.0        1,820       15,830       3,439.2  open                  
      0.0           55,884      1,622           34.5           30.0           20        4,880         142.1  pthread_cond_broadcast
      0.0           38,160          2       19,080.0       19,080.0        8,680       29,480      14,707.8  connect               
      0.0           33,752          6        5,625.3        5,155.5        2,960       10,060       2,994.9  pipe2                 
      0.0           28,160          4        7,040.0        7,230.0        3,130       10,570       4,082.4  socket                
      0.0           27,231        133          204.7          210.0           20        1,210         133.4  sigaction             
      0.0           26,250          6        4,375.0        4,350.0        2,040        8,610       2,359.6  fopen64               
      0.0           22,074         68          324.6          300.0          171        1,090         148.3  fcntl                 
      0.0           20,536        256           80.2          100.0           20          331          60.8  pthread_mutex_trylock 
      0.0           13,683        543           25.2           20.0           20          120           6.5  flockfile             
      0.0           13,500          3        4,500.0        4,280.0        1,530        7,690       3,085.9  fread                 
      0.0            7,440          2        3,720.0        3,720.0        1,510        5,930       3,125.4  bind                  
      0.0            3,279         30          109.3           30.0           20          880         189.0  fflush                
      0.0            2,620         10          262.0          255.0          220          350          37.9  dup                   
      0.0            2,579          2        1,289.5        1,289.5          949        1,630         481.5  fwrite                
      0.0            1,180          2          590.0          590.0          380          800         297.0  dup2                  
      0.0              890          1          890.0          890.0          890          890           0.0  getc                  
      0.0              650          1          650.0          650.0          650          650           0.0  listen                

[5/8] Executing 'cuda_api_sum' stats report

 Time (%)  Total Time (ns)  Num Calls   Avg (ns)     Med (ns)    Min (ns)    Max (ns)   StdDev (ns)                      Name                     
 --------  ---------------  ---------  -----------  -----------  ---------  ----------  -----------  ---------------------------------------------
     54.9      214,002,149      1,676    127,686.2     26,805.5      2,240   1,491,052    249,665.4  cudaMemcpyAsync                              
     21.6       84,145,482      1,676     50,206.1     10,940.0        580     273,074     70,961.6  cudaStreamSynchronize                        
     18.6       72,377,733        644    112,387.8      6,425.0      3,550  16,272,670    956,381.0  cudaLaunchKernel                             
      2.0        7,716,799          2  3,858,399.5  3,858,399.5  1,150,218   6,566,581  3,829,947.0  cudaFree                                     
      1.6        6,338,147          9    704,238.6      1,370.0        290   6,327,717  2,108,804.6  cudaStreamIsCapturing_v10000                 
      0.6        2,334,497         49     47,642.8     47,881.0     38,230      52,011      3,374.7  cuCtxSynchronize                             
      0.2          893,243          9     99,249.2    102,492.0      4,411     158,362     54,393.9  cudaMalloc                                   
      0.2          657,800         49     13,424.5     13,550.0      8,540      27,110      2,774.6  cuLaunchKernel                               
      0.1          318,881         62      5,143.2      3,580.0      2,800      16,960      2,706.9  cudaMemsetAsync                              
      0.1          303,263      1,532        198.0        180.0         50       4,640        173.3  cuGetProcAddress_v2                          
      0.1          218,475          2    109,237.5    109,237.5     73,382     145,093     50,707.3  cuModuleLoadData                             
      0.0          161,342          1    161,342.0    161,342.0    161,342     161,342          0.0  cudaGetDeviceProperties_v2_v12000            
      0.0           75,655         26      2,909.8      2,545.0        349       7,140      1,384.1  cudaOccupancyMaxActiveBlocksPerMultiprocessor
      0.0           18,840         18      1,046.7        265.0        140      10,740      2,478.0  cudaEventCreateWithFlags                     
      0.0            4,670          1      4,670.0      4,670.0      4,670       4,670          0.0  cuMemFree_v2                                 
      0.0            3,880          4        970.0        970.0        580       1,360        369.7  cuInit                                       
      0.0            1,240          1      1,240.0      1,240.0      1,240       1,240          0.0  cuCtxSetCurrent                              
      0.0            1,200          2        600.0        600.0        250         950        495.0  cudaGetDriverEntryPoint_v11030               
      0.0              560          4        140.0        115.0         80         250         76.2  cuModuleGetLoadingMode                       

[6/8] Executing 'cuda_gpu_kern_sum' stats report

 Time (%)  Total Time (ns)  Instances  Avg (ns)  Med (ns)  Min (ns)  Max (ns)  StdDev (ns)                                                  Name                                                
 --------  ---------------  ---------  --------  --------  --------  --------  -----------  ----------------------------------------------------------------------------------------------------
     48.6        2,147,879         49  43,834.3  43,744.0    43,680    45,728        299.6  cutlass_tensorop_f16_s16816gemm_f16_256x128_32x3_tt_align8                                          
      7.4          326,528         48   6,802.7   6,752.0     6,560     7,136        178.6  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      6.8          299,042         12  24,920.2  24,896.5    24,800    25,248        115.0  ampere_fp16_s16816gemm_fp16_128x64_ldg8_f2f_stages_64x4_nn                                          
      4.6          204,162         72   2,835.6   2,880.0     2,336     3,424        376.7  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      4.0          175,617         48   3,658.7   3,744.0     3,392     3,904        226.8  void at::native::reduce_kernel<(int)512, (int)1, at::native::ReduceOp<c10::Half, at::native::MeanOp…
      3.3          143,967         12  11,997.3  11,952.0    11,872    12,543        175.8  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      2.9          126,912         24   5,288.0   5,328.0     4,832     5,696        281.0  void cutlass::Kernel<cutlass_80_wmma_tensorop_f16_s161616gemm_f16_32x32_64x1_nn_align8>(T1::Params) 
      2.3          102,977         48   2,145.4   2,176.0     1,984     2,368        134.6  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      2.3          100,766         36   2,799.1   2,240.0     2,207     4,000        821.3  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      2.2           96,065         12   8,005.4   8,032.0     7,713     8,161        109.7  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      2.1           94,881         12   7,906.8   7,904.0     7,809     7,999         53.2  ampere_fp16_s16816gemm_fp16_64x64_ldg8_f2f_stages_64x5_nn                                           
      1.9           84,862         12   7,071.8   7,103.0     6,752     7,168        110.8  void at::native::elementwise_kernel<(int)128, (int)4, void at::native::gpu_kernel_impl<at::native::…
      1.5           65,501         36   1,819.5   1,472.0     1,440     2,560        493.3  void at::native::vectorized_elementwise_kernel<(int)4, at::native::CUDAFunctor_add<c10::Half>, at::…
      1.5           65,190         36   1,810.8   1,473.0     1,409     2,528        495.9  void at::native::vectorized_elementwise_kernel<(int)4, at::native::BinaryFunctor<c10::Half, c10::Ha…
      1.5           64,862         27   2,402.3   2,432.0     1,824     2,560        131.9  void at::native::elementwise_kernel<(int)128, (int)2, void at::native::gpu_kernel_impl<at::native::…
      1.5           64,159         12   5,346.6   5,280.0     5,216     6,016        218.5  void <unnamed>::softmax_warp_forward<float, float, float, (int)8, (bool)0, (bool)0>(T2 *, const T1 …
      1.4           61,535         49   1,255.8   1,088.0       800     1,888        340.6  void at::native::vectorized_elementwise_kernel<(int)4, at::native::FillFunctor<c10::Half>, at::deta…
      0.7           31,456         24   1,310.7   1,312.0     1,216     1,344         22.0  void at::native::unrolled_elementwise_kernel<at::native::CUDAFunctor_add<c10::Half>, at::detail::Ar…
      0.7           30,944         12   2,578.7   2,528.0     2,528     3,040        146.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::tanh_kernel_cuda(at::TensorItera…
      0.6           27,519         27   1,019.2     992.0       928     1,824        161.4  void at::native::vectorized_elementwise_kernel<(int)4, at::native::CUDAFunctor_add<float>, at::deta…
      0.6           27,229         24   1,134.5   1,151.0     1,056     1,153         23.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::reciprocal_kernel_cuda(at::Tenso…
      0.6           26,368         24   1,098.7   1,088.0     1,056     1,120         18.1  void at::native::vectorized_elementwise_kernel<(int)4, at::native::sqrt_kernel_cuda(at::TensorItera…
      0.5           22,369         24     932.0     928.0       896       960         14.3  void at::native::vectorized_elementwise_kernel<(int)4, at::native::AUnaryFunctor<c10::Half, c10::Ha…
      0.2            9,408          5   1,881.6   1,632.0     1,344     2,464        527.5  void at::native::elementwise_kernel<(int)128, (int)2, void at::native::gpu_kernel_impl<at::native::…
      0.2            7,808          2   3,904.0   3,904.0     3,744     4,064        226.3  void at::native::reduce_kernel<(int)512, (int)1, at::native::ReduceOp<float, at::native::MeanOps<fl…
      0.1            2,753          1   2,753.0   2,753.0     2,753     2,753          0.0  void at::native::unrolled_elementwise_kernel<at::native::CUDAFunctor_add<float>, at::detail::Array<…
      0.0            2,144          1   2,144.0   2,144.0     2,144     2,144          0.0  void cutlass::Kernel<cutlass_80_wmma_tensorop_f16_s161616gemm_f16_32x32_32x1_nn_align2>(T1::Params) 
      0.0            1,856          1   1,856.0   1,856.0     1,856     1,856          0.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::BinaryFunctor<float, float, floa…
      0.0            1,024          1   1,024.0   1,024.0     1,024     1,024          0.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::reciprocal_kernel_cuda(at::Tenso…
      0.0              991          1     991.0     991.0       991       991          0.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::sqrt_kernel_cuda(at::TensorItera…
      0.0              928          1     928.0     928.0       928       928          0.0  void at::native::vectorized_elementwise_kernel<(int)4, at::native::AUnaryFunctor<float, float, floa…
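
The per-kernel table above is nsys' own cuda_gpu_kern_sum report. If a different slice is needed, the exported SQLite database can be queried directly. The sketch below is an assumption-laden example: the table and column names (CUPTI_ACTIVITY_KIND_KERNEL, StringIds, start/end/shortName) match recent Nsight Systems exports but vary between versions, so verify with ".schema" first.

import sqlite3

# Minimal sketch: reproduce a per-kernel total-time summary from the exported
# SQLite database. Table/column names are assumed from typical nsys exports.
con = sqlite3.connect("/tmp/nsys-report-2c90.sqlite")
rows = con.execute("""
    SELECT s.value                 AS name,
           COUNT(*)                AS instances,
           SUM(k."end" - k.start)  AS total_ns,
           AVG(k."end" - k.start)  AS avg_ns
    FROM CUPTI_ACTIVITY_KIND_KERNEL AS k
    JOIN StringIds AS s ON s.id = k.shortName
    GROUP BY s.value
    ORDER BY total_ns DESC
""").fetchall()
for name, n, total, avg in rows[:10]:
    print(f"{total:>12,} ns  {n:>4} x  {avg:>10,.1f} ns  {name}")
con.close()
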

[7/8] Executing 'cuda_gpu_mem_time_sum' stats report

 Time (%)  Total Time (ns)  Count  Avg (ns)   Med (ns)  Min (ns)  Max (ns)  StdDev (ns)           Operation          
 --------  ---------------  -----  ---------  --------  --------  --------  -----------  ----------------------------
     59.5       96,180,030  1,092   88,077.0  61,152.0       287   754,659    134,057.1  [CUDA memcpy Host-to-Device]
     40.5       65,543,200    584  112,231.5  59,296.0       960   809,219    154,864.6  [CUDA memcpy Device-to-Host]
      0.0           29,726     62      479.5     320.0       288       928        232.2  [CUDA memset]               

[8/8] Executing 'cuda_gpu_mem_size_sum' stats report

 Total (MB)  Count  Avg (MB)  Med (MB)  Min (MB)  Max (MB)  StdDev (MB)           Operation          
 ----------  -----  --------  --------  --------  --------  -----------  ----------------------------
    626.375  1,092     0.574     0.393     0.000     4.719        0.881  [CUDA memcpy Host-to-Device]
    393.055    584     0.673     0.393     0.000     3.146        0.785  [CUDA memcpy Device-to-Host]
      0.001     62     0.000     0.000     0.000     0.000        0.000  [CUDA memset]               
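
Combining the memcpy time and size tables (total bytes over total busy time) gives a rough effective transfer rate. The sketch below just redoes that arithmetic with the numbers printed above, treating MB as 10**6 bytes as nsys does; it works out to roughly 6.5 GB/s host-to-device and 6.0 GB/s device-to-host.

# Back-of-the-envelope effective bandwidth from the two memcpy summaries above.
transfers = {
    "Host-to-Device": (626.375e6, 96_180_030e-9),   # bytes, seconds
    "Device-to-Host": (393.055e6, 65_543_200e-9),
}
for direction, (nbytes, seconds) in transfers.items():
    print(f"{direction}: {nbytes / seconds / 1e9:.2f} GB/s effective")
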

Generated:
    /tmp/nsys-report-0123.nsys-rep
    /tmp/nsys-report-2c90.sqlite
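
The two generated files are all that is needed to regenerate these tables later. A minimal sketch, assuming nsys is on PATH and reusing the report names that appear in this log (exact CLI flags depend on the Nsight Systems version):

import subprocess

# Re-run selected stats reports against the exported SQLite database.
for report in ("cuda_api_sum", "cuda_gpu_kern_sum",
               "cuda_gpu_mem_time_sum", "cuda_gpu_mem_size_sum"):
    subprocess.run(["nsys", "stats", "--report", report,
                    "/tmp/nsys-report-2c90.sqlite"], check=True)
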