cutlass_warmup_separate_profiling_results

user_3093867
12 days ago
Running for input size: (1, 3, 416, 416)
Average total time over 100 runs for each input size: 337.9050 ms
Node 'Mul': 0.1551 ms
Node 'convolution': 0.5733 ms
Node 'activation': 0.5342 ms
Node 'pooling': 0.3352 ms
Node 'convolution1': 0.3765 ms
Node 'activation1': 0.2801 ms
Node 'pooling1': 0.1993 ms
Node 'convolution2': 0.2840 ms
Node 'activation2': 0.1576 ms
Node 'pooling2': 0.1192 ms
Node 'convolution3': 0.2320 ms
Node 'activation3': 0.1064 ms
Node 'pooling3': 0.0812 ms
Node 'convolution4': 0.3795 ms
Node 'activation4': 0.0735 ms
Node 'pooling4': 0.0595 ms
Node 'convolution5': 0.3166 ms
Node 'activation5': 0.0532 ms
Node 'pooling5': 0.0636 ms
Node 'convolution6': 0.6744 ms
Node 'activation6': 0.0763 ms
Node 'convolution7': 1.1853 ms
Node 'activation7': 0.0815 ms
Node 'convolution8': 0.2243 ms

====================================================
Running for input size: (1, 3, 832, 832)
Average total time over 100 runs for each input size: 1075.5562 ms
Node 'Mul': 0.5912 ms
Node 'convolution': 2.5961 ms
Node 'activation': 2.7474 ms
Node 'pooling': 1.5055 ms
Node 'convolution1': 1.4064 ms
Node 'activation1': 1.2594 ms
Node 'pooling1': 0.8148 ms
Node 'convolution2': 0.8713 ms
Node 'activation2': 0.6582 ms
Node 'pooling2': 0.4622 ms
Node 'convolution3': 0.6529 ms
Node 'activation3': 0.3833 ms
Node 'pooling3': 0.2822 ms
Node 'convolution4': 0.7199 ms
Node 'activation4': 0.2361 ms
Node 'pooling4': 0.1837 ms
Node 'convolution5': 0.6882 ms
Node 'activation5': 0.1620 ms
Node 'pooling5': 0.1901 ms
Node 'convolution6': 1.4530 ms
Node 'activation6': 0.2600 ms
Node 'convolution7': 2.5117 ms
Node 'activation7': 0.2507 ms
Node 'convolution8': 0.5070 ms

====================================================
Running for input size: (1, 3, 1664, 1664)
Average total time over 100 runs for each input size: 3628.5349 ms
Node 'Mul': 2.1358 ms
Node 'convolution': 10.6580 ms
Node 'activation': 11.9717 ms
Node 'pooling': 6.5254 ms
Node 'convolution1': 5.7377 ms
Node 'activation1': 5.8658 ms
Node 'pooling1': 3.1068 ms
Node 'convolution2': 2.8133 ms
Node 'activation2': 2.7624 ms
Node 'pooling2': 1.6426 ms
Node 'convolution3': 1.6196 ms
Node 'activation3': 1.3315 ms
Node 'pooling3': 0.9019 ms
Node 'convolution4': 1.3671 ms
Node 'activation4': 0.7316 ms
Node 'pooling4': 0.5176 ms
Node 'convolution5': 1.2895 ms
Node 'activation5': 0.4364 ms
Node 'pooling5': 0.4828 ms
Node 'convolution6': 2.6588 ms
Node 'activation6': 0.7528 ms
Node 'convolution7': 4.5647 ms
Node 'activation7': 0.7528 ms
Node 'convolution8': 0.9994 ms

====================================================
Running for input size: (5, 3, 416, 416)
Average total time over 100 runs for each input size: 4476.7466 ms
Node 'Mul': 2.6608 ms
Node 'convolution': 12.5797 ms
Node 'activation': 14.3026 ms
Node 'pooling': 8.0154 ms
Node 'convolution1': 6.9119 ms
Node 'activation1': 7.0723 ms
Node 'pooling1': 3.9030 ms
Node 'convolution2': 3.5400 ms
Node 'activation2': 3.4052 ms
Node 'pooling2': 2.0742 ms
Node 'convolution3': 2.0819 ms
Node 'activation3': 1.6766 ms
Node 'pooling3': 1.1479 ms
Node 'convolution4': 1.7612 ms
Node 'activation4': 0.9313 ms
Node 'pooling4': 0.6613 ms
Node 'convolution5': 1.6853 ms
Node 'activation5': 0.5571 ms
Node 'pooling5': 0.6227 ms
Node 'convolution6': 3.4477 ms
Node 'activation6': 0.9554 ms
Node 'convolution7': 5.9221 ms
Node 'activation7': 0.9534 ms
Node 'convolution8': 1.3097 ms

====================================================
Running for input size: (10, 3, 416, 416)
Average total time over 100 runs for each input size: 6064.1758 ms
Node 'Mul': 3.5691 ms
Node 'convolution': 17.3962 ms
Node 'activation': 19.8791 ms
Node 'pooling': 10.8230 ms
Node 'convolution1': 9.4174 ms
Node 'activation1': 9.7821 ms
Node 'pooling1': 5.3681 ms
Node 'convolution2': 4.6853 ms
Node 'activation2': 4.6296 ms
Node 'pooling2': 2.8781 ms
Node 'convolution3': 2.7989 ms
Node 'activation3': 2.3176 ms
Node 'pooling3': 1.5734 ms
Node 'convolution4': 2.2970 ms
Node 'activation4': 1.2913 ms
Node 'pooling4': 0.9080 ms
Node 'convolution5': 2.1625 ms
Node 'activation5': 0.7547 ms
Node 'pooling5': 0.8421 ms
Node 'convolution6': 4.4564 ms
Node 'activation6': 1.2981 ms
Node 'convolution7': 7.5987 ms
Node 'activation7': 1.2834 ms
Node 'convolution8': 1.7039 ms

====================================================
Running for input size: (15, 3, 416, 416)
Average total time over 100 runs for each input size: 8429.7783 ms
Node 'Mul': 4.9464 ms
Node 'convolution': 24.6652 ms
Node 'activation': 28.3661 ms
Node 'pooling': 15.5010 ms
Node 'convolution1': 13.3157 ms
Node 'activation1': 13.9849 ms
Node 'pooling1': 7.5245 ms
Node 'convolution2': 6.8280 ms
Node 'activation2': 6.5654 ms
Node 'pooling2': 3.9953 ms
Node 'convolution3': 3.7297 ms
Node 'activation3': 3.2070 ms
Node 'pooling3': 2.1619 ms
Node 'convolution4': 2.9553 ms
Node 'activation4': 1.7590 ms
Node 'pooling4': 1.2257 ms
Node 'convolution5': 2.7248 ms
Node 'activation5': 1.0218 ms
Node 'pooling5': 1.1219 ms
Node 'convolution6': 5.6751 ms
Node 'activation6': 1.7566 ms
Node 'convolution7': 9.5799 ms
Node 'activation7': 1.7722 ms
Node 'convolution8': 2.1868 ms

====================================================
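To compare per-node timings across input sizes, the log format above can be parsed and aggregated by operator kind. The following is a minimal sketch, not the tool that produced this log: the function name `parse_profile`, the regular expressions, and the grouping rule (stripping the trailing digits from node names such as `convolution1`) are all assumptions for illustration. The sample text is copied from the first block above.

```python
import re
from collections import defaultdict

# Patterns inferred from the log lines above (assumed, not taken from the profiler's source).
SIZE_RE = re.compile(r"Running for input size: \((.+?)\)")
TOTAL_RE = re.compile(r"Average total time over \d+ runs.*?: ([\d.]+) ms")
NODE_RE = re.compile(r"Node '([A-Za-z_]+)(\d*)': ([\d.]+) ms")


def parse_profile(text):
    """Return one dict per 'Running for input size' block.

    Each dict holds the input shape, the reported total in ms, and the
    per-node times summed by operator kind (e.g. all 'convolutionN' nodes).
    """
    blocks = []
    for raw in text.splitlines():
        line = raw.strip()
        m = SIZE_RE.match(line)
        if m:
            shape = tuple(int(x) for x in m.group(1).split(","))
            blocks.append({"shape": shape, "total_ms": None,
                           "by_kind": defaultdict(float)})
            continue
        if not blocks:
            continue  # skip anything before the first block header
        m = TOTAL_RE.match(line)
        if m:
            blocks[-1]["total_ms"] = float(m.group(1))
            continue
        m = NODE_RE.match(line)
        if m:
            kind, _index, ms = m.groups()
            blocks[-1]["by_kind"][kind] += float(ms)
    return blocks


# First few lines of the (1, 3, 416, 416) block, copied from the log above.
SAMPLE = """\
Running for input size: (1, 3, 416, 416)
Average total time over 100 runs for each input size: 337.9050 ms
Node 'Mul': 0.1551 ms
Node 'convolution': 0.5733 ms
Node 'activation': 0.5342 ms
Node 'pooling': 0.3352 ms
Node 'convolution1': 0.3765 ms
"""

blocks = parse_profile(SAMPLE)
for block in blocks:
    node_sum = sum(block["by_kind"].values())
    print(block["shape"], "reported total:", block["total_ms"], "ms")
    for kind, ms in sorted(block["by_kind"].items()):
        print(f"  {kind}: {ms:.4f} ms")
    print(f"  sum of listed nodes: {node_sum:.4f} ms")
```

Running the same function over the full log gives one entry per input size, so the summed per-node times can be set against the reported average total for that size. The log itself does not state what the total includes beyond the listed nodes.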