cutlass_separate_super_resolution_profiling
Results for input size (1, 1, 224, 224)
Average total time: 23.0771 ms (±2.3151 ms)
Per-node timing:
  Unnamed_Conv_0: 2.6723 ms (±0.6407 ms)
  Unnamed_Relu_1: 3.3574 ms (±0.6628 ms)
  Unnamed_Conv_2: 4.7135 ms (±0.7869 ms)
  Unnamed_Relu_3: 3.4068 ms (±0.7140 ms)
  Unnamed_Conv_4: 4.0401 ms (±0.6874 ms)
  Unnamed_Relu_5: 1.8877 ms (±0.4473 ms)
  Unnamed_Conv_6: 2.3457 ms (±0.5206 ms)
  Unnamed_Reshape_7: 0.0460 ms (±0.0230 ms)
  Unnamed_Transpose_8: 0.0414 ms (±0.0257 ms)
  Unnamed_Reshape_9: 0.5663 ms (±0.2184 ms)
==================================================
Results for input size (1, 1, 448, 448)
Average total time: 87.7230 ms (±7.8746 ms)
Per-node timing:
  Unnamed_Conv_0: 10.5773 ms (±2.6923 ms)
  Unnamed_Relu_1: 15.8003 ms (±2.6695 ms)
  Unnamed_Conv_2: 18.8260 ms (±2.5737 ms)
  Unnamed_Relu_3: 15.6897 ms (±2.2091 ms)
  Unnamed_Conv_4: 12.1397 ms (±1.4591 ms)
  Unnamed_Relu_5: 6.6754 ms (±1.1668 ms)
  Unnamed_Conv_6: 5.8848 ms (±0.5969 ms)
  Unnamed_Reshape_7: 0.0392 ms (±0.0115 ms)
  Unnamed_Transpose_8: 0.0374 ms (±0.0084 ms)
  Unnamed_Reshape_9: 2.0531 ms (±0.5780 ms)
==================================================
Results for input size (1, 1, 896, 896)
Average total time: 337.9429 ms (±19.8082 ms)
Per-node timing:
  Unnamed_Conv_0: 39.2193 ms (±4.4599 ms)
  Unnamed_Relu_1: 59.0536 ms (±5.3124 ms)
  Unnamed_Conv_2: 67.8334 ms (±6.2601 ms)
  Unnamed_Relu_3: 61.3133 ms (±6.9191 ms)
  Unnamed_Conv_4: 52.0059 ms (±6.6856 ms)
  Unnamed_Relu_5: 29.9243 ms (±3.2900 ms)
  Unnamed_Conv_6: 20.7833 ms (±1.4305 ms)
  Unnamed_Reshape_7: 0.0445 ms (±0.0172 ms)
  Unnamed_Transpose_8: 0.0383 ms (±0.0114 ms)
  Unnamed_Reshape_9: 7.7271 ms (±0.9925 ms)
==================================================
Results for input size (4, 1, 224, 224)
Average total time: 86.3037 ms (±5.2722 ms)
Per-node timing:
  Unnamed_Conv_0: 10.7318 ms (±1.6005 ms)
  Unnamed_Relu_1: 15.3604 ms (±1.5316 ms)
  Unnamed_Conv_2: 18.2059 ms (±1.9071 ms)
  Unnamed_Relu_3: 15.6256 ms (±1.3809 ms)
  Unnamed_Conv_4: 12.0647 ms (±1.8428 ms)
  Unnamed_Relu_5: 6.3232 ms (±0.8901 ms)
  Unnamed_Conv_6: 5.8941 ms (±0.7539 ms)
  Unnamed_Reshape_7: 0.0446 ms (±0.0214 ms)
  Unnamed_Transpose_8: 0.0387 ms (±0.0131 ms)
  Unnamed_Reshape_9: 2.0147 ms (±0.3522 ms)
==================================================
Results for input size (8, 1, 224, 224)
Average total time: 171.2700 ms (±10.7753 ms)
Per-node timing:
  Unnamed_Conv_0: 20.3224 ms (±3.0205 ms)
  Unnamed_Relu_1: 30.5993 ms (±2.7122 ms)
  Unnamed_Conv_2: 34.6179 ms (±4.0315 ms)
  Unnamed_Relu_3: 29.8182 ms (±4.0667 ms)
  Unnamed_Conv_4: 25.9873 ms (±2.4558 ms)
  Unnamed_Relu_5: 15.0954 ms (±1.2819 ms)
  Unnamed_Conv_6: 10.8527 ms (±0.8927 ms)
  Unnamed_Reshape_7: 0.0438 ms (±0.0152 ms)
  Unnamed_Transpose_8: 0.0408 ms (±0.0119 ms)
  Unnamed_Reshape_9: 3.8922 ms (±0.6902 ms)
==================================================
Results for input size (16, 1, 224, 224)
Average total time: 343.4146 ms (±23.1273 ms)
Per-node timing:
  Unnamed_Conv_0: 40.3206 ms (±6.7142 ms)
  Unnamed_Relu_1: 61.3990 ms (±8.4593 ms)
  Unnamed_Conv_2: 68.4163 ms (±7.2097 ms)
  Unnamed_Relu_3: 62.5510 ms (±7.6317 ms)
  Unnamed_Conv_4: 50.3852 ms (±4.0626 ms)
  Unnamed_Relu_5: 30.9405 ms (±5.3879 ms)
  Unnamed_Conv_6: 21.5293 ms (±3.5616 ms)
  Unnamed_Reshape_7: 0.0448 ms (±0.0162 ms)
  Unnamed_Transpose_8: 0.0403 ms (±0.0149 ms)
  Unnamed_Reshape_9: 7.7876 ms (±1.1069 ms)
==================================================
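The log above follows a regular pattern (an input size, an average total, then one `name: mean ms (±std ms)` entry per node), so it can be turned into structured data for comparison across sizes. The following is a minimal sketch, not the script that produced these numbers: the names `parse_profile` and `share_by_op` are hypothetical, the regular expressions assume exactly the text layout shown here, and `SAMPLE` is the first result block copied from the log.

```python
import re

# First result block, copied from the log above (whitespace layout does not matter).
SAMPLE = (
    "Results for input size (1, 1, 224, 224) "
    "Average total time: 23.0771 ms (±2.3151 ms) Per-node timing: "
    "Unnamed_Conv_0: 2.6723 ms (±0.6407 ms) Unnamed_Relu_1: 3.3574 ms (±0.6628 ms) "
    "Unnamed_Conv_2: 4.7135 ms (±0.7869 ms) Unnamed_Relu_3: 3.4068 ms (±0.7140 ms) "
    "Unnamed_Conv_4: 4.0401 ms (±0.6874 ms) Unnamed_Relu_5: 1.8877 ms (±0.4473 ms) "
    "Unnamed_Conv_6: 2.3457 ms (±0.5206 ms) Unnamed_Reshape_7: 0.0460 ms (±0.0230 ms) "
    "Unnamed_Transpose_8: 0.0414 ms (±0.0257 ms) Unnamed_Reshape_9: 0.5663 ms (±0.2184 ms) "
    "=================================================="
)

# One result block: shape, total mean/std, then everything up to the next block.
BLOCK_RE = re.compile(
    r"Results for input size \(([^)]*)\)\s+"
    r"Average total time:\s+([\d.]+) ms\s+\(±([\d.]+) ms\)\s+"
    r"Per-node timing:(.*?)(?=Results for input size|\Z)",
    re.S,
)
# One per-node entry inside a block.
NODE_RE = re.compile(r"(\w+):\s+([\d.]+) ms\s+\(±([\d.]+) ms\)")


def parse_profile(text):
    """Map each input shape to its total time and per-node mean times (ms)."""
    results = {}
    for shape_str, total, total_std, body in BLOCK_RE.findall(text):
        shape = tuple(int(x) for x in shape_str.split(","))
        nodes = {name: float(mean) for name, mean, _std in NODE_RE.findall(body)}
        results[shape] = {
            "total_ms": float(total),
            "total_std_ms": float(total_std),
            "nodes_ms": nodes,
        }
    return results


def share_by_op(nodes_ms):
    """Fraction of summed node time per op type (middle token of 'Unnamed_<Op>_<idx>')."""
    totals = {}
    for name, ms in nodes_ms.items():
        op = name.split("_")[1]
        totals[op] = totals.get(op, 0.0) + ms
    grand = sum(totals.values())
    return {op: ms / grand for op, ms in totals.items()}


if __name__ == "__main__":
    block = parse_profile(SAMPLE)[(1, 1, 224, 224)]
    for op, frac in sorted(share_by_op(block["nodes_ms"]).items()):
        print(f"{op}: {frac:.1%}")
```

For the (1, 1, 224, 224) block, the per-node means sum to within rounding of the reported average total, and grouping by op type puts roughly 60% of the time in the Conv nodes and roughly 37% in the Relu nodes. Passing the full log text to `parse_profile` should yield one entry per input size.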