make quantized_max_pool2d_nhwc handle case of C>64 (#19238)
wl1026sun wants to merge 1 commit into pytorch:main
@wl1026sun has exported this pull request. If you are a Meta employee, you can view the originating Diff in D103096179.
Summary:
The general path of the TIE quantized_max_pool2d_nhwc kernel now processes channels in chunks of 16 groups (64 bytes) at a time, using a fixed-size stack array and an outer loop over the chunks. This supports arbitrary C (any multiple of 4).
This change also adds test cases for C=128, C=256, k=3x3, and padding to cover all TIE kernel dispatch paths.
Reviewed By: khazaei
Differential Revision: D103096179
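The chunking idea from the summary can be illustrated with a portable scalar sketch. This is not the TIE kernel from this PR: the function name `max_pool2d_nhwc_chunked`, the constant `kChunkBytes`, the batch-size-1 layout, and the choice to skip padded positions are all assumptions made for illustration. It only shows how a fixed 64-byte stack buffer plus an outer loop over channel chunks removes the C<=64 limit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical reference sketch, not the actual TIE implementation.
// 64 bytes = 16 groups of 4 int8 channels, matching the chunk size in the summary.
constexpr int kChunkBytes = 64;

// Max-pools a single NHWC int8 image (batch size 1 assumed).
// Padded positions are skipped, which behaves like padding with INT8_MIN.
std::vector<int8_t> max_pool2d_nhwc_chunked(
    const std::vector<int8_t>& in, int H, int W, int C,
    int kernel, int stride, int pad) {
  assert(C % 4 == 0);  // the PR states C must be a multiple of 4
  assert(in.size() == static_cast<std::size_t>(H) * W * C);

  const int out_h = (H + 2 * pad - kernel) / stride + 1;
  const int out_w = (W + 2 * pad - kernel) / stride + 1;
  std::vector<int8_t> out(static_cast<std::size_t>(out_h) * out_w * C);

  for (int oh = 0; oh < out_h; ++oh) {
    for (int ow = 0; ow < out_w; ++ow) {
      // Outer loop over channel chunks: the stack buffer never grows with C.
      for (int c0 = 0; c0 < C; c0 += kChunkBytes) {
        const int n = std::min(kChunkBytes, C - c0);
        int8_t acc[kChunkBytes];
        std::fill(acc, acc + n, std::numeric_limits<int8_t>::min());

        for (int kh = 0; kh < kernel; ++kh) {
          for (int kw = 0; kw < kernel; ++kw) {
            const int ih = oh * stride - pad + kh;
            const int iw = ow * stride - pad + kw;
            if (ih < 0 || ih >= H || iw < 0 || iw >= W) continue;  // padding
            const int8_t* src =
                in.data() + (static_cast<std::size_t>(ih) * W + iw) * C + c0;
            for (int i = 0; i < n; ++i) acc[i] = std::max(acc[i], src[i]);
          }
        }
        std::copy(acc, acc + n,
                  out.data() + (static_cast<std::size_t>(oh) * out_w + ow) * C + c0);
      }
    }
  }
  return out;
}
```

Because the accumulator is sized by the chunk rather than by C, the same code path serves C=128 or C=256 by running the chunk loop two or four times per output pixel.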