Adds subgroup functionality to WebGPU. Subgroup operations are SIMT operations that provide efficient communication and data sharing among groups of invocations. They can accelerate applications by reducing the memory overhead of inter-invocation communication.
Subgroup operations can provide significant performance advantages for many algorithms, from sorting to ML, because the invocations in a subgroup (generally between 4 and 64 invocations) can exchange data efficiently. Work dispatches are divided hierarchically into subgroups (e.g. a workgroup is divided into multiple subgroups). The underlying APIs used to implement WebGPU share a common subset of subgroup functionality, and that subset can be exposed to users.
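To make the hierarchy concrete, here is a minimal CPU-side sketch in TypeScript of a workgroup being divided into subgroups and of what an add reduction across each subgroup would hand back to every invocation. It is a model for illustration only, not WebGPU or WGSL code: the function names `splitIntoSubgroups` and `modelSubgroupAdd` are hypothetical, and the fixed subgroup size and the consecutive mapping of invocations to subgroups are simplifying assumptions (on real hardware both are chosen by the implementation).

```typescript
// CPU-side model only: real subgroup operations run on the GPU.
// Assumptions for illustration: a fixed subgroup size and consecutive
// invocations forming a subgroup. Real devices decide both themselves.

/** Split a workgroup's per-invocation values into consecutive subgroups. */
function splitIntoSubgroups(values: number[], subgroupSize: number): number[][] {
  if (!Number.isInteger(subgroupSize) || subgroupSize <= 0) {
    throw new RangeError("subgroupSize must be a positive integer");
  }
  const subgroups: number[][] = [];
  for (let start = 0; start < values.length; start += subgroupSize) {
    subgroups.push(values.slice(start, start + subgroupSize));
  }
  return subgroups;
}

/** Model of an add reduction: each invocation receives its subgroup's sum. */
function modelSubgroupAdd(values: number[], subgroupSize: number): number[] {
  const result: number[] = [];
  for (const subgroup of splitIntoSubgroups(values, subgroupSize)) {
    let sum = 0;
    for (const v of subgroup) {
      sum += v;
    }
    // Every invocation in the subgroup sees the same reduced value.
    for (let i = 0; i < subgroup.length; i++) {
      result.push(sum);
    }
  }
  return result;
}

// A workgroup of 8 invocations, modeled as two subgroups of 4.
const workgroupValues = [1, 2, 3, 4, 5, 6, 7, 8];
console.log(modelSubgroupAdd(workgroupValues, 4));
// prints [10, 10, 10, 10, 26, 26, 26, 26]
```

In this model each subgroup produces one combined value that all of its invocations can use directly, which is the kind of exchange that would otherwise go through shared memory. See the explainer for the operations the feature actually exposes.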
Explainers: https://github.com/gpuweb/gpuweb/blob/main/proposals/subgroups.md