layer_norm
layer_norm_xformers(output_ptr, a_ptr, weight_ptr, bias_ptr, mean_ptr, rstd_ptr, output_row_stride, output_col_stride, a_row_stride, a_col_stride, N_SIZE, eps, HAS_BIAS, IS_RMSNORM, BLOCK_N_SIZE)
LayerNorm forward pass, processing one row at a time. It requires that a whole row of X be loaded into shared memory, so it won't work for tensors with very wide rows. Based on https://github.com/facebookresearch/xformers/blob/main/xformers/triton/k_layer_norm.py (argument names modified to match the other implementation). Only used in benchmarks.
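A minimal sketch of this kind of row-wise Triton kernel, assuming a contiguous 2D input of shape (M, N). The mean/rstd outputs, the HAS_BIAS flag, and the IS_RMSNORM branch of the real signature are omitted; the wrapper name `layer_norm_fwd` and the simplified parameter list are illustrative, not this repo's actual API.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def _layer_norm_fwd(output_ptr, a_ptr, weight_ptr, bias_ptr,
                    a_row_stride, N_SIZE, eps,
                    BLOCK_N_SIZE: tl.constexpr):
    # one program normalizes one row of the input
    row_idx = tl.program_id(0)
    cols = tl.arange(0, BLOCK_N_SIZE)
    mask = cols < N_SIZE
    # load the whole row at once -- this is why very wide rows are not supported
    a = tl.load(a_ptr + row_idx * a_row_stride + cols, mask=mask, other=0.0).to(tl.float32)
    mean = tl.sum(a, axis=0) / N_SIZE
    centered = tl.where(mask, a - mean, 0.0)
    var = tl.sum(centered * centered, axis=0) / N_SIZE
    rstd = 1.0 / tl.sqrt(var + eps)
    w = tl.load(weight_ptr + cols, mask=mask)
    b = tl.load(bias_ptr + cols, mask=mask)
    out = centered * rstd * w + b
    tl.store(output_ptr + row_idx * a_row_stride + cols, out, mask=mask)


def layer_norm_fwd(a: torch.Tensor, weight: torch.Tensor, bias: torch.Tensor, eps: float = 1e-5):
    M, N = a.shape
    out = torch.empty_like(a)
    # the block must cover the full row and be a power of two for tl.arange
    BLOCK_N_SIZE = triton.next_power_of_2(N)
    _layer_norm_fwd[(M,)](out, a, weight, bias, a.stride(0), N, eps, BLOCK_N_SIZE=BLOCK_N_SIZE)
    return out
```

Because `tl.arange` needs a compile-time, power-of-two BLOCK_N_SIZE that covers the whole row, the row length bounds how wide a tensor this style of kernel can handle.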
pytorch_naive_layernorm(a, weight, bias, eps)
Naive PyTorch implementation of LayerNorm. Only used in benchmarks.
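A plausible sketch of such a naive reference, assuming normalization over the last dimension; only the signature comes from the docs, the body is an assumption.

```python
import torch


def pytorch_naive_layernorm(a: torch.Tensor, weight: torch.Tensor, bias: torch.Tensor, eps: float) -> torch.Tensor:
    # normalize over the last dimension with biased variance, as torch.nn.LayerNorm does
    mean = a.mean(dim=-1, keepdim=True)
    var = a.var(dim=-1, unbiased=False, keepdim=True)
    return (a - mean) / torch.sqrt(var + eps) * weight + bias
```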
pytorch_naive_rmsnorm(a, weight, eps)
Naive PyTorch implementation of RMSNorm. It is essentially a LayerNorm without bias and without mean subtraction. The implementation follows the Hugging Face one: https://github.com/huggingface/transformers/blob/d92e22d1f28324f513f3080e5c47c071a3916721/src/transformers/models/t5/modeling_t5.py#L239. Paper: https://arxiv.org/pdf/1910.07467.pdf. Only used in benchmarks.
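A sketch mirroring the referenced Hugging Face T5LayerNorm: the "variance" is just the mean of squares (no mean subtraction, no bias) accumulated in float32. The exact body is an assumption based on that link.

```python
import torch


def pytorch_naive_rmsnorm(a: torch.Tensor, weight: torch.Tensor, eps: float) -> torch.Tensor:
    # mean of squares over the last dimension; no mean subtraction, no bias
    variance = a.to(torch.float32).pow(2).mean(dim=-1, keepdim=True)
    a = a * torch.rsqrt(variance + eps)
    # the HF code casts back to half precision when the weight is fp16/bf16
    if weight.dtype in (torch.float16, torch.bfloat16):
        a = a.to(weight.dtype)
    return weight * a
```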