This is a PyTorch implementation of Layer Normalization.
Layer normalization is a simpler normalization method that works on a wider range of settings. Layer normalization transforms the inputs to have zero mean and unit variance across the features. Note that batch normalization fixes the zero mean and unit variance for each element. Layer normalization does it for each batch across all elements.
Layer normalization is generally used for NLP tasks.
We have used layer normalization in most of the transformer implementations.