Self-Attention and Transformer. I am going to talk about Attention, Self-Attention, and Transformer one by one. (Apr 7, 2023)
Backpropagation for Batch Normalization. I’ve never looked into the details of computing backprop because I think its idea is pretty straightforward while the process of… (Mar 19, 2023)
Virtual Memory. We want a system that lets multiple processes run together while meeting the following goals for memory access: (Feb 26, 2023)
Activation Functions. There are many activation functions. Their main goal is to add non-linearity to the network. (Feb 9, 2023)