The notation used within deep learning has changed a lot since the first published works. It is undergoing some standardization, but mostly at an informal level.
Notation
General
training
Superscript $(i)$, as in $x^{(i)}$, denotes the iᵗʰ training example in a training set
layer
Superscript $[l]$, as in $x^{[l]}$, denotes the lᵗʰ layer in a set of layers
sequence
Superscript $\langle t \rangle$, as in $x^{\langle t \rangle}$, denotes the tᵗʰ item in a sequence of items
1D node
Subscript $i$, as in $x_i$, denotes the iᵗʰ node in a one-dimensional layer
2D node
Subscript $ij$ or $i,j$, as in $x_{ij}$ or $x_{i,j}$, denotes the node in the iᵗʰ row and jᵗʰ column of a two-dimensional layer[note 1]
1D weight
Subscript $ij$ or $i,j$, as in $w_{ij}$ or $w_{i,j}$, denotes the weight between the iᵗʰ node in the previous layer and the jᵗʰ node in the following layer[note 2]
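
The index conventions in this list can be made concrete with a small sketch. It assumes 1-based indices, nested Python lists, and that the weight subscript runs from the previous layer (first index) to the following layer (second index). The names `X`, `W`, `example`, `item`, `node`, and `weight` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: 2 training examples, each a sequence of 3 items,
# each item a one-dimensional vector of 2 nodes.
X = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],  # first training example
    [[0.7, 0.8], [0.9, 1.0], [1.1, 1.2]],  # second training example
]

# Hypothetical weights keyed by layer number l. Row i belongs to node i in
# the previous layer, column j to node j in the following layer (assumed order).
W = {1: [[0.5, -0.5, 1.0],
         [0.25, 0.75, -1.0]]}


def example(X, i):
    """Return the i-th training example (superscript (i)), 1-based."""
    return X[i - 1]


def item(x, t):
    """Return the t-th item of a sequence (superscript <t>), 1-based."""
    return x[t - 1]


def node(v, i):
    """Return the i-th node of a one-dimensional layer (subscript i), 1-based."""
    return v[i - 1]


def weight(W, l, i, j):
    """Return the weight of layer l from previous node i to following node j."""
    return W[l][i - 1][j - 1]


# First node of the third item in the second training example.
value = node(item(example(X, 2), 3), 1)

# Weight in layer 1 between node 2 (previous layer) and node 3 (following layer).
w = weight(W, 1, 2, 3)
```

Array libraries typically index from 0, so the explicit `- 1` offsets are only there to keep the code aligned with the 1-based wording of the list.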
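cross entropy
$H(y, \hat{y}) = -\sum_{k} y_k \log \hat{y}_k$
elementwise sequence loss
$L(\hat{y}, y) = \sum_{t} L(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle})$, and by using cross entropy that is $L(\hat{y}, y) = -\sum_{t} \sum_{k} y_k^{\langle t \rangle} \log \hat{y}_k^{\langle t \rangle}$, where the outer sum would be over the items $t$ of the sequence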
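
As a rough illustration, an elementwise sequence loss built from cross entropy can be sketched in plain Python. The sketch assumes that each target and each prediction is a probability vector over classes, and that the sequence loss is the plain, unaveraged sum of the per-item cross entropies. The function names and the `eps` clamp are assumptions made for this example, not part of any particular library.

```python
import math


def cross_entropy(y, y_hat, eps=1e-12):
    """Cross entropy between a target distribution y and a prediction y_hat.

    Computes -sum_k y[k] * log(y_hat[k]); predictions are clamped to eps
    so that log(0) is never evaluated.
    """
    return -sum(p * math.log(max(q, eps)) for p, q in zip(y, y_hat))


def sequence_loss(ys, y_hats):
    """Sum the per-item cross entropy over all items of a sequence."""
    return sum(cross_entropy(y, y_hat) for y, y_hat in zip(ys, y_hats))


# Hypothetical two-item sequence with two classes and one-hot targets.
ys = [[1.0, 0.0], [0.0, 1.0]]
y_hats = [[0.5, 0.5], [0.25, 0.75]]

loss = sequence_loss(ys, y_hats)  # equals -log(0.5) - log(0.75)
```

Some formulations average over the sequence length instead of summing. That variant differs from this sketch only by a constant factor for a fixed-length sequence.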