Deep Learning - MCQ Practice Questions
Neural networks, CNN, RNN, activation functions & backpropagation.
10 questions
In a neural network, the vanishing gradient problem is MOST commonly associated with which activation function?
Which of the following best describes the role of the 'stride' parameter in a Convolutional Neural Network (CNN)?
In the context of Batch Normalization, what is normalized during the forward pass of training?
What is the primary purpose of the 'dropout' regularization technique in deep learning?
Which optimizer uses both the first moment (mean) and the second moment (uncentered variance) of gradients to adapt the learning rate for each parameter?
In an LSTM (Long Short-Term Memory) network, which gate is responsible for deciding what information to discard from the cell state?
The output size of a convolutional layer is given by $O = \lfloor (N + 2P - F)/S \rfloor + 1$, where $N$ is the input size, $P$ is padding, $F$ is filter size, and $S$ is stride. For an input of size , filter , padding , and stride , what is the output size?
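For practice with this kind of arithmetic, here is a minimal Python sketch of the usual output-size formula, floor((n + 2p - f) / s) + 1, along one spatial dimension. The function name `conv_output_size` and the sample values are illustrative assumptions and are not the values from the question above.

```python
def conv_output_size(n: int, f: int, p: int = 0, s: int = 1) -> int:
    """Output size along one spatial dimension of a convolution.

    n: input size, f: filter size, p: padding per side, s: stride.
    Uses floor((n + 2p - f) / s) + 1.
    """
    if s < 1:
        raise ValueError("stride must be at least 1")
    if n + 2 * p < f:
        raise ValueError("filter is larger than the padded input")
    # Integer division implements the floor for non-negative operands.
    return (n + 2 * p - f) // s + 1


# Hypothetical values, chosen only to illustrate the arithmetic.
print(conv_output_size(n=32, f=5, p=2, s=1))  # 32 ("same" padding keeps the size)
print(conv_output_size(n=7, f=3, p=0, s=2))   # 3
```

With a stride of 1 and padding of (f - 1) / 2 for an odd filter size, the output size equals the input size, which is why the first call returns 32.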
Which of the following is the key architectural difference that distinguishes a Transformer model from a traditional RNN?
In transfer learning for deep neural networks, which approach is most commonly used when the new dataset is small but similar to the original dataset?
The cross-entropy loss for a multi-class classification problem with $C$ classes is defined as $L = -\sum_{i=1}^{C} y_i \log(\hat{y}_i)$. If the true label is class 2 (one-hot: ) and the predicted probabilities are , what is the loss?
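To check answers to questions of this type, the following is a small Python sketch of multi-class cross-entropy for a single example, assuming the natural logarithm and a one-hot true label. The function name `cross_entropy`, the `eps` clamp, and the four-class sample vectors are illustrative assumptions, not the values from the question above.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy -sum_i y_i * ln(p_i) for one example.

    y_true: one-hot (or soft) label vector.
    y_pred: predicted class probabilities, same length as y_true.
    eps: lower clamp on probabilities to avoid log(0) (an assumption of this sketch).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Hypothetical four-class example: the true class is the first one.
loss = cross_entropy([1, 0, 0, 0], [0.5, 0.25, 0.15, 0.1])
print(round(loss, 4))  # 0.6931, i.e. -ln(0.5)
```

With a one-hot label only the term for the true class survives, so the loss reduces to the negative log of the probability assigned to that class.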