The Dropout node helps reduce overfitting in neural networks by preventing complex co-adaptations on the training data.

This layer “drops out”, or ignores, a random set of neurons on each training step. As a result, any weights related to these neurons are not updated during that step. The effect is that the network becomes less sensitive to the specific weights of individual neurons, so it generalizes better and is less likely to overfit the training data.
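
The behaviour described above can be sketched in a few lines of plain Python. This is only an illustration, not the node's actual implementation: the function name `dropout`, its parameters, and the rescaling of kept activations by 1 / (1 - rate) (the common "inverted dropout" convention) are assumptions made for the example.

```python
import random

def dropout(activations, rate=0.25, training=True, rng=None):
    """Illustrative inverted dropout over a list of activations (hypothetical helper)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # Outside training every neuron is kept and nothing is rescaled.
        return list(activations), [1] * len(activations)
    rng = rng or random.Random()
    keep_prob = 1.0 - rate
    # mask[i] == 0 means neuron i is dropped for this training step.
    mask = [1 if rng.random() < keep_prob else 0 for _ in activations]
    # Assumption: kept activations are scaled by 1 / keep_prob so the
    # expected output of the layer stays the same as without dropout.
    out = [a * m / keep_prob for a, m in zip(activations, mask)]
    return out, mask

# Example: drop roughly a quarter of eight activations.
out, mask = dropout([0.5] * 8, rate=0.25, rng=random.Random(0))
```

A dropped neuron outputs zero, so no gradient flows through it and its weights are left unchanged for that step. A new mask is drawn on every call.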



Defines the probability of randomly dropping out each neuron in the layer.

The default value is 0.25.

A probability that is too low has minimal effect, and one that is too high results in under-learning by the network.
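
To get a feel for the trade-off, it can help to compute how many neurons stay active on average for a given probability. The helper `expected_active` and the layer size of 128 are hypothetical, chosen only for illustration.

```python
def expected_active(units, rate):
    """Expected number of neurons kept on a single training step."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    return units * (1.0 - rate)

# Compare a very low rate, the default, and a very high rate
# for a hypothetical layer of 128 neurons.
for rate in (0.05, 0.25, 0.9):
    print(f"rate={rate}: about {expected_active(128, rate):.1f} of 128 neurons active")
```

At the default of 0.25 about three quarters of the layer is active on each step. A very low rate leaves the layer almost untouched, while a very high rate leaves only a small fraction of neurons to learn from.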


  • You are likely to get better performance when dropout is used on a larger network, giving the model more of an opportunity to learn independent representations.
  • Use a large learning rate with decay and a large momentum. Increase the learning rate by a factor of 10 to 100 relative to what you would use without dropout, and use a high momentum value of 0.9 or 0.99.
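
The second tip can be sketched as a plain momentum update with a decaying learning rate. The text does not say which decay schedule to use, so the time-based form `initial_lr / (1 + decay * step)` is an assumption. The function names, the toy objective, and the concrete numbers are likewise illustrative only.

```python
def decayed_lr(initial_lr, decay, step):
    """Time-based decay (assumed schedule): the rate shrinks as training proceeds."""
    return initial_lr / (1.0 + decay * step)

def sgd_momentum_step(weight, velocity, grad, lr, momentum=0.9):
    """One SGD update with classical momentum."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity

def train_toy(steps=200, initial_lr=0.1, decay=0.01, momentum=0.9):
    """Minimise the toy objective f(w) = (w - 3)^2 starting from w = 0."""
    w, v = 0.0, 0.0
    for step in range(steps):
        lr = decayed_lr(initial_lr, decay, step)
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        w, v = sgd_momentum_step(w, v, grad, lr, momentum)
    return w

final_w = train_toy()  # should end up close to the minimum at w = 3
```

Starting from a larger initial rate speeds up early training, and the decay reduces the step size later so the weights can settle.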