
Regularization in Neural Networks

Adding regularization will often help prevent overfitting (the high variance problem).

1. Logistic regression

Recall the optimization objective used during training:

$$J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\left(\hat{y}^{(i)}, y^{(i)}\right)$$

where $w \in \mathbb{R}^{n_x}$ and $b \in \mathbb{R}$.

$L_2$ regularization (most commonly used):

$$J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\left(\hat{y}^{(i)}, y^{(i)}\right) + \frac{\lambda}{2m} \|w\|_2^2$$

where

$$\|w\|_2^2 = \sum_{j=1}^{n_x} w_j^2 = w^T w$$

Why do we regularize only the parameter $w$ and not $b$? Because $w$ is usually a very high-dimensional parameter vector while $b$ is a single scalar; almost all of the parameters are in $w$ rather than $b$, so omitting the $b$ term makes little practical difference.
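As a quick illustration, here is a minimal NumPy sketch of the $L_2$-regularized cost and gradients for logistic regression (the helper name `cost_and_grads_l2` and the array shapes are assumptions for illustration, not from the original notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grads_l2(w, b, X, y, lambd):
    """L2-regularized logistic regression cost and gradients.

    Shapes: X is (n_x, m), y is (1, m), w is (n_x, 1), b is a scalar.
    """
    m = X.shape[1]
    a = sigmoid(np.dot(w.T, X) + b)                         # predictions, shape (1, m)
    cross_entropy = -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))
    l2_penalty = (lambd / (2 * m)) * np.sum(np.square(w))   # (lambda / 2m) * ||w||_2^2
    cost = cross_entropy + l2_penalty

    dz = a - y
    dw = np.dot(X, dz.T) / m + (lambd / m) * w              # only w gets the penalty gradient
    db = np.sum(dz) / m                                     # b is left unregularized
    return cost, dw, db
```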
$L_1$ regularization:

$$J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\left(\hat{y}^{(i)}, y^{(i)}\right) + \frac{\lambda}{2m} \|w\|_1$$

where

$$\|w\|_1 = \sum_{j=1}^{n_x} |w_j|$$

With $L_1$ regularization, $w$ will end up being sparse; in other words, the $w$ vector will have a lot of zeros in it. This can help with compressing the model a little.
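For comparison, only the penalty term and its (sub)gradient change when switching to $L_1$; a tiny sketch (the values of `m`, `lambd`, and `w` below are made up for illustration):

```python
import numpy as np

m, lambd = 100, 0.7                      # example values, not from the notes
w = np.array([[0.5], [-0.2], [0.0]])     # example weight vector

# L1 penalty added to the cost: (lambda / 2m) * ||w||_1
l1_penalty = (lambd / (2 * m)) * np.sum(np.abs(w))

# Its (sub)gradient pushes small weights toward exactly zero,
# which is why L1-regularized w tends to end up sparse.
dw_penalty = (lambd / (2 * m)) * np.sign(w)
```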

2. Neural network “Frobenius norm”

$$J\left(w^{[1]}, b^{[1]}, \dots, w^{[L]}, b^{[L]}\right) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\left(\hat{y}^{(i)}, y^{(i)}\right) + \frac{\lambda}{2m} \sum_{l=1}^{L} \left\|w^{[l]}\right\|_F^2$$

where

$$\left\|w^{[l]}\right\|_F^2 = \sum_{i=1}^{n^{[l]}} \sum_{j=1}^{n^{[l-1]}} \left(w_{ij}^{[l]}\right)^2$$

$L_2$ regularization is also called weight decay: with the penalty term, the gradient becomes $dw^{[l]} = (\text{from backprop}) + \frac{\lambda}{m} w^{[l]}$, so the gradient-descent update

$$w^{[l]} := w^{[l]} - \alpha\, dw^{[l]} = \left(1 - \frac{\alpha \lambda}{m}\right) w^{[l]} - \alpha\, (\text{from backprop})$$

multiplies $w^{[l]}$ by a factor slightly smaller than 1 on every step. This keeps the weights $w$ from growing too large and thus helps prevent overfitting.
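A minimal sketch of how the weight-decay term would enter the gradient-descent update for an $L$-layer network (the dictionary layout `params["W1"]`, `grads["dW1"]`, ... and the function name are assumptions for illustration):

```python
def update_with_weight_decay(params, grads, lambd, alpha, m, L):
    """One gradient-descent step with L2 weight decay.

    params holds "W1".."WL" and "b1".."bL"; grads holds the backprop
    gradients "dW1".."dWL" and "db1".."dbL" (without the penalty term).
    """
    for l in range(1, L + 1):
        # Add the regularization gradient (lambda / m) * W[l] to dW[l].
        dW = grads["dW" + str(l)] + (lambd / m) * params["W" + str(l)]
        db = grads["db" + str(l)]                  # b is not regularized
        # Equivalent to first shrinking W[l] by (1 - alpha * lambd / m),
        # then taking the usual gradient step, hence the name "weight decay".
        params["W" + str(l)] = params["W" + str(l)] - alpha * dW
        params["b" + str(l)] = params["b" + str(l)] - alpha * db
    return params
```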

3. Inverted dropout

For each training example, a different random subset of nodes can be eliminated (zeroed out).
Inverted dropout (dropout must be applied in both the forward and backward passes):
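A minimal NumPy sketch of the inverted-dropout forward step for layer 3, following the standard `a3` / `d3` / `keep_prob` formulation (the array shapes and the 0.8 value are example assumptions):

```python
import numpy as np

keep_prob = 0.8                    # probability of keeping a unit (example value)
a3 = np.random.randn(5, 10)        # stand-in activations for layer 3: (units, examples)

# Dropout mask: 1 with probability keep_prob, 0 otherwise.
d3 = (np.random.rand(a3.shape[0], a3.shape[1]) < keep_prob).astype(float)

a3 = np.multiply(a3, d3)           # zero out the dropped units
a3 /= keep_prob                    # "inverted" step: rescale so the expected value of a3 is unchanged

# During backprop, the same mask d3 is applied to da3,
# followed by the same division by keep_prob.
```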

By dividing by `keep_prob`, this inverted dropout technique ensures that the expected value of $a^{[3]}$ remains the same. This also makes test time easier, because you have less of a scaling problem.
Dropout is not used at test time.