Neural Network Foundations
The core mental model: a neuron computes a weighted sum, a decision boundary separates the classes, and hidden units create features that make harder patterns separable.
A perceptron is a linear classifier
For two inputs, the score is a weighted sum of the inputs plus a bias. The sign of the score decides the class.
score = w0 + w1*x1 + w2*x2
if score > 0: predict +1
else: predict -1
w0 is the bias; it shifts the decision boundary without rotating it. w1 and w2 set the boundary's orientation: the boundary is the line where the score equals 0, and it is perpendicular to the vector (w1, w2).
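The scoring rule above can be sketched in Python. The function name `perceptron_predict` and the example weights are assumptions chosen for illustration; they do not come from the notes.

```python
def perceptron_predict(w0, w1, w2, x1, x2):
    """Return +1 if the weighted score is positive, otherwise -1."""
    score = w0 + w1 * x1 + w2 * x2  # w0 is the bias term
    return 1 if score > 0 else -1


# Illustrative weights only: bias -1, both input weights 1.
print(perceptron_predict(-1, 1, 1, 2, 0))  # score = 1, prints 1
print(perceptron_predict(-1, 1, 1, 0, 0))  # score = -1, prints -1
```

Note that a score of exactly 0 falls into the `else` branch, so this sketch predicts -1 on the boundary, matching the rule as written.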
Only wrong predictions update the weights
if target = +1 and prediction is wrong:
    w = w + learning_rate * input
if target = -1 and prediction is wrong:
    w = w - learning_rate * input
A full pass over the training set with no updates means every training example is classified correctly by the current weights, so training can stop.
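A minimal sketch of this update rule as a training loop, assuming targets of +1 or -1 and a constant input of 1 so the bias is updated like any other weight. The function name, the default `learning_rate`, the `max_epochs` cap, and the toy AND data are all assumptions made for illustration.

```python
def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """samples: list of ((x1, x2), target) pairs with target +1 or -1."""
    w = [0.0, 0.0, 0.0]  # [w0 (bias), w1, w2]
    for _ in range(max_epochs):
        updated = False
        for (x1, x2), target in samples:
            inputs = [1.0, x1, x2]  # constant 1 pairs with the bias w0
            score = sum(wi * xi for wi, xi in zip(w, inputs))
            prediction = 1 if score > 0 else -1
            if prediction != target:
                # target is +1 or -1, so this adds or subtracts
                # learning_rate * input, as in the rule above
                w = [wi + learning_rate * target * xi
                     for wi, xi in zip(w, inputs)]
                updated = True
        if not updated:
            break  # full pass with no updates: all samples are correct
    return w


# Assumed toy data: the AND function with targets in {+1, -1}.
and_samples = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
weights = train_perceptron(and_samples)
print(weights)
```

The `max_epochs` cap is there because the loop only stops on its own when the data is linearly separable.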
OR and AND can be implemented by perceptrons
- OR: all input weights are 1, bias is -0.5, so the score is positive as soon as any binary input is 1.
- AND: all input weights are 1, bias is 0.5 - n for n inputs, so the score is positive only when all n binary inputs are 1.
- CNF idea: one hidden unit per OR clause, then one output unit computes AND across clauses.
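The weights listed above can be checked with a short sketch. It assumes binary 0/1 inputs and units that output 1 when the score is positive and 0 otherwise. The helper names and the example formula (a OR b) AND (b OR c) are hypothetical.

```python
def threshold_unit(weights, bias, inputs):
    """Output 1 if bias + weighted sum is positive, otherwise 0."""
    score = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if score > 0 else 0


def or_gate(inputs):
    # All weights 1, bias -0.5.
    return threshold_unit([1] * len(inputs), -0.5, inputs)


def and_gate(inputs):
    # All weights 1, bias 0.5 - n for n inputs.
    n = len(inputs)
    return threshold_unit([1] * n, 0.5 - n, inputs)


def cnf_example(a, b, c):
    """Hypothetical CNF formula (a OR b) AND (b OR c)."""
    h1 = or_gate([a, b])      # hidden unit for the first clause
    h2 = or_gate([b, c])      # hidden unit for the second clause
    return and_gate([h1, h2])  # output unit: AND across clauses


print(or_gate([0, 0, 1]), and_gate([1, 1, 0]), cnf_example(0, 1, 0))
```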
XOR needs a hidden layer
XOR outputs 1 when two inputs are different and 0 when they are the same. A single straight line cannot separate its positive and negative points.
h1 = x1 OR x2
h2 = x1 AND x2
output = h1 AND NOT h2
The hidden units create useful features. In the feature space (h1, h2) the classes become linearly separable, so a single threshold unit at the output can combine the features.
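One way to realize this construction with threshold units is sketched below. The hidden-unit weights follow the OR and AND recipes for two inputs; the output weights (+1 for h1, -1 for h2, bias -0.5) are an assumed choice that implements "h1 AND NOT h2", not values given in the notes.

```python
def step(score):
    """Hard threshold: 1 for a positive score, otherwise 0."""
    return 1 if score > 0 else 0


def xor_network(x1, x2):
    h1 = step(x1 + x2 - 0.5)    # hidden unit 1: x1 OR x2
    h2 = step(x1 + x2 - 1.5)    # hidden unit 2: x1 AND x2
    return step(h1 - h2 - 0.5)  # output: h1 AND NOT h2 (assumed weights)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_network(x1, x2))
```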
Stacked linear layers collapse into one linear layer
If every activation is linear, then a deep network is still only a linear function of the input. Nonlinear activations are what make depth useful.
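A small numeric check of the collapse, assuming two layers with identity activations and no biases. The matrices `W1`, `W2` and the input `x` are arbitrary made-up values. With biases the same argument holds, except that the combined map also has a single combined bias.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]


W1 = [[1, 2], [0, -1], [3, 1]]  # first layer: 2 inputs -> 3 hidden units
W2 = [[2, 0, 1], [-1, 1, 0]]    # second layer: 3 hidden units -> 2 outputs
x = [4, -2]

two_layers = matvec(W2, matvec(W1, x))  # apply the layers one after another
one_layer = matvec(matmul(W2, W1), x)   # apply the single merged matrix

print(two_layers == one_layer)  # prints True
```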
How to check whether a point is classified correctly
For a learned rule score = -2 + x1 + 2*x2, plug each point into the score. A positive score means the positive class, a negative score means the negative class, and a score of exactly 0 means the point lies on the boundary. The point is classified correctly when the predicted class matches its label.
point (0, 1): score = -2 + 0 + 2 = 0   -> boundary
point (2, 1): score = -2 + 2 + 2 = 2   -> positive
point (1, 0): score = -2 + 1 + 0 = -1  -> negative
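The same check can be written as a short sketch. The function names are hypothetical, and the labels passed to `is_correct` are made up, since the notes do not give true labels for these points.

```python
def score(x1, x2):
    """The learned rule from the text."""
    return -2 + x1 + 2 * x2


def classify(x1, x2):
    s = score(x1, x2)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "boundary"


def is_correct(x1, x2, target):
    """target is +1 or -1; correct when the score has the same sign.

    A point exactly on the boundary (score 0) counts as not correct here,
    which is an assumption of this sketch.
    """
    return target * score(x1, x2) > 0


for point in [(0, 1), (2, 1), (1, 0)]:
    print(point, score(*point), classify(*point))
```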
Nonlinearity is the point of hidden layers
- Step: turns a score into a hard class decision.
- Sigmoid: squashes values into 0 to 1, useful for binary probability outputs.
- tanh: squashes values into -1 to 1, often used in small hidden layers or recurrent models.
- ReLU: returns 0 for negative inputs and the input itself otherwise, i.e. max(0, x).
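The four activations above, sketched with Python's standard `math` module. The step function here outputs 1 or 0; using +1 and -1 instead would be an equally valid convention.

```python
import math


def step(z):
    """Hard decision: 1 for a positive score, otherwise 0."""
    return 1 if z > 0 else 0


def sigmoid(z):
    """Squash z into the open interval (0, 1)."""
    return 1 / (1 + math.exp(-z))


def tanh(z):
    """Squash z into the open interval (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Return 0 for negative inputs, otherwise the input itself."""
    return max(0.0, z)


for z in (-2.0, 0.0, 2.0):
    print(z, step(z), sigmoid(z), tanh(z), relu(z))
```

This simple `sigmoid` can overflow for very large negative inputs; it is meant as an illustration, not a numerically robust implementation.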
Mini exercise for this page
Create a perceptron that outputs 1 only when both binary inputs are 1. What bias should it use for two inputs?
Answer: Use weights 1 and 1, bias -1.5. Only the input (1, 1) gives a positive score: 1 + 1 - 1.5 = 0.5. The other three inputs give -1.5 or -0.5.