What Is the Back-propagation Algorithm?

Natpatsorn THK.

5 min read · Aug 23, 2020

The Back-propagation Algorithm is the key process that goes backward through a Model to adjust its Parameters (Weights and Biases).

Back-propagation starts by computing the Error: the output of the Neural Network is compared with the target we expect. That Error is then propagated back to everything that contributed to it, which here means the weights. A Weight that contributed heavily to the output receives a large adjustment, and a Weight that contributed little receives a small one.

- Forward Propagation

Input Layer

In the figure above, the green Nodes are the Input Data (Xi), which can be a Scalar, a Vector, or a Matrix.

Xi, i ∈ {1, 2}

Output Layer

The Output Layer of the Neural Network is made up of the orange Node and the red Node.

Z = W * X + B

In the implementation, the Weight (W) is represented as a Matrix of size (m, n), where m is the number of Output Nodes (1 Node) and n is the number of Input Nodes (2 Nodes).

W = [w1 w2]

B has size (m, 1):

B = [b]

z can then be computed as follows:

z = [w1*x1 + w2*x2] + [b]
  = [0.2*0.05 + 0.5*0.1] + [0.3]
  = 0.36
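The same calculation can be sketched in NumPy with the shapes described above. This is a minimal illustration that assumes X is stored as a column vector of shape (n, 1); the numbers are the ones from the worked example.

```python
import numpy as np

# Values taken from the worked example in the text
W = np.array([[0.2, 0.5]])      # weights, shape (m, n) = (1, 2)
X = np.array([[0.05], [0.1]])   # inputs as a column vector, shape (n, 1) = (2, 1) (assumed layout)
B = np.array([[0.3]])           # bias, shape (m, 1) = (1, 1)

Z = W @ X + B                   # matrix product, then add the bias
print(Z.shape, Z)
```

Because W is (1, 2) and X is (2, 1), the product is a (1, 1) matrix holding the single value z.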

- Sigmoid Function

The Sigmoid function squashes the value into the range 0 to 1. A function that transforms a value in this way is called an Activation Function.

The final result, the value the Model predicts (y^), is Sigmoid(z).

y^ = Sigmoid(z) = Sigmoid(0.36) = 0.5890
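As a quick check of that number, here is a small sketch of the Sigmoid function using only the standard library. The function name `sigmoid` is simply a convenient label.

```python
import math

def sigmoid(z):
    # 1 / (1 + e^(-z)) maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

y_hat = sigmoid(0.36)
print(f"{y_hat:.4f}")  # 0.5890
```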

- Loss Function

The Loss Function measures the Error: how far the model's prediction y^ is from the true y, averaged over the samples. This Loss is what Back-propagation differentiates to get the Gradient with respect to each Weight.

(y^, y) is the pair that goes into the Loss Function to tell us how wrong the model is. A Loss of 0 means no error at all.

Suppose y is 0.7 and L is MSE. Then L is 0.0123:

L = (y - y^)^2 = (0.7 - 0.589)^2 = 0.0123
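The loss value can be reproduced with a short helper. The name `mse` is an assumption for illustration; with a single sample the mean is just the one squared difference.

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean of the squared differences between targets and predictions
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

loss = mse([0.7], [0.589])
print(f"{loss:.4f}")  # 0.0123
```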

- Back-propagation

We can use Back-propagation to update w1 by taking the derivative of L with respect to w1, that is, the Gradient of Loss(y, y^), or the Error, at w1.

With the Learning Rate set to 0.5, the update for w1 is:

Updated w1 = w1 - Learning_Rate * Error_at_w1
= 0.2 - 0.5*(-0.003)
= 0.2015

The same steps can be repeated for w2.
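The update for w1, and the matching one for w2, can be sketched with the chain rule: dL/dw1 = dL/dy^ * dy^/dz * dz/dw1 = -2(y - y^) * y^(1 - y^) * x1. The code below is an illustrative calculation using the numbers from the worked example. The variable names are assumptions, and it keeps full precision rather than rounding the intermediate values.

```python
import math

# Values from the worked example in the text
x1, x2 = 0.05, 0.1
w1, w2, b = 0.2, 0.5, 0.3
y = 0.7
learning_rate = 0.5

# Forward pass
z = w1 * x1 + w2 * x2 + b
y_hat = 1.0 / (1.0 + math.exp(-z))

# Backward pass (chain rule)
dL_dyhat = -2.0 * (y - y_hat)        # derivative of (y - y_hat)^2 w.r.t. y_hat
dyhat_dz = y_hat * (1.0 - y_hat)     # derivative of the sigmoid w.r.t. z
delta = dL_dyhat * dyhat_dz
grad_w1 = delta * x1                 # dz/dw1 = x1
grad_w2 = delta * x2                 # dz/dw2 = x2

# Gradient-descent step
w1_new = w1 - learning_rate * grad_w1
w2_new = w2 - learning_rate * grad_w2
print(f"grad_w1={grad_w1:.4f}, w1_new={w1_new:.4f}")
print(f"grad_w2={grad_w2:.4f}, w2_new={w2_new:.4f}")
```

At full precision the gradient at w1 is about -0.0027, so the updated w1 comes out near 0.2013. The 0.2015 in the text follows from rounding the gradient to -0.003 first. Since x2 is twice x1, the gradient at w2 is twice as large, which matches the idea that a weight attached to a larger input is adjusted more.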

Implement with NumPy

Define a Neural Network Model without Bias

Define X and y

Create nn1 and print its values

nn1 = NeuralNetwork(X, y)
print(nn1.input.shape)
print(nn1.input)
print(nn1.weights1.shape)
print(nn1.weights1)
print(nn1.weights2.shape)
print(nn1.weights2)

We applied this technique in class ✌️

(Assignment): Try it yourself

Vary the Learning Rate over 0.1, 0.2, …, 1.0, training for 1,000 Epochs at each value
Plot the Loss for all 10 learning_rate values together
Analyze the results

Define the Neural Network Class

import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # x is a sigmoid output (an activation), so the derivative is x * (1 - x)
    return x * (1.0 - x)

class NeuralNetwork:
    def __init__(self, x, y, l):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)  # input -> hidden layer (4 nodes)
        self.weights2 = np.random.rand(4, 1)                    # hidden layer -> output
        self.y = y
        self.output = np.zeros(self.y.shape)
        self.learning_rate = l

    def loss1(self):
        # Sum of squared errors over all samples
        return np.sum((self.y - self.output) ** 2)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        # (y - output) carries the minus sign of the gradient, so the weights
        # are updated with += to move downhill on the loss
        d_output = 2 * (self.y - self.output) * sigmoid_derivative(self.output)
        d_weights2 = np.dot(self.layer1.T, d_output)
        d_weights1 = np.dot(self.input.T, np.dot(d_output, self.weights2.T) * sigmoid_derivative(self.layer1))

        self.weights1 += self.learning_rate * d_weights1
        self.weights2 += self.learning_rate * d_weights2
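One detail in the class is easy to miss: sigmoid_derivative is applied to values that have already passed through sigmoid (self.output and self.layer1), not to the raw pre-activations. The sketch below checks that convention against a central finite difference. The test points in `z` and the step `eps` are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(a):
    # a must already be sigmoid(z), not z itself
    return a * (1.0 - a)

z = np.array([-2.0, 0.0, 0.36, 3.0])   # arbitrary pre-activation values
a = sigmoid(z)

analytic = sigmoid_derivative(a)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)  # central difference

print(np.allclose(analytic, numeric, atol=1e-7))
```

Passing z directly to sigmoid_derivative would give z * (1 - z), which is not the slope of the sigmoid.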

1. Vary the Learning Rate over 0.1, 0.2, …, 1.0, training for 1,000 Epochs at each value

The steps are:

Train Model
Plot Loss
Graph of the loss values

Train Model

→ Train the Model with learning_rate = 0.1

learning_rate = 0.1
nn1 = NeuralNetwork(X, y, learning_rate)
loss1 = []
for i in range(1000):
    nn1.feedforward()
    nn1.backprop()
    loss1.append(nn1.loss1())

Plot Loss

→ Plot the Loss for 0.1, held in df1 and h1

import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline

df1 = pd.DataFrame(loss1, columns=['loss1'])
df1.head()

The resulting trace is stored in the variable h1

import plotly.offline
import plotly.graph_objs as go

plotly.offline.init_notebook_mode(connected=True)

h1 = go.Scatter(y=df1['loss1'],
                mode="lines",
                line=dict(width=2, color='blue'),
                name="loss1")
data = [h1]
layout1 = go.Layout(title='Loss 1',
                    xaxis=dict(title='epochs'),
                    yaxis=dict(title=''))
fig1 = go.Figure(data, layout=layout1)
plotly.offline.iplot(fig1)

→ Train the Model with learning_rate = 0.2

learning_rate = 0.2
nn2 = NeuralNetwork(X, y, learning_rate)
loss2 = []
for i in range(1000):
    nn2.feedforward()
    nn2.backprop()
    loss2.append(nn2.loss1())  # the class only defines loss1()

→ Plot the Loss for 0.2, held in df2 and h2

df2 = pd.DataFrame(loss2, columns=['loss2'])
df2.head()

plotly.offline.init_notebook_mode(connected=True)

h2 = go.Scatter(y=df2['loss2'],
                mode="lines",
                line=dict(width=2, color='red'),
                name="loss2")
data = [h2]
layout1 = go.Layout(title='Loss 2',
                    xaxis=dict(title='epochs'),
                    yaxis=dict(title=''))
fig1 = go.Figure(data, layout=layout1)
plotly.offline.iplot(fig1)

The resulting trace is stored in the variable h2

Repeat the steps above for each remaining value, from 0.1, 0.2, … up to 1.0
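Instead of copying the cell once per value, the sweep can also be sketched as a loop. The code below is a compact, functional rewrite of the same two-layer, bias-free network, not the article's class. The `train` function, its `seed` argument, and the X and y arrays are assumptions for illustration, since the article's own X and y are not shown in the text. A fixed seed is used so every learning rate starts from the same weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train(X, y, learning_rate, epochs=1000, hidden=4, seed=0):
    """Train a two-layer, bias-free network and return the loss per epoch."""
    rng = np.random.default_rng(seed)          # fixed seed: same start for every rate
    w1 = rng.random((X.shape[1], hidden))      # input -> hidden
    w2 = rng.random((hidden, 1))               # hidden -> output
    losses = []
    for _ in range(epochs):
        # Feedforward
        layer1 = sigmoid(X @ w1)
        output = sigmoid(layer1 @ w2)
        losses.append(float(np.sum((y - output) ** 2)))
        # Backprop: both deltas are computed before either weight matrix changes
        delta2 = 2 * (y - output) * output * (1 - output)
        delta1 = (delta2 @ w2.T) * layer1 * (1 - layer1)
        w2 += learning_rate * (layer1.T @ delta2)
        w1 += learning_rate * (X.T @ delta1)
    return losses

# Hypothetical data (not from the article): XOR with a constant third column
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

learning_rates = [round(0.1 * k, 1) for k in range(1, 11)]   # 0.1, 0.2, ..., 1.0
histories = {lr: train(X, y, lr) for lr in learning_rates}

for lr, losses in histories.items():
    print(f"learning_rate={lr}: final loss={losses[-1]:.4f}")
```

Each entry of `histories` is a list of 1,000 loss values, so it can be wrapped in a DataFrame and plotted the same way as loss1 and loss2.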

After working through the values in this way, we arrive at learning rate = 1.0

→ Train the Model with learning_rate = 1.0

→ Plot the Loss for 1.0, held in df10 and h10

The resulting trace is stored in the variable h10

plotly.offline.init_notebook_mode(connected=True)

h10 = go.Scatter(y=df10['loss10'],
                 mode="lines",
                 line=dict(width=2, color='black'),
                 name="loss10")
data = [h10]
layout1 = go.Layout(title='Loss 10',
                    xaxis=dict(title='epochs'),
                    yaxis=dict(title=''))
fig1 = go.Figure(data, layout=layout1)
plotly.offline.iplot(fig1)

2. Plot the Loss for all 10 learning_rate values together

The results above show the loss for each Learning Rate separately,

from 0.1, 0.2, …, 1.0. To see them all in a single Loss plot covering the 10 learning_rate values, we can do the following:

plotly.offline.init_notebook_mode(connected=True)

data = [h1, h2, h3, h4, h5, h6, h7, h8, h9, h10]
layout1 = go.Layout(title='Loss 0.1-1.0',
                    xaxis=dict(title='epochs'),
                    yaxis=dict(title=''))
fig1 = go.Figure(data, layout=layout1)
plotly.offline.iplot(fig1)

The variables defined earlier, h1 through h10, are combined into a single data list

→ data = [h1, h2, h3, h4, h5, h6, h7, h8, h9, h10]

As shown in the image below 👇

The result comes out as intended … 👉

→ 3. Analyze the results 😃

From the plot of the loss for these 10 learning_rate values,

we can conclude that as learning_rate gets larger, the number of epochs needed for the loss to come down falls accordingly, as the graph shows. Conversely, the smaller the learning_rate, the more epochs are needed.

That is the end of this topic.

Apologies for any mistakes.

That's all for today. See you again next week. Thank you 🙂

References 🙏

Implement the Back-propagation Algorithm from Scratch with NumPy

Article by Dr. Nuttachot Promrit (อ.ดร.ณัฐโชติ พรหมฤทธิ์), Department of Computing [https://www.cp.su.ac.th], Faculty of Science, Silpakorn University…

blog.pjjop.org