2017-04-11

Machine Learning (1)

Supervised

Given a labeled data set (each input comes with its correct output)

Classification: predict a discrete category

Regression: predict a continuous value

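Unsupervised

The data carries no labels, so every example looks alike; the algorithm has to find rules or groupings from the data alone

Cocktail party problem

Suppose a party is full of people, and 2 of them speak, each with 1 microphone

Both microphones pick up both voices

How can the 2 voices, or a voice and the background music, be separated?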

[W,s,v] = svd(( repmat(sum(x.*x,1),size(x,1),1).*x)*x');

SVD = singular value decomposition

Cocktail Party

Practice ML with Octave (free) or Matlab

Build a prototype in Octave first, then implement it in another language

because Octave already has many algorithm functions built in

Other

3 element vector or a 3 dimensional vector

      element 1
A = [ element 2 ]
      element 3

Exercise

Model

Regression problem = predict a real-valued output from previously collected data

Training set = the previously collected data

Area (x)   Price (y)
2104       460
1416       232
1534       315
852        178
...        ...

m = number of training examples; with 47 examples, m = 47
x = input  = input features
y = output = target variable

(x,y)                     = one training example
(x^(i) , y^(i))           = the i-th training example
(x^(i) , y^(i)) i=1,...,m = the training set


Ex:
x^(2) = 1416
y^(3) = 315
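
The notation can be checked with a short Python sketch. The names `training_set`, `x` and `y` are only for illustration, and only the four rows shown in the table are included (the rows elided by `...` are not reproduced). Python lists count from 0 while the notes count from 1, so the helpers shift the index.

```python
# The four rows shown in the table: (area x, price y).
# The "..." rows of the table are not reproduced here.
training_set = [(2104, 460), (1416, 232), (1534, 315), (852, 178)]

m = len(training_set)  # number of training examples in this excerpt


def x(i):
    """Return x^(i), using the 1-based index from the notes."""
    return training_set[i - 1][0]


def y(i):
    """Return y^(i), using the 1-based index from the notes."""
    return training_set[i - 1][1]


print(m, x(2), y(3))
```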

Flow: Training Set -> Learning Algorithm -> h (hypothesis), a function

Area (x) -> h -> Price (y)

h maps from x to y

h can be written as

hθ(x) = θ0 + θ1x (a linear function)
= linear regression model
= predicts the price from x (the variable)
= univariate linear regression
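
As a minimal sketch, the hypothesis is a one-line Python function. The parameter values in the call are hypothetical and chosen only to show how it is used; they are not fitted to the table.

```python
def h(theta0, theta1, x):
    """Univariate linear hypothesis: h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


# Hypothetical parameters, not learned from any data.
prediction = h(50.0, 0.25, 1000)
print(prediction)
```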

Cost Function

The cost function used here is also called the squared error function

Training Set:
Area (x)   Price (y)
2104       460
1416       232
1534       315
852        178
...        ...

Hypothesis: hθ(x) = θ0 + θ1x
The function used to make predictions

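θi = parameters of the model
The question is how to choose θ0 and θ1

Different values of θ0 and θ1 give different lines (θ0 is the intercept, θ1 is the slope)
So for a given training set, we want the θ0 and θ1
whose line fits the training examples most closely


Following this idea, we set up a rule:

1. Minimize over θ0 and θ1
2. Make (hθ(x) - y)^2 small (the smaller it is, the more accurate the prediction)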


Hypothesis:
hθ(x) = θ0 + θ1x

Parameters:
θ0 , θ1

Cost Function:
J(θ0,θ1) = (1/(2m)) ∑_{i=1}^{m} ( hθ(x^(i)) - y^(i) )^2

Goal: minimize J(θ0,θ1) over θ0, θ1
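
The cost function can be sketched directly in Python. The data below is a hypothetical toy set (three points on the line y = x), not the housing table, so the results are easy to verify by hand.

```python
def h(theta0, theta1, x):
    """Hypothesis: h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared error cost: J = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    total = sum((h(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Hypothetical toy data: points that lie exactly on y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

print(cost(0.0, 1.0, xs, ys))  # the line y = x passes through every point
print(cost(0.0, 0.0, xs, ys))  # the line y = 0 misses every point
```

A line through every point gives a cost of zero; the flat line y = 0 gives (1 + 4 + 9) / 6.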
        
   

Simplified: set θ0 = 0, so hθ(x) = θ1x

J(θ1) = (1/(2m)) ∑_{i=1}^{m} ( hθ(x^(i)) - y^(i) )^2

Goal: minimize J(θ1) over θ1
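
To see what minimizing J(θ1) means, the sketch below evaluates the simplified cost for a handful of candidate θ1 values and keeps the smallest. The toy data and the candidate list are assumptions made for illustration.

```python
def cost_simplified(theta1, xs, ys):
    """J(theta1) with theta0 fixed at 0, so h(x) = theta1 * x."""
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Hypothetical toy data: points that lie exactly on y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

# Hypothetical candidate slopes to compare.
candidates = [0.0, 0.5, 1.0, 1.5, 2.0]
costs = {t: cost_simplified(t, xs, ys) for t in candidates}
best_theta1 = min(costs, key=costs.get)

for t in candidates:
    print(t, costs[t])
print("best theta1:", best_theta1)
```

For this data the cost is bowl-shaped in θ1: it is lowest at θ1 = 1 and rises symmetrically on either side.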

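In short, we want to find the θ0 and θ1 that make J(θ0,θ1) as small as possible

One way to do this is the gradient descent algorithm

It repeatedly adjusts θ0 and θ1 to move toward the minimum of J(θ0,θ1)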
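
The notes do not spell out the update rule, so the following is only a sketch using the standard batch gradient descent update for the squared error cost. The learning rate `alpha`, the iteration count, the zero starting point and the toy data (points on y = 1 + 2x) are all assumptions.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit theta0, theta1 by batch gradient descent on the squared error cost."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # assumed starting point
    for _ in range(iterations):
        # Prediction error for every training example.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both parameters simultaneously.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical toy data: points that lie exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 4), round(theta1, 4))
```

With inputs as large as the areas in the table, a learning rate of 0.1 would make the updates diverge; a much smaller rate or rescaled inputs would be needed.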

JasonChiuCC
