
[cs231n] 2. Image Classification - NN, KNN, Linear Classification

yeon42 2022. 1. 10. 15:18

Image Classification

A computer does not perceive an image the way a person does; it only sees numbers, i.e., the pixel values!

To classify a wide variety of images, we need an algorithm that is robust to the many variations an object can undergo (viewpoint, illumination, deformation, occlusion, etc.).
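As a minimal illustration (the array below is randomly generated and stands in for a real photo), a 32x32 RGB image is just a 32x32x3 grid of integers in the range 0-255:

import numpy as np

# a stand-in "image": 32x32 pixels, 3 color channels (R, G, B), values 0-255
img = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
print(img.shape)   # (32, 32, 3): height, width, channels
print(img[0, 0])   # the three numbers the computer sees for the top-left pixel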

 

 

Data-driven Image Classification

The traditional rule-based approach could not be extended to many different classes.

ex) a cat 'has' ears, eyes, and a nose -> use these rules to decide that the image shows a cat

    but cats look different depending on the breed, so fixed rules break down.

 

 

Instead of writing the rules by hand, build a model from data and let it solve the problem!

Source: https://3months.tistory.com/512

1. Collect a (large) dataset of images and labels

2. Use Machine Learning to train a classifier

-> build a model that can recognize a variety of objects

3. Evaluate the classifier on new images
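These three steps imply a common interface that every classifier in this post follows; a rough sketch (the class and method names are illustrative, not from the lecture code):

class Classifier:
    def train(self, images, labels):
        """ Step 2: fit a model to the collected images and labels. """
        raise NotImplementedError

    def predict(self, images):
        """ Step 3: return predicted labels for new images. """
        raise NotImplementedError

The NearestNeighbor class further down follows exactly this train/predict pattern.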

 

 

 

NN (Nearest Neighbor)

- Store the training data; at prediction time, output the label of the training example that is closest to the input image.

1. memorize all data and labels

2. predict a new image's label by finding the most similar training image

 

In the lecture figure, the training images on the right are sorted by how similar they are to each test image.

 

 

 

* So how do we measure how close (distance) two images are?

 

(+) The distance is computed from the pixel values (RGB) at the same positions in the two images.
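Concretely, the L1 (Manhattan) distance used in the code below sums the absolute pixel-wise differences: d1(I1, I2) = sum over all pixel positions p of |I1[p] - I2[p]|. A tiny worked example (the pixel values are made up):

import numpy as np

I1 = np.array([56, 32, 10, 18])   # four pixel values from image 1 (made up)
I2 = np.array([10, 20, 24, 17])   # the corresponding pixels from image 2
d1 = np.sum(np.abs(I1 - I2))      # 46 + 12 + 14 + 1 = 73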

 

# L1 distance example (NN classifier)
import numpy as np

class NearestNeighbor:
    def __init__(self):
        pass
    
    # training just memorizes the data (simple)
    def train(self, X, y):
        """ X is N x D where each row is an example.
            y is 1-dimensional of size N """
        # the nearest neighbor classifier simply remembers all the training data
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        """ X is N x D where each row is an example we wish to predict the label for """
        num_test = X.shape[0]
        # let's make sure that the output type matches the input type
        Ypred = np.zeros(num_test, dtype=self.ytr.dtype)
        
        # loop over all test rows
        for i in range(num_test):
            # find the nearest training image to the i'th test image
            # using the L1 distance (sum of absolute value differences)
            distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
            min_index = np.argmin(distances)  # get the index with the smallest distance
            Ypred[i] = self.ytr[min_index]    # predict the label of the nearest example
            
        return Ypred
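A minimal usage sketch with randomly generated stand-in data (the shapes mimic small batches of flattened 32x32x3 images; the real CIFAR-10 loading code is not part of this post):

import numpy as np

Xtr = np.random.randint(0, 256, size=(1000, 3072))   # 1,000 "training images", 32*32*3 pixels each
ytr = np.random.randint(0, 10, size=1000)            # one of 10 class labels per image
Xte = np.random.randint(0, 256, size=(100, 3072))    # 100 "test images"
yte = np.random.randint(0, 10, size=100)

nn = NearestNeighbor()
nn.train(Xtr, ytr)             # just memorizes the training set
yte_pred = nn.predict(Xte)     # L1 nearest-neighbor prediction (slow: compares against every training image)
print('accuracy:', np.mean(yte_pred == yte))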

 

 

KNN (K-nearest neighbor)

- NN bases its prediction on a single nearest example (a single label), so one noisy neighbor can decide the outcome and performance suffers.

At test (prediction) time, find the k training examples closest to the input and predict the label that appears most often among them.

(voting: taking the most frequent label among these several neighbors as the prediction result; see the sketch below)
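A minimal sketch of the k-NN prediction step, reusing the L1 distance from the NN code above; the function name and the default k=5 are illustrative choices, and the labels are assumed to be non-negative integers:

import numpy as np

def knn_predict(Xtr, ytr, X, k=5):
    """ Predict a label for each row of X by majority vote among
        the k nearest training examples under the L1 distance. """
    num_test = X.shape[0]
    ypred = np.zeros(num_test, dtype=ytr.dtype)
    for i in range(num_test):
        distances = np.sum(np.abs(Xtr - X[i, :]), axis=1)   # L1 distance to every training image
        nearest = np.argsort(distances)[:k]                 # indices of the k closest training images
        votes = np.bincount(ytr[nearest])                   # count how often each label appears among them
        ypred[i] = np.argmax(votes)                         # the most frequent label wins the vote
    return ypred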

 

KNN is less sensitive to outliers than NN.

It generalizes better to unseen data!

(the goal of machine learning is to predict correctly on unseen data)

 

 

 

 

* How do we find the best model?

Tune the hyperparameters!

For KNN these are k and the distance function (e.g., L1 vs. L2).

 

- Try many settings and keep the one that works best

but the test set cannot be used to search for hyperparameters

- the test set should be used only for the final performance evaluation!

- split off part of the training set to create a validation set for hyperparameter tuning (see the sketch below).
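A minimal sketch of picking k on such a validation split, reusing the knn_predict function from above; the data here is random stand-in data and the candidate k values are arbitrary illustrative choices:

import numpy as np

# stand-in training data (in practice this is the real, labeled training set)
Xtr = np.random.randint(0, 256, size=(1000, 3072))
ytr = np.random.randint(0, 10, size=1000)

# hold out part of the training set as a validation set; the test set stays untouched
num_val = 200
Xval, yval = Xtr[:num_val], ytr[:num_val]
Xtr_sub, ytr_sub = Xtr[num_val:], ytr[num_val:]

best_k, best_acc = 1, 0.0
for k in [1, 3, 5, 10, 20]:
    yval_pred = knn_predict(Xtr_sub, ytr_sub, Xval, k=k)   # k-NN sketch from the previous section
    acc = np.mean(yval_pred == yval)
    if acc > best_acc:
        best_k, best_acc = k, acc
# best_k is then fixed, and only at the very end is the classifier evaluated once on the test set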

 

- NN tends not to perform well on image classification.

Why?

1. prediction is very slow

2. pixel-wise differences do not capture the perceptual similarity (or difference) between images well

3. Curse of dimensionality

    - for the algorithm to work well, the training data has to cover the input space densely, and the amount of data required grows exponentially with the dimension;

      in practice collecting that much data is impossible, so coverage stays sparse and performance drops sharply

 

 

 

 

Linear Classifier

A somewhat more powerful approach to Image Classification than KNN

 

Parametric Approach

 

f(x, W) = Wx + b

- inputs to f: x and W

- x: the input data (the image)

- W: the parameters (weights)

- b (bias): an offset that helps the classifier, e.g., when some classes appear more often than others

   - it exists independently of the data -> it is added to the scores without interacting with the input

 

 

The simplest way to combine them is to multiply x by W

- a matrix multiplication (dot product of each row of W with x); see the sketch below
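A minimal sketch of the score computation f(x, W) = Wx + b with CIFAR-10-style shapes (10 classes, 32*32*3 = 3072 input values); the random W and b stand in for parameters that would normally be learned:

import numpy as np

x = np.random.randint(0, 256, size=3072).astype(np.float64)   # one flattened 32x32x3 image (stand-in)
W = 0.0001 * np.random.randn(10, 3072)                        # weights: one row of 3072 values per class
b = np.random.randn(10)                                       # bias: one value per class, independent of x

scores = W.dot(x) + b            # f(x, W) = Wx + b -> 10 class scores
pred_class = np.argmax(scores)   # the class with the highest score is the prediction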

 

 

From here on, we will keep improving the values of W (the parameters) to build a classifier with good performance.

 

 

 

 

 

Reference