[์ฐ๋จน Data Science] 1. Math, Numpy
Let's implement simple math formulas with Numpy 2021-06-30

Data Science And Math

Hello, this is Justkode. Much of Machine Learning and Deep Learning rests on a foundation of statistics, so we cannot understand them while ignoring the math. For that reason, we will take some time to review the probability and statistics from high school alongside the linear algebra and probability/random variable theory taught in university. And rather than stopping at review, we will also implement the operations in Numpy as we go.

Gaussian Distribution (Normal Distribution)

Gaussian Distribution์€ ๊ฐ€์žฅ ๋งŽ์ด ์‚ฌ์šฉํ•˜๋Š” ๋ถ„ํฌ ๊ฐœ๋…์œผ๋กœ, ์‹คํ—˜์˜ ์ธก์ • ์˜ค์ฐจ๋‚˜ ์‚ฌํšŒ ํ˜„์ƒ ๋“ฑ ์ž์—ฐ๊ณ„์˜ ํ˜„์ƒ์€ ๋Œ€๋ถ€๋ถ„ Gaussian Distribution์„ ๋”ฐ๋ฅด๋Š” ๊ฒฝํ–ฅ์ด ์žˆ์Šต๋‹ˆ๋‹ค.

P(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2 \sigma^2}}
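As a quick sanity check on the formula, here is a minimal sketch (the helper name gaussian_pdf is my own) that implements P(x) directly and confirms the density integrates to roughly 1:

import numpy as np

def gaussian_pdf(x, mu=0.0, sigma=1.0):  # P(x) from the formula above
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

xs = np.arange(-10, 10, .01)
print(gaussian_pdf(xs).sum() * .01)  # Riemann sum over the density: ~1.0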

Gaussian Distribution์„ ๋งŒ๋“œ๋Š” ๋ฐฉ๋ฒ•์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

  • Standardizing the data (z-score)
import numpy as np
import matplotlib.pyplot as plt

arr = np.array([1, 2, 3, 4, 5])  # convert a Python list into an ndarray
mean = arr.mean()  # mean
std = arr.std()  # standard deviation
vfunc = np.vectorize(lambda x: (x - mean) / std)  # np.vectorize turns the lambda into a function that maps over an array
new_arr = vfunc(arr)  # the standardized array: array([-1.41421356, -0.70710678,  0.,  0.70710678,  1.41421356])
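A side note on the design: np.vectorize is just a convenience wrapper around a Python loop. With NumPy broadcasting, the same standardization is a one-liner, and typically faster:

new_arr = (arr - mean) / std  # broadcasting subtracts the mean and divides by the std elementwise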
  • Plotting the curve
mean = 0
std = 1
variance = np.square(std)
x = np.arange(-3, 3, .01)  # x-axis data: -3 to 3 (exclusive) in steps of 0.01, i.e. [-3, -2.99, ..., 2.99]
f = np.exp(-np.square(x - mean) / (2 * variance)) / np.sqrt(2 * np.pi * variance)  # y-axis data; note the parentheses around 2 * variance

plt.plot(x, f)  # draw the curve from the x and y data
plt.show()

The Gaussian curve.

Multivariate Gaussian Distribution

The Multivariate Gaussian Distribution extends the normal distribution to a multi-dimensional space, and its density is as follows.

P(x) = \frac{1}{(2\pi)^{d/2} \left| \Sigma \right|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right)

where x and \mu are d-dimensional vectors (\mu the mean vector) and \Sigma is the covariance matrix.

(Figure: a multivariate Gaussian plot.)
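To connect the formula to code, here is a minimal sketch (the mean and covariance values are arbitrary assumptions of mine) that evaluates the 2-D density on a grid with NumPy and draws its contours:

import numpy as np
import matplotlib.pyplot as plt

mu = np.array([0.0, 0.0])  # mean vector (assumed)
sigma = np.array([[1.0, 0.0], [0.0, 3.0]])  # covariance matrix (assumed values)

xs, ys = np.meshgrid(np.arange(-3, 3, .01), np.arange(-3, 3, .01))
pos = np.stack([xs - mu[0], ys - mu[1]], axis=-1)  # (x - mu) at every grid point

inv = np.linalg.inv(sigma)
det = np.linalg.det(sigma)
quad = np.einsum('...i,ij,...j->...', pos, inv, pos)  # (x - mu)^T Sigma^-1 (x - mu), pointwise
density = np.exp(-quad / 2) / (2 * np.pi * np.sqrt(det))  # for d = 2, (2 pi)^(d/2) = 2 pi

plt.contourf(xs, ys, density)
plt.show()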

Eigenvalue, Eigenvector, Covariance Matrix

Eigenvalues and eigenvectors, core concepts in linear algebra, satisfy the following relationship.

์ž„์˜์˜ nร—nn \times n ํ–‰๋ ฌ AA ์— ๋Œ€ํ•˜์—ฌ, ์˜๋ฒกํ„ฐ๊ฐ€ ์•„๋‹Œ ๋ฒกํ„ฐ xโƒ—\vec{x} ๊ฐ€ ์กด์žฌ ํ•œ๋‹ค๋ฉด, ์ˆซ์ž ฮป\lambda ๋Š” ํ–‰๋ ฌ AA์˜ ๊ณ ์œณ๊ฐ’์ด๋ผ๊ณ  ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

A\vec{x} = \lambda\vec{x}

Here, \vec{x} is the eigenvector corresponding to the eigenvalue \lambda.

By the rules of matrix algebra, this can be rewritten as follows; a nonzero \vec{x} exists only when the determinant vanishes.

(A - \lambda I)\vec{x} = 0, \qquad \det(A - \lambda I) = 0
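As a concrete check (using the same 2 \times 2 matrix as the NumPy example below), solving the characteristic equation by hand gives:

\det \begin{pmatrix} 1 - \lambda & 2 \\ 3 & 4 - \lambda \end{pmatrix} = \lambda^2 - 5\lambda - 2 = 0 \quad \Rightarrow \quad \lambda = \frac{5 \pm \sqrt{33}}{2} \approx 5.3723, \ -0.3723

which matches what np.linalg.eig prints below.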

Eigenvalue, Eigenvector์˜ ๊ด€๊ณ„

Here, the Eigenvalue tells us "by how much the magnitude changed," and the Eigenvector tells us "which vectors remain parallel to themselves when the matrix is applied." For this reason, the Eigenvectors with the largest Eigenvalues of the Covariance Matrix, which records the covariance between every pair of features (how strongly one feature varies with another), are used by Principal Component Analysis (PCA) to reduce the number of dimensions before training.

์—ฌ๊ธฐ์„œ ๋งํ•˜๋Š” Covariance Matrix์— ๋Œ€ํ•ด ์ถ”๊ฐ€ ์„ค๋ช…์„ ๋ง๋ถ™์ด์ž๋ฉด, ๋Œ€๊ฐ ์„ฑ๋ถ„์€ feature์˜ ๋ถ„์‚ฐ์„ ๋‚˜ํƒ€๋‚ด๊ณ , ๋‚˜๋จธ์ง€๋Š” ๊ฐ ํ–‰๊ณผ ์—ด์— ๋Œ€์‘ํ•˜๋Š” Covariance (๊ฐ feature์˜ ์„ ํ˜• ์ข…์†์„ฑ) ๋ฅผ ๋‚˜ํƒ€ ๋ƒ…๋‹ˆ๋‹ค.

(Figure: the covariance matrix.)
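A small sketch of how to read np.cov's output (the heights and weights below are made-up values of mine):

heights = np.array([1.6, 1.7, 1.8, 1.9])  # hypothetical feature 1
weights = np.array([55., 62., 70., 80.])  # hypothetical feature 2
cov = np.cov(heights, weights)  # 2x2 covariance matrix
print(cov[0, 0])  # variance of heights (a diagonal entry)
print(cov[0, 1])  # covariance(heights, weights); equals cov[1, 0]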

Eigenvalue, Eigenvector

array = np.array([[1, 2], [3, 4]])
eigenvalues, eigenvectors = np.linalg.eig(array)  # eigenvalues and eigenvectors of the matrix

print("eigenvalue: ", eigenvalues)
print("eigenvector: ", eigenvectors)  # each *column* is one eigenvector
print("eigenvalue * eigenvector: ", eigenvalues * eigenvectors)  # broadcasting scales column i by eigenvalue i
print("array * eigenvector: ", array @ eigenvectors)  # identical: A x = lambda x holds column by column
eigenvalue:  [-0.37228132  5.37228132]
eigenvector:  [[-0.82456484 -0.41597356]
 [ 0.56576746 -0.90937671]]
eigenvalue * eigenvector:  [[ 0.30697009 -2.23472698]
 [-0.21062466 -4.88542751]]
array * eigenvector:  [[ 0.30697009 -2.23472698]
 [-0.21062466 -4.88542751]]
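Continuing with the variables above, a one-line sanity check (a sketch using np.allclose) confirms that A\vec{x} = \lambda\vec{x} holds for every column:

print(np.allclose(array @ eigenvectors, eigenvalues * eigenvectors))  # True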

Eigenvalue, Eigenvector, PCA

array = np.random.multivariate_normal([0, 0], [[1, 0], [0, 100]], 1000)  # arguments: (mean, covariance matrix, number of samples)
cov = np.cov(array.T)  # covariance matrix: entry (x, y) measures how feature y varies as feature x varies
print(cov)
eigenvalues, eigenvectors = np.linalg.eig(cov)  # eigenvalues and eigenvectors of the covariance matrix

print("eigenvalue: ", eigenvalues)
print("eigenvector: ", eigenvectors)

plt.scatter(array[:, 0], array[:, 1])
plt.quiver([0, 0], [0, 0], eigenvectors[0, :], eigenvectors[1, :], color=['r', 'b'], scale=3)  # draw each eigenvector *column* as an arrow from the origin
plt.xlim(-50, 50)
plt.ylim(-50, 50)
plt.show()
[[ 1.02530381  0.23267569]
 [ 0.23267569 97.82469968]]
eigenvalue:  [ 1.02474454 97.82525895]
eigenvector:  [[-0.99999711 -0.00240367]
 [ 0.00240367 -0.99999711]]

์ฃผ์„ฑ๋ถ„์€ ํŒŒ๋ž€์ƒ‰ ๋ฒกํ„ฐ์— ๊ฐ€๊นŒ์šด๊ฑธ, ์ˆ˜์น˜์ ์œผ๋กœ, ์‹œ๊ฐ์ ์œผ๋กœ ์•Œ ์ˆ˜ ์žˆ๋‹ค.

Differentiation on a Computer

Machine Learning์—์„œ ๋ฏธ๋ถ„์€ ์–ด๋–ค ๊ฒƒ์„ ์˜๋ฏธ ํ• ๊นŒ์š”? ์ด๋Š” Cost Function์— ๋Œ€ํ•œ Gradient descent (๊ฒฝ์‚ฌ ํ•˜๊ฐ•๋ฒ•)์„ ์‚ฌ์šฉ ํ•  ๋•Œ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ์ด๋Š” ์ปดํ“จํ„ฐ์—์„œ ๋ฏธ๋ถ„์€ ์ˆ˜์‹์„ ์‚ฌ์šฉํ•˜๋Š” ํ•ด์„์  ๋ฐฉ๋ฒ•์ด ์•„๋‹Œ, ์ง์ ‘ ๊ฐ’์„ ๊ณ„์‚ฐํ•˜์—ฌ ์ง„ํ–‰ํ•˜๋Š” ์ˆ˜์น˜์  ๋ฏธ๋ถ„์„ ์ง„ํ–‰ ํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฅผ ๊ตฌํ˜„ํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

def diff(f, index, *args):  # partial derivative with respect to the index-th parameter
    delta = 1e-4  # h
    new_args = list(args)
    new_args[index] += delta  # add delta to the index-th parameter
    return (f(*new_args) - f(*args)) / delta  # (f(x + h, y) - f(x, y)) / h

diff(lambda x, y: x ** 2 + y ** 2, 0, 3, 3)  # f(x, y) = x^2 + y^2; exact df/dx at (3, 3) is 6
6.000099999994291
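The result overshoots the exact derivative (6) slightly because we used a one-sided forward difference. A central-difference variant, sketched below with a hypothetical helper diff_central, typically shrinks the error:

def diff_central(f, index, *args):  # same idea, but sample on both sides of the point
    delta = 1e-4
    plus, minus = list(args), list(args)
    plus[index] += delta
    minus[index] -= delta
    return (f(*plus) - f(*minus)) / (2 * delta)  # (f(x + h) - f(x - h)) / 2h

print(diff_central(lambda x, y: x ** 2 + y ** 2, 0, 3, 3))  # ~6.0; error is O(h^2) instead of O(h)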

We can visualize what diff computes by drawing the tangent line it produces:

# the original curve
x = np.arange(-3, 3, .01)
y = x ** 2

# tangent line at x = 1
d = diff(lambda x: x ** 2, 0, 1)  # numerical slope at x = 1 (~2)
x_tangent = np.arange(0, 2, .01)
y_tangent = (x_tangent - 1) * d + 1  # line through (1, 1) with slope d

# draw both
plt.plot(x, y)
plt.plot(x_tangent, y_tangent, color='r')
plt.show()

The curve and its tangent line, drawn as a graph.

Computational Graph

What do we need in order to propagate the derivative with respect to each parameter when applying Gradient Descent to a Cost Function? This is where the computational graph comes in.

When a particular parameter changes, how much does the result move?

The multiplication node, the addition node, and their forward pass are implemented as follows.

class MulLayer:  # multiplication node
    def forward(self, x, y):
        self.x = x  # remember the inputs; the backward pass needs them
        self.y = y
        out = x * y
        
        return out
    
    def backward(self, dout):
        dx = dout * self.y  # chain rule: d(xy)/dx = y
        dy = dout * self.x  # chain rule: d(xy)/dy = x
        
        return dx, dy


class AddLayer:  # addition node
    def forward(self, x, y):
        out = x + y
        return out
    
    def backward(self, dout):
        dx = dout * 1  # d(x + y)/dx = 1, so the gradient passes through unchanged
        dy = dout * 1
        return dx, dy

chicken = 17000
num = 3
delivary_fee = 3000

mul_chicken = MulLayer()
add_fee = AddLayer()

chicken_price = mul_chicken.forward(chicken, num)
total_price = add_fee.forward(chicken_price, delivary_fee)
print(total_price)
54000

The backward pass works as follows.

d_price = 1
d_chicken_price, d_delivary_fee = add_fee.backward(d_price)
d_chicken, d_num = mul_chicken.backward(d_chicken_price)

print("d_chicken_price: ", d_chicken_price)
print("d_delivary_fee: ", d_delivary_fee)
print("d_chicken: ", d_chicken)
print("d_num: ", d_num)
d_chicken_price:  1
d_delivary_fee:  1
d_chicken:  3
d_num:  17000
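Reading these gradients: d_chicken = 3 says the total rises by 3 won for every 1-won increase in the chicken price (since we ordered 3), and d_num = 17000 says one extra chicken adds 17000 won, exactly what the chain rule through the two nodes predicts.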

Wrapping Up

์˜ค๋Š˜์€ ์ด๋ ‡๊ฒŒ ๋”ฅ๋Ÿฌ๋‹์—์„œ ์‚ฌ์šฉ ๋˜๋Š” ์ˆ˜ํ•™ ๊ณต์‹๋“ค์„ numpy๋ฅผ ์ด์šฉ ํ•˜์—ฌ ๊ตฌํ•ด ๋ณด์•˜์Šต๋‹ˆ๋‹ค. ๋‹ค์Œ ์‹œ๊ฐ„์—๋Š” Pandas์˜ ์‚ฌ์šฉ ๋ฒ•์— ๋Œ€ํ•ด ์•Œ์•„๋ณด๋„๋ก ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.