1-5 Sklearn SVC (Support Vector Classification) XOR Example with TensorFlow Coding

coding art 2020. 1. 9. 18:04

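The XOR logic problem is a classic. In their 1969 book Perceptrons, Marvin Minsky of the MIT AI Lab and his co-author Seymour Papert pointed out that Rosenblatt's perceptron cannot learn it with a single layer, which already hinted at the emergence of multi-layer neural networks. Today everyone knows and uses the backpropagation algorithm, but at the time the concept was not available, so a training algorithm based on gradient descent starting from random weights was hard even to imagine.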

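The simplest XOR problem, as in the figure above, uses four points in the plane, with diagonally opposite points assigned to the same class. The two classes cannot be linearly separated, that is, no single hyperplane (here, a straight line) divides them, and the problem was raised as a dilemma for that reason.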
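The claim that no single line separates the XOR points can be checked numerically. The sketch below is not from the original post: it brute-forces a coarse grid of hypothetical line parameters (`w1`, `w2`, `b`) over the idealized points (0,0), (0,1), (1,0), (1,1) and reports the largest number of points that any one line classifies correctly.

```python
import numpy as np

# Idealized XOR points and labels (an assumption; the post shifts them slightly)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0, 1, 1, 0])

# Coarse grid of candidate lines w1*x1 + w2*x2 + b = 0
grid = np.linspace(-2.0, 2.0, 41)
w1, w2, b = np.meshgrid(grid, grid, grid, indexing="ij")

# Score of every candidate line at every point: shape (41, 41, 41, 4)
scores = w1[..., None] * X[:, 0] + w2[..., None] * X[:, 1] + b[..., None]
pred = (scores > 0).astype(int)

# Number of correctly classified points for each candidate line
correct = (pred == y).sum(axis=-1)
best_correct = int(correct.max())
print("best number of correct points with one line:", best_correct)
```

No candidate line gets all four points right. A grid search is only an illustration, but the same conclusion follows algebraically from the four inequalities a separating line would have to satisfy.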

This nonlinear XOR problem can be solved with SVC from the Sklearn library, which implements an SVM (Support Vector Machine) with a kernel.

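The Support Vector Machine algorithm behind SVC originates with the Russian mathematician Vladimir Vapnik (born 1936), who developed it in 1963. After moving to the United States around 1990 he worked at Bell Labs, where LeCun also worked, and more recently he moved to Facebook's AI lab alongside LeCun. He is a legendary figure in machine learning. LeCun is the better-known public face of AI, but SVC, although tricky to use, has long been regarded as one of the most accurate classification methods, and SVM classifiers were among the methods that pushed accuracy on the well-known MNIST handwritten-digit benchmark above 99%.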

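So we first obtain the SVC training result for the simple XOR problem in the figure above. To make the example more approachable, we also code it as a TensorFlow classification problem, applying the nonlinear polynomial technique found in chapters 3 and 4 of the author's book “파이선 코딩 초보자를 위한 텐서플로우 OpenCV 머신러닝” (roughly, "TensorFlow and OpenCV Machine Learning for Python Coding Beginners"), and compare the two results.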

After the TensorFlow version is trained, the plot area is turned into a meshgrid and colored from the trained model, as in the Sklearn example. To do this, the body of the plot_decision_regions routine was modified directly and the result was named tf_plot_decision_regions.

Import the library modules at the top of the file (see the full listing at the end of this post).

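The input data X_xor for the TensorFlow XOR problem is set up as four points, with the coordinates nudged slightly (to -0.01 instead of 0) so that one point lies in each of quadrants 1 through 4.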

The SVC result was obtained with the parameters γ=1.0 and C=100.0. Depending on these settings, the SVC decision boundary can take a fairly complicated shape, whereas the polynomial technique classifies with two straight lines.
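As a quick check of the SVC settings quoted above, the following minimal sketch fits an RBF-kernel SVC to the same four points used in the listing at the end of this post and prints its predictions. It leaves out the plotting and is only meant to show the fit/predict calls.

```python
import numpy as np
from sklearn.svm import SVC

# The four XOR points and labels used in this post
X_xor = np.array([[1., -0.01], [-0.01, -0.01], [1.0, 1.0], [-0.01, 1.0]])
y_xor = np.array([1, 0, 0, 1])

# RBF-kernel SVC with the parameters quoted above
svm = SVC(kernel='rbf', gamma=1.0, C=100.0)
svm.fit(X_xor, y_xor)

pred = svm.predict(X_xor)
print("predicted:", pred)
print("number of support vectors:", len(svm.support_vectors_))
```

Lowering gamma makes the RBF kernel smoother and lowering C tolerates more misclassification, so both are worth varying when experimenting with larger random data sets.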

The final class label is “0” or “1”, that is, a single bit, so the TensorFlow model uses Sigmoid as its activation function and cross entropy as its cost function. The hypothesis is thresholded at 0.5 to assign class “0” or “1”.
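The three ingredients named above (sigmoid, cross entropy, and the 0.5 threshold) can be written out in a few lines of NumPy. This is a sketch with made-up input values `z` and labels `y_true`, not the post's TensorFlow graph.

```python
import numpy as np

def sigmoid(z):
    """Squash raw scores into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, h):
    """Mean cross-entropy between 0/1 labels and sigmoid outputs."""
    return -np.mean(y_true * np.log(h) + (1 - y_true) * np.log(1 - h))

# Hypothetical raw model outputs for four samples, and their true labels
z = np.array([2.0, -1.5, -0.3, 0.8])
y_true = np.array([1.0, 0.0, 0.0, 1.0])

h = sigmoid(z)                              # hypothesis values in (0, 1)
cost = binary_cross_entropy(y_true, h)      # scalar cost to minimize
predicted = (h > 0.5).astype(np.float32)    # threshold at 0.5
accuracy = np.mean(predicted == y_true)     # fraction of matching labels

print("hypothesis:", h)
print("cost:", cost)
print("predicted:", predicted, "accuracy:", accuracy)
```

Because sigmoid(0) = 0.5, thresholding the hypothesis at 0.5 is the same as checking the sign of the raw score.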

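SVC does not expose a choice of optimizer (scikit-learn's SVC is built on the libsvm solver), whereas the TensorFlow code was run with two optimizers, GradientDescent and Adam. As the figure above shows, the plots look different, but in terms of classification the results are equivalent.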

The default learning_rate of the Adam optimizer is 0.001, but here we use learning_rate = 0.00001. When switching to the GradientDescent optimizer, note that it uses learning_rate = 0.001. The number of training iterations, training_epochs, can be set freely by the user.
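To get a feel for what the two learning rates mean, the sketch below compares the size of a single parameter update under plain gradient descent and under the textbook Adam rule (with the usual defaults beta1=0.9, beta2=0.999, eps=1e-8). The gradient value is hypothetical, and this is the published update rule rather than TensorFlow's exact implementation.

```python
import math

def sgd_step(grad, lr):
    """Parameter change from one plain gradient-descent step."""
    return -lr * grad

def adam_first_step(grad, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """Parameter change from the first Adam step (moment estimates start at zero)."""
    m = (1 - beta1) * grad            # first-moment estimate
    v = (1 - beta2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - beta1)           # bias correction at step t = 1
    v_hat = v / (1 - beta2)
    return -lr * m_hat / (math.sqrt(v_hat) + eps)

grad = 50.0  # hypothetical gradient value
sgd_delta = sgd_step(grad, lr=0.001)
adam_delta = adam_first_step(grad, lr=0.00001)

print("gradient descent update:", sgd_delta)
print("Adam first update:", adam_delta)
```

The gradient-descent update scales with the gradient, while the first Adam update has a magnitude close to the learning rate itself regardless of the gradient's size. With learning_rate = 0.00001 each Adam step is therefore small, which is one reason a large training_epochs value is needed.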

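In the TensorFlow code, the placeholder that receives the four data points is declared with shape [None, 2], so the code keeps working even if the number of data points changes. Setting dof1=1 gives a single output column for the 1-bit labels “0” and “1”.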

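To allow a nonlinear hypothesis, a pair of random weights and biases is defined: (W1, b1) and (W2, b2). The hypothesis is the sigmoid of the product of the two resulting linear expressions. It is defined as a function near the top of the TensorFlow section so that it is easy to call again during the final plotting stage.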
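To see why a product of two linear expressions is enough for XOR, here is a NumPy counterpart of the hypothesis with hand-picked weights instead of trained ones. The function name `fn_np` and the weight values are illustrative assumptions: the two factors correspond to the parallel lines x1 + x2 = 0.5 and x1 + x2 = 1.5.

```python
import numpy as np

def fn_np(X, W1, b1, W2, b2):
    """Sigmoid of the product of two linear expressions."""
    z = (X @ W2 + b2) * (X @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-z))

# The four XOR points and labels used in this post
X_xor = np.array([[1., -0.01], [-0.01, -0.01], [1.0, 1.0], [-0.01, 1.0]])
y_xor = np.array([1, 0, 0, 1])

# Hand-picked (not trained) weights, shapes [2, 1] and [1] as in the post
W1 = np.array([[1.0], [1.0]])
b1 = np.array([-0.5])
W2 = np.array([[-1.0], [-1.0]])
b2 = np.array([1.5])

h = fn_np(X_xor, W1, b1, W2, b2)            # shape (4, 1)
predicted = (h > 0.5).astype(int).ravel()   # threshold at 0.5

print("hypothesis:", h.ravel())
print("predicted:", predicted)
```

The product is positive only between the two lines, which is where the class 1 points sit. The same function accepts any number of rows in `X`, which mirrors the [None, 2] placeholder shape.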

Once training in the Session block has determined the weights and biases, the input data is evaluated again and the result is passed to the tf_plot_decision_regions routine for plotting.
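The meshgrid step used for the decision-region plot can be sketched without TensorFlow or plotting. Here `toy_predict` is a hypothetical stand-in for the trained model: every grid point is flattened into a row of an [N, 2] array, classified, and the labels are reshaped back to the grid.

```python
import numpy as np

def toy_predict(points):
    """Hypothetical stand-in classifier: class 1 where 0.5 < x1 + x2 < 1.5."""
    s = points[:, 0] + points[:, 1]
    return ((s > 0.5) & (s < 1.5)).astype(int)

resolution = 0.02
xx1, xx2 = np.meshgrid(np.arange(-1.0, 2.0, resolution),
                       np.arange(-1.0, 2.0, resolution))

# Flatten the grid into one (x1, x2) row per grid point
XX = np.array([xx1.ravel(), xx2.ravel()]).T

# Classify every grid point, then restore the grid layout
Z = toy_predict(XX).reshape(xx1.shape)

print("grid points:", XX.shape, "label grid:", Z.shape)
```

An array shaped like `Z` is what `plt.contourf(xx1, xx2, Z, ...)` expects when it colors the regions.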

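To see the classification of a randomly generated XOR problem, set pts to the desired value and uncomment the commented-out lines (marked with black boxes in the original listing, namely the np.random.randn and np.logical_xor lines, which replace the fixed X_xor and y_xor arrays), then run the code.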
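The random variant described above amounts to the following data-generation step, shown here on its own with pts = 200. It uses the same seed and the same calls as the commented-out lines in the listing.

```python
import numpy as np

np.random.seed(3)
pts = 200  # total number of random points

# Gaussian random points in the plane
X_xor = np.random.randn(pts, 2)

# Label is 1 when exactly one coordinate is positive (quadrants 2 and 4)
y_xor = np.logical_xor(X_xor[:, 0] > 0, X_xor[:, 1] > 0)
y_xor = np.where(y_xor, 1, 0)

print(X_xor.shape, y_xor.shape)
print("class counts:", np.bincount(y_xor))
```

With this labeling the coordinate axes are the true class boundary, so a product-of-two-lines model can in principle match it.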

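The following results are for pts = 200, from SVC and from the polynomial technique with GradientDescent. The polynomial technique has no mechanism for maximizing the margin, so it shows some error in classifying points near the boundary, the ones that would act as support vectors. On the other hand, it avoids the demanding mathematics that the SVM approach requires.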

This is one of the more difficult posts published on this blog over the years, but I hope you will read it and find it a useful reference for your own machine learning work.

#XOR_SVM_SIGMOID_01.PY

from sklearn import __version__ as sklearn_version
from sklearn import datasets
import numpy as np
from matplotlib.colors import ListedColormap
import matplotlib.pyplot as plt
from sklearn.svm import SVC

def plot_decision_regions(X, y, classifier, test_idx=None, resolution=0.01):

    # setup marker generator and color map
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])

    # plot the decision surface
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    #print('xx1,xx2=',xx1,xx2)
    P=np.array([xx1.ravel(), xx2.ravel()])
    print('P=',P.T)
    print(classifier)
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    print('Z=',Z)
    print('Z.shape=',Z.shape)
    Z = Z.reshape(xx1.shape)
    #print('Z.reshape=',Z.reshape)
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y == cl, 0],
                    y=X[y == cl, 1],
                    alpha=0.8,
                    c=colors[idx],
                    marker=markers[idx],
                    label=cl,
                    edgecolor='black')

    # highlight test samples
    if test_idx:
        # plot all samples
        X_test, y_test = X[test_idx, :], y[test_idx]

        plt.scatter(X_test[:, 0],
                    X_test[:, 1],
                    facecolors='none',  # hollow markers; c='' is rejected by newer matplotlib
                    edgecolor='black',
                    alpha=1.0,
                    linewidth=1,
                    marker='o',
                    s=100,
                    label='test set')

# Fix the random seed (only matters when the random X_xor below is enabled)
np.random.seed(3)

pts = 4 # total number of random points
#X_xor = np.random.randn(pts, 2)
X_xor = np.array([[ 1., -0.01 ],[-0.01, -0.01],[ 1.0, 1.0 ],[ -0.01, 1.0]])
print(X_xor)
#y_xor = np.logical_xor(X_xor[:, 0] > 0, X_xor[:, 1] > 0)
y_xor = np.array([1, 0, 0, 1])
print(y_xor)
y_xor = np.where(y_xor, 1, 0)
print(y_xor)

print(X_xor.shape)
print(y_xor.shape)

plt.scatter(X_xor[y_xor == 1, 0], X_xor[y_xor == 1, 1],
            c='b', marker='x', label='1')
plt.scatter(X_xor[y_xor == 0, 0], X_xor[y_xor == 0, 1],
            c='r', marker='s', label='0')

plt.xlim([-3, 3])
plt.ylim([-3, 3])
plt.legend(loc='best')
plt.tight_layout()
plt.show()

#Support Vector Machine "RBF"
svm = SVC(kernel='rbf', random_state=1, gamma=1.0, C=100.0)
svm.fit(X_xor, y_xor)
plot_decision_regions(X_xor, y_xor,classifier=svm)

plt.legend(loc='upper left')
plt.tight_layout()
plt.show()

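# TensorFlow part (TF 1.x graph-mode API). Assumption: importing through
# compat.v1 lets the same placeholder/Session code run under TensorFlow 2.x.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()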
import time

start_time = time.time()

def fn(X,W1,b1,W2,b2):
    # Sigmoid of the product of two linear expressions (the "polynomial" hypothesis)
    hypothesis = tf.sigmoid((tf.matmul(X, W2) + b2)*(tf.matmul(X, W1) + b1))
    return hypothesis

def tf_plot_decision_regions(X_xor, y, hypothesis, predicted, test_idx=None, resolution=0.02):


    # setup marker generator and color map
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])
    # plot the decision surface
    x1_min, x1_max = X_xor[:, 0].min() - 1, X_xor[:, 0].max() + 1
    x2_min, x2_max = X_xor[:, 1].min() - 1, X_xor[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    XX = np.array([xx1.ravel(), xx2.ravel()]).T
    print(XX.shape)
    #Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    #Z = Z.reshape(xx1.shape)
    #h, p = sess.run([hypothesis, predicted], feed_dict={X: (np.array([xx1.ravel(), xx2.ravel()]).T) })
    h, p = sess.run([hypothesis, predicted], feed_dict={X: XX })

    p = p.reshape(xx1.shape)
    plt.contourf(xx1, xx2, p, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X_xor[y == cl, 0],
                    y=X_xor[y == cl, 1],
                    alpha=0.8,
                    c=colors[idx],
                    marker=markers[idx],
                    label=cl,
                    edgecolor='black')

    # highlight test samples
    if test_idx:
        # plot all samples
        X_test, y_test = X_xor[test_idx, :], y[test_idx]

#Training Data
Y_xor = np.zeros([pts,1])

Y_xor[y_xor == 0] = [0.]
Y_xor[y_xor == 1] = [1.]
print(Y_xor)

X_xor = np.float32(X_xor)
Y_xor = np.float32(Y_xor)
#print(Y_xor)

print(X_xor.shape)
print(Y_xor.shape)

#hyperparameter
learning_rate = 0.00001
training_epochs = 20000
display_steps = 1000
#Network parameters
n_input = 2
dof1 = 1
#Graph Nodes
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, dof1])
print(X)
print(Y)

#Weights and Biases, model, loss and optimizer
W1 = tf.Variable(tf.random_normal([n_input, dof1], stddev=0.01))
b1 = tf.Variable(tf.random_normal([dof1], stddev=0.01))
W2 = tf.Variable(tf.random_normal([n_input, dof1], stddev=0.01))
b2 = tf.Variable(tf.random_normal([dof1], stddev=0.01))

#hypothesis = (tf.matmul(X, W2) + b2)*(tf.matmul(X, W1) + b1)
hypothesis = fn(X,W1,b1,W2,b2)

# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
#optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(cost)
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)

predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
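# Accuracy: fraction of thresholded predictions that match the labels.
# (argmax over a single output column is always 0, so it would always report 1.0.)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))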

#Initializing global variables
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    for epoch in range(training_epochs):
        _, c, w1, B1, w2, B2 = sess.run([optimizer, cost, W1, b1, W2, b2], feed_dict={X: X_xor, Y: Y_xor})
        if(epoch + 1) % display_steps == 0:
            print( "Epoch: ", (epoch+1), "Cost: ", c, w1, B1, w2, B2 )
    print("Optimization Finished!")

    # Accuracy report
    h, p, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: X_xor, Y: Y_xor})
    print("\nHypothesis:\n ", h, "\nPredicted:\n ", p, "\nAccuracy: ", a)
    print(p.shape)
    print(p)
    tf_plot_decision_regions(X_xor, y_xor, hypothesis, predicted)
    sess.close()

end_time = time.time()
print( "Completed in ", end_time - start_time , " seconds")

License: Attribution, NonCommercial, NoDerivatives (CC BY-NC-ND)
