삽질 중: July 2018

Thursday, July 26, 2018

Deep learning notes, links

A rough, brief summary of what I learned, as far as I can remember.
May contain errors; needs cross-checking.

http://charlie0301.blogspot.com/2018/07/ai-machine-learning-links.html

Deep learning
: https://en.wikipedia.org/wiki/Deep_learning

Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, semi-supervised or unsupervised.

Single-Layer Perceptron
: https://en.wikipedia.org/wiki/Perceptron

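: A single algorithmic unit modeled on a neuron; it takes multiple inputs and produces an output

: The computation is a Linear Combination over the connected previous perceptrons, followed by an Activation function

: Linear Combination - the sum of each connected previous-Layer neuron's output * its connection weight (typically plus a bias term)

: Activation - the Linear Combination value from the previous neurons is passed through a non-linear function

- Activation functions : Sigmoid, Hyperbolic Tangent (tanh), Rectified Linear Unit (ReLU)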
https://en.wikipedia.org/wiki/Activation_function

Logistic (a.k.a. Sigmoid or Soft step)		$f(x)=\sigma (x)={\frac {1}{1+e^{-x}}}$

TanH		$f(x)=\tanh(x)={\frac {(e^{x}-e^{-x})}{(e^{x}+e^{-x})}}$

Rectified linear unit (ReLU)		$f(x)={\begin{cases}0&{\text{for }}x<0\\x&{\text{for }}x\geq 0\end{cases}}$
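
A minimal sketch of the perceptron computation described above (linear combination, then activation), using only the Python standard library. The function name `perceptron` and the example inputs, weights, and bias are made up for illustration; the three activation functions follow the formulas listed above.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^-x), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def tanh(x):
    """Hyperbolic tangent (e^x - e^-x) / (e^x + e^-x)."""
    return math.tanh(x)

def relu(x):
    """Rectified linear unit: 0 for x < 0, x otherwise."""
    return max(0.0, x)

def perceptron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical example values (not from the notes).
inputs = [1.0, 2.0]
weights = [0.5, 0.25]
bias = -1.0  # makes the linear combination exactly 0 for these inputs

out_sigmoid = perceptron(inputs, weights, bias, sigmoid)
out_relu = perceptron(inputs, weights, bias, relu)
```

Swapping the `activation` argument is the only change needed to try a different non-linearity on the same weighted sum.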

Multi-Layer Perceptron
: https://en.wikipedia.org/wiki/Multilayer_perceptron

: A structure that connects multiple Perceptrons

: Non-linear Activation function + Multi-Layer

Artificial Neural Network
: https://en.wikipedia.org/wiki/Artificial_neural_network

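: Many Neurons are connected in Layers to solve complex problems

: Composed of Input Layer > Hidden Layer > Output Layer

- Input Layer : the very first Layer, which receives the initial values

- Hidden Layer : all Layers in the intermediate stages

- Output Layer : the last Layer, which computes the output value

. Use the raw result as-is for Regression

. Pass it through Sigmoid for Binary Classification

. Pass it through Softmax for K-Class Classification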

: Forward Propagation

: Back Propagation Algorithm

=> https://google-developers.appspot.com/machine-learning/crash-course/backprop-scroll/
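
A rough sketch of Forward Propagation through Input Layer > Hidden Layer > Output Layer, assuming one ReLU hidden layer and a Sigmoid output (the binary-classification case). The helper names `dense` and `forward`, the `params` layout, and all weight values are hypothetical. Back Propagation is not shown here.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def dense(inputs, weights, biases, activation):
    """One fully connected layer; weights[j] holds the incoming weights of neuron j."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(x, params):
    """Input Layer -> Hidden Layer (ReLU) -> Output Layer (Sigmoid)."""
    hidden = dense(x, params["W1"], params["b1"], relu)
    output = dense(hidden, params["W2"], params["b2"], sigmoid)
    return hidden, output

# Hypothetical hand-picked parameters: 2 inputs, 2 hidden neurons, 1 output.
params = {
    "W1": [[1.0, -1.0], [0.5, 0.5]],
    "b1": [0.0, 0.0],
    "W2": [[1.0, -2.0]],
    "b2": [0.0],
}
hidden, output = forward([2.0, 1.0], params)
```

For Regression the last `dense` call would use an identity activation instead of `sigmoid`, and for K-Class Classification it would be followed by a Softmax over the K outputs.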

Neural Network Learning Process
1) Initialization
: Choose initial values for the Parameters (θ) to be learned
: Xavier Initialization (for Sigmoid, tanh), He Initialization (for ReLU)
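
A small sketch of the two initialization schemes named above, assuming the commonly used forms: Xavier (Glorot) uniform with limit sqrt(6 / (fan_in + fan_out)), and He normal with standard deviation sqrt(2 / fan_in). The function names and the layer sizes are made up for illustration.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform: U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]

def he_normal(fan_in, fan_out, rng):
    """He normal: N(0, std^2), std = sqrt(2 / fan_in)."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical layer: 4 inputs feeding 3 neurons.
W_xavier = xavier_uniform(4, 3, rng)  # e.g. for a Sigmoid/tanh layer
W_he = he_normal(4, 3, rng)           # e.g. for a ReLU layer
```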

2) Cost Function
: Define the function (training proceeds so as to minimize the Cost Function)
: Depending on the task, Least Square Error or Cross-Entropy is used.
: As the number of Parameters grows, overfitting gets worse, so a Regularization Term is added to the Cost Function
> https://en.wikipedia.org/wiki/Regularization_(mathematics)

$\min _{f}\sum _{i=1}^{n}V(f(x_{i}),y_{i})+\lambda R(f)$

: Dropout is another method for preventing Overfitting.
> https://en.wikipedia.org/wiki/Convolutional_neural_network#Dropout
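
A sketch of the cost functions mentioned in this step, assuming mean squared error for Least Square Error, binary Cross-Entropy, and an L2 penalty (sum of squared weights) as the Regularization Term R. Note that the data cost here is averaged rather than summed, which only rescales λ. All function names and numbers are illustrative.

```python
import math

def mean_squared_error(preds, targets):
    """Average of squared differences (a Least Square Error cost)."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

def binary_cross_entropy(probs, targets, eps=1e-12):
    """Average of -[t*log(p) + (1-t)*log(1-p)]; probs are clipped away from 0 and 1."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(targets)

def l2_penalty(weights, lam):
    """Assumed regularizer: lambda * sum of squared weights."""
    return lam * sum(w * w for w in weights)

def regularized_cost(data_cost, weights, lam):
    """Cost Function + Regularization Term."""
    return data_cost + l2_penalty(weights, lam)

# Hypothetical values.
data_cost = mean_squared_error([1.0, 2.0], [1.0, 4.0])
total_cost = regularized_cost(data_cost, weights=[1.0, -2.0], lam=0.1)
```

A larger `lam` pushes the optimizer toward smaller weights at the expense of fitting the training data less closely.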

3) Optimizer
: Apply a training method that changes the Parameters (θ) in the direction that minimizes the Cost Function
: Gradient Descent-based methods are generally used
: Batch normalization, Optimizer (Adam)
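
To make the Adam optimizer a bit more concrete, here is a minimal single-parameter sketch of its update rule with the usual default decay rates. The function name `adam_minimize`, the step count, the learning rate, and the toy cost (θ - 3)^2 are assumptions for illustration; Batch normalization is not covered here.

```python
import math

def adam_minimize(grad, theta, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=2000):
    """Minimize a 1-D function given its gradient, using the Adam update."""
    m = 0.0  # running average of gradients (first moment)
    v = 0.0  # running average of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the zero start
        v_hat = v / (1.0 - beta2 ** t)
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta

# Toy cost (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta = adam_minimize(lambda th: 2.0 * (th - 3.0), theta=0.0)
```

Compared with plain Gradient Descent, the step is scaled by the running gradient statistics instead of using the raw gradient directly.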

: https://www.slideshare.net/HeeWonPark11/ss-80653977

Wednesday, July 25, 2018

Machine learning notes, links

A rough, brief summary of what I learned, as far as I can remember.
May contain errors; needs cross-checking.

http://charlie0301.blogspot.com/2018/07/ai-machine-learning-links.html

Machine learning
: https://en.wikipedia.org/wiki/Machine_learning

Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to "learn" (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.

Types of machine learning

: https://www.techleer.com/articles/203-machine-learning-algorithm-backbone-of-emerging-technologies/

- Supervised Learning

: Learns to predict the Output for a given Input; correct answers (labels) exist

- Unsupervised Learning

: Discovers patterns in the Input data; there are no correct answers

- Reinforcement Learning

: Learns through Trial & Error

See the link below for details

Classification of Machine Learning algorithms - Supervised Learning, Unsupervised Learning, Reinforcement Learning, by Solaris

: http://solarisailab.com/archives/1785

Classification by Output

- Regression
: The result is a continuous value; the result is predicted based on the line the model draws

- Logistic Regression, Binary Classification
: The result is one of two values (binary value)

- Multi-class Classification
: The result is one of multiple discrete values

Brief descriptions of Machine Learning algorithms/methods

: http://usblogs.pwc.com/emerging-technology/machine-learning-methods-infographic/

: https://docs.microsoft.com/ko-kr/azure/machine-learning/studio/algorithm-choice#algorithm-notes

- Linear Regression

: Models the relationship between y and one or more x variables
: Cost Function: for Linear Regression, the Mean square error function is used
=> The Gradient Descent Algorithm (use the derivative to find the slope, step in the negative-gradient direction, then recompute) is used to find the point where the Cost is minimal. The Learning Rate is important here for taking appropriately sized steps.
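
A minimal sketch of Linear Regression trained with Gradient Descent on the mean squared error, for a single x variable. The function name, the learning rate, the number of epochs, and the toy data (generated from y = 2x + 1) are assumptions for illustration.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ~ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = (w * x + b) - y
            grad_w += 2.0 * err * x / n  # d(MSE)/dw
            grad_b += 2.0 * err / n      # d(MSE)/db
        # Step against the gradient; the learning rate sets the step size.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_linear_regression(xs, ys)
```

With a learning rate that is too large the updates overshoot and the cost grows instead of shrinking; with one that is too small the same number of epochs leaves `w` and `b` far from the minimum.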

- Logistic Regression

: A model for solving binary classification problems
: Uses the Sigmoid Function to compute the probability that a given data point belongs to the positive/negative class
: Cross-entropy is set as the cost function, and training is done with a Gradient-based optimizer.
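
A rough sketch of Logistic Regression for one input feature: Sigmoid for the class probability, Cross-entropy as the cost, and plain Gradient Descent as the gradient-based optimizer. The function names, hyperparameters, and the tiny dataset are made up for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_regression(xs, ys, learning_rate=0.5, epochs=1000):
    """Minimize the mean binary cross-entropy of sigmoid(w*x + b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            # For cross-entropy with a sigmoid output the gradient is (p - y) * x.
            grad_w += (p - y) * x / n
            grad_b += (p - y) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def predict_proba(x, w, b):
    """Probability that x belongs to the positive class."""
    return sigmoid(w * x + b)

# Hypothetical data: negative x -> class 0, positive x -> class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b = train_logistic_regression(xs, ys)
```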

- Softmax Algorithm

: An algorithm for multi-class classification problems
: A variation/extension of Logistic Regression that generalizes it from the binary-class to the multiple-class problem
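
A short sketch of the Softmax function itself, which turns K raw scores into K probabilities that sum to 1. Subtracting the maximum score before exponentiating is a common numerical-stability trick; the example scores are arbitrary. The last lines illustrate the link to Logistic Regression: with two classes and scores (z, 0), the first Softmax output matches Sigmoid(z).

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # shift for numerical stability; does not change the result
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Arbitrary example scores for 3 classes.
probs = softmax([2.0, 1.0, 0.1])
predicted_class = probs.index(max(probs))

# Two-class special case: softmax([z, 0])[0] matches sigmoid(z).
z = 0.7
two_class = softmax([z, 0.0])[0]
```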

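- Support Vector Machine (SVM)

: A supervised learning model for pattern recognition, used mainly for classification
: Soft-Margin SVM => a technique that finds the decision boundary that maximizes the margin
=> Adds slack variables to the plus & minus planes to improve Robustness
: Kernel Support Vector Machines => a technique that finds the decision boundary that maximizes the margin
=> When the data is not linearly separable, maps it into a higher-dimensional space to solve the problem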
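
A toy illustration of the "map to a higher-dimensional space" idea, not a trained SVM. The 1-D points below cannot be split by a single threshold, but after the hypothetical map phi(x) = (x, x^2) a straight line separates them. The hyperplane is hand-picked rather than learned by margin maximization. The `kernel` function shows the related point that the inner product in the mapped space can be computed directly from the original values.

```python
def feature_map(x):
    """Hypothetical explicit map phi(x) = (x, x^2) from 1-D into 2-D."""
    return (x, x * x)

def kernel(x, y):
    """Inner product phi(x) . phi(y), computed without building phi explicitly."""
    return x * y + (x * y) ** 2

def linear_classifier(point, weights, bias):
    """Sign of a linear score: +1 on one side of the hyperplane, -1 on the other."""
    score = sum(w * p for w, p in zip(weights, point)) + bias
    return 1 if score >= 0 else -1

# Outer points are class +1, inner points are class -1:
# no single threshold on x separates them.
xs = [-2.0, -1.0, 1.0, 2.0]
labels = [1, -1, -1, 1]

# Hand-picked hyperplane x^2 = 2.5 in the mapped space (an assumption, not learned).
weights, bias = (0.0, 1.0), -2.5
preds = [linear_classifier(feature_map(x), weights, bias) for x in xs]

# The kernel agrees with the explicit inner product in the mapped space.
a, b = feature_map(-1.0), feature_map(2.0)
explicit_dot = a[0] * b[0] + a[1] * b[1]
```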

And so on...

Tuesday, July 24, 2018

AI notes, links

A rough, brief summary of what I learned, as far as I can remember.
May contain errors; needs cross-checking.

Artificial intelligence

: https://en.wikipedia.org/wiki/Artificial_intelligence

Artificial intelligence (AI), sometimes called machine intelligence, is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans and other animals. In computer science AI research is defined as the study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.

Machine learning

: https://en.wikipedia.org/wiki/Machine_learning

Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to "learn" (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.

Deep learning

: https://en.wikipedia.org/wiki/Deep_learning

Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, semi-supervised or unsupervised.

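No idea what any of that means.

The NVIDIA article below explains it well.

Let's look at the differences between artificial intelligence, machine learning, and deep learning, NVIDIA KOREA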

: http://blogs.nvidia.co.kr/2016/08/03/difference_ai_learning_machinelearning/

What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?, MICHAEL COPELAND

: https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/

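Excerpts from the article above

An AI that has human senses and reasoning and thinks like a human is called ‘General AI’, but the AI that can be built at the current level of technology falls under the concept of ‘Narrow AI’. Narrow AI is characterized by being able to perform specific tasks, such as image classification services on social media or face recognition, at or beyond human level.

Machine learning basically uses algorithms to analyze data, learns through that analysis, and makes judgments or predictions based on what it has learned.

Deep learning is a form of artificial intelligence that evolved from artificial neural networks; it learns from data using layers of information input and output similar to the neurons of the brain.

- One of the main differences between machine learning and deep learning is that machine learning learns from Features that are provided to it, whereas deep learning has a Feature extractor inside the model.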

: https://www.analyticsvidhya.com/blog/2017/04/comparison-between-deep-learning-machine-learning/

Reference sites

- Machine Learning/Deep Learning lectures for everyone

: http://hunkim.github.io/ml/

- Machine Learning Crash Course

: https://developers.google.com/machine-learning/crash-course/

- Machine Learning lecture notes
: https://wikidocs.net/book/587

- Deep Learning from Scratch

: https://github.com/WegraLee/deep-learning-from-scratch
- [개앞맵시] Even Skynet starts with deep learning

: https://www.mindmeister.com/ko/812276967/_?fullscreen=1

Monday, July 23, 2018

[Links] Pandas links, review...

Plenty of sites already cover this well, so I just found and linked them.

10 Minutes to pandas
: https://pandas.pydata.org/pandas-docs/stable/10min.html

Python - Pandas Tutorial 1 (creating, accessing, deleting, and modifying DataFrames), Deep Play
: http://3months.tistory.com/292

PANDAS basics summary
: http://doorbw.tistory.com/172

As a review, I used Pandas to look at Seoul's population counts.
Not sure whether I did it correctly.
: https://colab.research.google.com/github/hallower/pandas_study/blob/master/seoul_people_count/Age.ipynb