sklearn - Standardization

[ Web ] https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html

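Standardizes data (often loosely called normalization). Each feature is shifted and rescaled so that it has mean 0 and standard deviation 1. This does not change the shape of the distribution: the result follows the standard normal distribution only if the original data was already normally distributed.

The transformed data has mean 0 and standard deviation (1-sigma) 1. On a plot of the distribution, the x-axis is then in units of z-score, also called the standardized score.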
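The z-score of a value x is (x - mean) / std. A minimal NumPy sketch of that formula follows, assuming the population standard deviation (ddof=0, NumPy's default for `std`). The helper name `zscore` is hypothetical and is not part of sklearn.

```python
import numpy as np

def zscore(x):
    """Hypothetical helper: standardize each column of x to mean 0, std 1."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    std = x.std(axis=0)  # population std (ddof=0), NumPy's default
    return (x - mean) / std

data = np.arange(11).reshape(-1, 1)
z = zscore(data)
print(z.mean(), z.std())  # mean close to 0, std close to 1
```

A constant column has std 0 and would cause a division by zero here. This sketch does not guard against that case.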

Usage:

from sklearn.preprocessing import StandardScaler
import numpy as np

scaler = StandardScaler()

data = np.arange(11).reshape(-1,1)

scaler.fit(data) # Compute the mean and std to be used for later scaling.
scaler.transform(data) # Perform standardization by centering and scaling.

or, in a single step:

dataScaled = scaler.fit_transform(data) # Fit to data, then transform it.

'''

data =
[[ 0]
 [ 1]
 [ 2]
 [ 3]
 [ 4]
 [ 5]
 [ 6]
 [ 7]
 [ 8]
 [ 9]
 [10]]

data.mean() = 5.0
data.std()  = 3.1622776601683795

dataScaled =
[[-1.58113883]
 [-1.26491106]
 [-0.9486833 ]
 [-0.63245553]
 [-0.31622777]
 [ 0.        ]
 [ 0.31622777]
 [ 0.63245553]
 [ 0.9486833 ]
 [ 1.26491106]
 [ 1.58113883]]

dataScaled.std() = 1.0

'''
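To see what `fit` actually stores, the fitted scaler can be inspected through its `mean_` and `scale_` attributes. The sketch below reuses the same `data` as above and checks that `transform` agrees with the manual formula (x - mean) / std. The `inverse_transform` round trip is included only as an illustration and is not part of the original example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

data = np.arange(11).reshape(-1, 1)
scaler = StandardScaler().fit(data)

print(scaler.mean_)   # per-feature mean learned by fit
print(scaler.scale_)  # per-feature std (population, ddof=0) learned by fit

# transform should agree with the manual z-score formula
manual = (data - scaler.mean_) / scaler.scale_
assert np.allclose(scaler.transform(data), manual)

# inverse_transform maps standardized values back to the original scale
restored = scaler.inverse_transform(scaler.transform(data))
assert np.allclose(restored, data)
```

Because the mean and std are stored on the scaler, the same fitted object can later be applied to other data with `transform`, which is what the comment on `fit` ("to be used for later scaling") refers to.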

Reference

- https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_probability/bs704_probability9.html

- https://www.mathsisfun.com/data/standard-normal-distribution.html
