[ Web ] https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html
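StandardScaler standardizes data (often loosely called normalization): each feature is shifted and scaled so that it has zero mean and unit variance. It does not change the shape of the distribution, so the result follows a standard normal distribution only if the original data was already normally distributed.
After the transform, the mean is 0 and the standard deviation (1-sigma) maps to 1. The scaled values are called z-scores or standardized scores.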
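As a rough illustration of what the z-score means, the sketch below computes z = (x - mean) / std by hand with NumPy. It assumes the population standard deviation (ddof=0), which is the estimator StandardScaler uses. The variable names are only for illustration.

```python
import numpy as np

# Same toy data as in the usage section: 11 samples, 1 feature.
data = np.arange(11, dtype=float).reshape(-1, 1)

# Per-feature statistics; np.std defaults to ddof=0 (population std).
mean = data.mean(axis=0)
std = data.std(axis=0)

# z-score: center on the mean, then divide by the standard deviation.
z = (data - mean) / std

print(z.ravel())
print(z.mean(), z.std())
```

The values in z should match what StandardScaler returns for the same input, up to floating-point rounding.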
Usage:
from sklearn.preprocessing import StandardScaler
import numpy as np
scaler = StandardScaler()
data = np.arange(11).reshape(-1, 1)  # 11 samples, 1 feature
scaler.fit(data)  # Compute the mean and std to be used for later scaling.
scaler.transform(data)  # Perform standardization by centering and scaling.
# or, equivalently, in a single step:
dataScaled = scaler.fit_transform(data)  # Fit to data, then transform it.
'''
data =
[[ 0]
 [ 1]
 [ 2]
 [ 3]
 [ 4]
 [ 5]
 [ 6]
 [ 7]
 [ 8]
 [ 9]
 [10]]
data.mean() = 5.0
data.std() = 3.1622776601683795
dataScaled =
[[-1.58113883]
 [-1.26491106]
 [-0.9486833 ]
 [-0.63245553]
 [-0.31622777]
 [ 0.        ]
 [ 0.31622777]
 [ 0.63245553]
 [ 0.9486833 ]
 [ 1.26491106]
 [ 1.58113883]]
dataScaled.std() = 1.0
'''
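In practice the scaler is usually fitted once and then reused on other data. The following is a minimal sketch of that pattern using the fitted attributes mean_ and scale_ together with inverse_transform. The arrays train and new_data are made-up illustrative inputs, not part of the original example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Fit on the same toy data as above (assumed here to play the role of training data).
train = np.arange(11, dtype=float).reshape(-1, 1)
scaler = StandardScaler().fit(train)

# Statistics learned during fit, one entry per feature.
print(scaler.mean_)   # per-feature mean
print(scaler.scale_)  # per-feature standard deviation

# Hypothetical unseen samples, scaled with the statistics learned from train.
new_data = np.array([[5.0], [15.0]])
new_scaled = scaler.transform(new_data)

# inverse_transform maps z-scores back to the original units.
restored = scaler.inverse_transform(new_scaled)
print(new_scaled.ravel(), restored.ravel())
```

Reusing the fitted scaler, instead of calling fit_transform again on the new data, keeps both sets on the same scale.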
Reference
- https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_probability/bs704_probability9.html
- https://www.mathsisfun.com/data/standard-normal-distribution.html