Python_scikit_learn
User Guide
http://scikit-learn.org/stable/user_guide.html
Outline:
1.1. Generalized Linear Models
http://scikit-learn.org/stable/modules/linear_model.html
1.1.11. Logistic regression
http://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
sklearn.linear_model.LogisticRegression
class sklearn.linear_model.LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='liblinear', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1)
decision_function (X) | Predict confidence scores for samples. |
densify () | Convert coefficient matrix to dense array format. |
fit (X, y[, sample_weight]) | Fit the model according to the given training data. |
fit_transform (X[, y]) | Fit to data, then transform it. |
get_params ([deep]) | Get parameters for this estimator. |
predict (X) | Predict class labels for samples in X. |
predict_log_proba (X) | Log of probability estimates. |
predict_proba (X) | Probability estimates. |
score (X, y[, sample_weight]) | Returns the mean accuracy on the given test data and labels. |
set_params (**params) | Set the parameters of this estimator. |
sparsify () | Convert coefficient matrix to sparse format. |
transform (*args, **kwargs) | DEPRECATED: Support to use estimators as feature selectors will be removed in version 0.19. |
Attributes:
coef_ : array, shape (n_classes, n_features)
intercept_ : array, shape (n_classes,)
n_iter_ : array, shape (n_classes,) or (1,)
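A minimal usage sketch of the API above (the dataset and variable names are illustrative, not from the scikit-learn docs):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target          # 150 samples, 4 features, 3 classes

# Defaults from the signature above: L2 penalty, C=1.0, liblinear solver,
# one-vs-rest handling of the multiclass problem.
clf = LogisticRegression(penalty='l2', C=1.0)
clf.fit(X, y)

print(clf.coef_.shape)            # (n_classes, n_features) -> (3, 4)
print(clf.intercept_.shape)       # (n_classes,) -> (3,)
print(clf.predict(X[:5]))         # predicted class labels
print(clf.predict_proba(X[:5]))   # per-class probability estimates
print(clf.score(X, y))            # mean accuracy on the given data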
sklearn.linear_model.Perceptron
class sklearn.linear_model.Perceptron(penalty=None, alpha=0.0001, fit_intercept=True, n_iter=5, shuffle=True, verbose=0, eta0=1.0, n_jobs=1, random_state=0, class_weight=None, warm_start=False)
decision_function (X) | Predict confidence scores for samples. |
densify () | Convert coefficient matrix to dense array format. |
fit (X, y[, coef_init, intercept_init, ...]) | Fit linear model with Stochastic Gradient Descent. |
fit_transform (X[, y]) | Fit to data, then transform it. |
get_params ([deep]) | Get parameters for this estimator. |
partial_fit (X, y[, classes, sample_weight]) | Fit linear model with Stochastic Gradient Descent. |
predict (X) | Predict class labels for samples in X. |
score (X, y[, sample_weight]) | Returns the mean accuracy on the given test data and labels. |
set_params (*args, **kwargs) | |
sparsify () | Convert coefficient matrix to sparse format. |
transform (*args, **kwargs) | DEPRECATED: Support to use estimators as feature selectors will be removed in version 0.19. |
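A short sketch using the signature above (n_iter is the parameter name in this scikit-learn version; later releases renamed it max_iter):

import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import Perceptron

digits = load_digits()
X, y = digits.data, digits.target

clf = Perceptron(n_iter=5, eta0=1.0, random_state=0)   # defaults from the signature above
clf.fit(X, y)
print(clf.score(X, y))   # mean accuracy on the training data

# partial_fit allows incremental (out-of-core) training; the first call
# must be given the full set of possible classes.
clf2 = Perceptron(eta0=1.0, random_state=0)
clf2.partial_fit(X[:500], y[:500], classes=np.unique(y))
clf2.partial_fit(X[500:], y[500:])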
1.6. Nearest Neighbors
http://scikit-learn.org/stable/modules/neighbors.html
neighbors.NearestNeighbors ([n_neighbors, ...]) | Unsupervised learner for implementing neighbor searches. |
neighbors.KNeighborsClassifier ([...]) | Classifier implementing the k-nearest neighbors vote. |
neighbors.RadiusNeighborsClassifier ([...]) | Classifier implementing a vote among neighbors within a given radius |
neighbors.KNeighborsRegressor ([n_neighbors, ...]) | Regression based on k-nearest neighbors. |
neighbors.RadiusNeighborsRegressor ([radius, ...]) | Regression based on neighbors within a fixed radius. |
neighbors.NearestCentroid ([metric, ...]) | Nearest centroid classifier. |
neighbors.BallTree | BallTree for fast generalized N-point problems |
neighbors.KDTree | KDTree for fast generalized N-point problems |
neighbors.LSHForest ([n_estimators, radius, ...]) | Performs approximate nearest neighbor search using LSH forest. |
neighbors.DistanceMetric | DistanceMetric class |
neighbors.KernelDensity ([bandwidth, ...]) | Kernel Density Estimation |
neighbors.kneighbors_graph (X, n_neighbors[, ...]) | Computes the (weighted) graph of k-Neighbors for points in X |
neighbors.radius_neighbors_graph (X, radius) | Computes the (weighted) graph of Neighbors for points in X |
sklearn.neighbors.KNeighborsClassifier
class sklearn.neighbors.KNeighborsClassifier(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', metric_params=None, n_jobs=1, **kwargs)
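A minimal sketch of the defaults above (5 neighbors, uniform vote, Minkowski metric with p=2, i.e. Euclidean distance); the iris data is just an illustration:

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

knn = KNeighborsClassifier(n_neighbors=5, weights='uniform', p=2)
knn.fit(X, y)

print(knn.predict(X[:3]))        # majority vote among the 5 nearest training points
print(knn.predict_proba(X[:3]))  # fraction of the 5 neighbors belonging to each class
print(knn.kneighbors(X[:1]))     # (distances, indices) of the nearest neighbors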
1.13. Feature selection
http://scikit-learn.org/stable/modules/feature_selection.html
The sklearn.feature_selection module implements feature selection algorithms. It currently includes univariate filter selection methods and the recursive feature elimination algorithm.
feature_selection.GenericUnivariateSelect ([...]) | Univariate feature selector with configurable strategy. |
feature_selection.SelectPercentile ([...]) | Select features according to a percentile of the highest scores. |
feature_selection.SelectKBest ([score_func, k]) | Select features according to the k highest scores. |
feature_selection.SelectFpr ([score_func, alpha]) | Filter: Select the p-values below alpha based on a FPR test. |
feature_selection.SelectFdr ([score_func, alpha]) | Filter: Select the p-values for an estimated false discovery rate |
feature_selection.SelectFromModel (estimator) | Meta-transformer for selecting features based on importance weights. |
feature_selection.SelectFwe ([score_func, alpha]) | Filter: Select the p-values corresponding to Family-wise error rate |
feature_selection.RFE (estimator[, ...]) | Feature ranking with recursive feature elimination. |
feature_selection.RFECV (estimator[, step, ...]) | Feature ranking with recursive feature elimination and cross-validated selection of the best number of features. |
feature_selection.VarianceThreshold ([threshold]) | Feature selector that removes all low-variance features. |
feature_selection.chi2 (X, y) | Compute chi-squared stats between each non-negative feature and class. |
feature_selection.f_classif (X, y) | Compute the ANOVA F-value for the provided sample. |
feature_selection.f_regression (X, y[, center]) | Univariate linear regression tests. |
feature_selection.mutual_info_classif (X, y) | Estimate mutual information for a discrete target variable. |
feature_selection.mutual_info_regression (X, y) | Estimate mutual information for a continuous target variable. |
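Two hedged sketches of the module: a univariate filter (SelectKBest with the chi2 score) and recursive feature elimination wrapped around a linear model; the estimator and the choice of k are arbitrary illustrations:

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2, RFE
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target

# Univariate filter: keep the k features with the highest chi-squared scores
# (chi2 requires non-negative feature values).
X_k = SelectKBest(score_func=chi2, k=2).fit_transform(X, y)
print(X_k.shape)   # (150, 2)

# Recursive feature elimination: fit the estimator, drop the weakest feature(s),
# and repeat until n_features_to_select remain.
rfe = RFE(estimator=LogisticRegression(), n_features_to_select=2)
rfe.fit(X, y)
print(rfe.support_)   # boolean mask of the selected features
print(rfe.ranking_)   # 1 = selected; larger ranks were eliminated earlier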
3.1. Cross-validation: evaluating estimator performance
sklearn.model_selection.train_test_split(*arrays, **options)
Split arrays or matrices into random train and test subsets.
Quick utility that wraps input validation and next(ShuffleSplit().split(X, y)) and application to input data into a single call for splitting (and optionally subsampling) data in a one-liner.
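A small sketch (array contents are illustrative); all arrays passed in are shuffled and split consistently:

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# 25% of the samples go to the test split; random_state makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

print(X_train.shape, X_test.shape)   # (7, 2) (3, 2)
print(y_train.shape, y_test.shape)   # (7,) (3,)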
4.8.2. Label encoding
http://scikit-learn.org/stable/modules/preprocessing_targets.html#label-encoding
class sklearn.preprocessing.LabelEncoder
Encode labels with value between 0 and n_classes-1.
fit (y) | Fit label encoder |
fit_transform (y) | Fit label encoder and return encoded labels |
get_params ([deep]) | Get parameters for this estimator. |
inverse_transform (y) | Transform labels back to original encoding. |
set_params (**params) | Set the parameters of this estimator. |
transform (y) | Transform labels to normalized encoding. |
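A minimal sketch of the round trip (string labels are an illustration; integer labels work the same way):

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(["paris", "paris", "tokyo", "amsterdam"])

print(list(le.classes_))                          # ['amsterdam', 'paris', 'tokyo']
print(le.transform(["tokyo", "tokyo", "paris"]))  # [2 2 1]
print(list(le.inverse_transform([2, 2, 1])))      # ['tokyo', 'tokyo', 'paris']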
4.3.4. Encoding categorical features
sklearn.preprocessing.OneHotEncoder
class sklearn.preprocessing.OneHotEncoder(n_values='auto', categorical_features='all', dtype=<type 'numpy.float64'>, sparse=True, handle_unknown='error')
Encode categorical integer features using a one-hot aka one-of-K scheme.
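A short sketch using the older integer-based API shown above (n_values/categorical_features; newer releases replaced these with a categories parameter):

from sklearn.preprocessing import OneHotEncoder

# Two categorical integer features; every distinct value becomes one binary column.
enc = OneHotEncoder(sparse=False)   # dense output just for readability
X = [[0, 1],
     [1, 0],
     [2, 1]]
enc.fit(X)

print(enc.n_values_)             # number of values per feature, here [3 2]
print(enc.transform([[1, 1]]))   # [[0. 1. 0. 0. 1.]]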
4.3.1.1. Scaling features to a range
http://scikit-learn.org/stable/modules/preprocessing.html#scaling-features-to-a-range
sklearn.preprocessing.MinMaxScaler
class sklearn.preprocessing.MinMaxScaler(feature_range=(0, 1), copy=True)
Transforms features by scaling each feature to a given range.
This estimator scales and translates each feature individually such that it is in the given range on the training set, i.e. between zero and one.
The transformation is given by:
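X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (max - min) + min

where (min, max) = feature_range. A short usage sketch (the training matrix is illustrative):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[1., -1.,  2.],
                    [2.,  0.,  0.],
                    [0.,  1., -1.]])

scaler = MinMaxScaler(feature_range=(0, 1))
print(scaler.fit_transform(X_train))   # every column is mapped into [0, 1]

# The per-feature minima/maxima learned on the training set are reused,
# so previously unseen values can fall outside the target range.
print(scaler.transform(np.array([[3., 1., 4.]])))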
sklearn.base: Base classes and utility functions
Base classes for all estimators.
Base classes
base.BaseEstimator | Base class for all estimators in scikit-learn |
base.ClassifierMixin | Mixin class for all classifiers in scikit-learn. |
base.ClusterMixin | Mixin class for all cluster estimators in scikit-learn. |
base.RegressorMixin | Mixin class for all regression estimators in scikit-learn. |
base.TransformerMixin | Mixin class for all transformers in scikit-learn. |
Functions
base.clone (estimator[, safe]) | Constructs a new estimator with the same parameters. |
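A hedged sketch of how these pieces fit together: a custom transformer inherits get_params/set_params from BaseEstimator (driven by its __init__ signature) and fit_transform from TransformerMixin, and clone rebuilds an unfitted copy with the same parameters (the ShiftTransformer class is purely illustrative):

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone

class ShiftTransformer(BaseEstimator, TransformerMixin):
    """Illustrative transformer that adds a constant offset to every value."""
    def __init__(self, offset=0.0):
        self.offset = offset   # the constructor only stores parameters

    def fit(self, X, y=None):
        return self            # nothing to learn here

    def transform(self, X):
        return X + self.offset

t = ShiftTransformer(offset=2.0)
print(t.get_params())                              # {'offset': 2.0}
print(t.fit_transform(np.array([[1.0], [2.0]])))   # [[3.] [4.]]

t2 = clone(t)                                      # new, unfitted estimator with the same parameters
print(t2.get_params() == t.get_params())           # True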