SuperLearner

class museotoolbox.ai.SuperLearner(classifier, param_grid=None, n_jobs=1, verbose=False)

SuperLearner, short for Supervised Learning, makes it easier to learn a model from an array or a raster using a scikit-learn algorithm. After fitting a model with fit(), you can predict with predict_image() or predict_array().

Parameters
  • classifier (algorithm compatible with scikit-learn) – For example, RandomForestClassifier(n_estimators=100), imported with from sklearn.ensemble import RandomForestClassifier.

  • param_grid (None or dict, optional (default=None)) – Parameter grid for the grid search. E.g. for RandomForestClassifier: param_grid=dict(n_estimators=[10,100],max_features=[1,3])

  • n_jobs (int, optional (default=1)) – Number of cores used by scikit-learn during the grid search.

  • verbose (bool or int, optional (default=False)) – Verbosity level. The higher the value, the more progress messages are shown.
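
As a rough illustration of how a param_grid expands into grid-search candidates, the sketch below enumerates every combination of a dict-of-lists grid using only the standard library. The helper expand_param_grid is hypothetical and is not part of museotoolbox or scikit-learn. The fold count assumes n_splits * n_repeats folds, which is consistent with the log line "Fitting 10 folds for each of 2 candidates, totalling 20 fits" in the Examples section.

```python
from itertools import product


def expand_param_grid(param_grid):
    """Return every parameter combination of a dict-of-lists grid.

    Hypothetical helper for illustration only; it is not the
    museotoolbox or scikit-learn implementation.
    """
    names = sorted(param_grid)
    value_lists = [param_grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Grid from the param_grid description: 2 x 2 = 4 candidates.
doc_grid = dict(n_estimators=[10, 100], max_features=[1, 3])
doc_candidates = expand_param_grid(doc_grid)

# Grid from the Examples section: 2 candidates.
example_grid = dict(n_estimators=[100, 200])
example_candidates = expand_param_grid(example_grid)

# Assumption: n_splits=2 repeated n_repeats=5 times gives 10 folds.
n_folds = 2 * 5
total_fits = len(example_candidates) * n_folds

print(f"{len(doc_candidates)} candidates for the first grid")
print(f"Fitting {n_folds} folds for each of {len(example_candidates)} "
      f"candidates, totalling {total_fits} fits")
```

Every extra value in a grid multiplies the number of fits, so it can help to start with a small grid and a modest number of folds.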

Examples

>>> import museotoolbox as mtb
>>> from sklearn.ensemble import RandomForestClassifier
>>> X,y = mtb.datasets.load_historical_data(return_X_y=True)
>>> RS50 = mtb.cross_validation.RandomStratifiedKFold(n_splits=2,n_repeats=5,
...         random_state=12,verbose=False)
>>> classifier = RandomForestClassifier()
>>> SL = mtb.ai.SuperLearner(verbose=True,classifier=classifier)
>>> SL.fit(X,y,cv=RS50,param_grid=dict(n_estimators=[100,200]))
Fitting 10 folds for each of 2 candidates, totalling 20 fits
best score : 0.966244859222
best n_estimators : 200
>>> for kappa in SL.get_stats_from_cv(confusion_matrix=False,kappa=True):
...     print(kappa)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
{'kappa': 0.94145803865870303}
{'kappa': 0.94275572196698443}
{'kappa': 0.94566553229314054}
{'kappa': 0.94210064101370472}
{'kappa': 0.94566137634353153}
{'kappa': 0.94085890364956737}
{'kappa': 0.94136385707385184}
{'kappa': 0.9383201352573155}
{'kappa': 0.93887726891376944}
{'kappa': 0.94450020549861891}
[Parallel(n_jobs=-1)]: Done  10 out of  10 | elapsed:    8.7s finished
>>> SL.predict_image(raster,'/tmp/classification.tif')
Total number of blocks : 15
Prediction...  [########################################]100%
Saved /tmp/classification.tif using function predictArray
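
The kappa values printed above are agreement scores corrected for chance agreement. A minimal sketch of Cohen's kappa computed from a confusion matrix follows, assuming rows hold reference labels and columns hold predicted labels. The function cohen_kappa and the sample matrix are illustrative only; this is not museotoolbox's own implementation, which may compute the statistic differently.

```python
def cohen_kappa(confusion_matrix):
    """Cohen's kappa from a square confusion matrix (list of lists).

    Illustrative helper, not part of museotoolbox.
    """
    n_classes = len(confusion_matrix)
    total = sum(sum(row) for row in confusion_matrix)

    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion_matrix[i][i] for i in range(n_classes)) / total

    # Expected agreement by chance, from the row and column totals.
    row_totals = [sum(row) for row in confusion_matrix]
    col_totals = [sum(row[j] for row in confusion_matrix)
                  for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    return (observed - expected) / (1 - expected)


# Made-up two-class matrix: 85 of 100 samples are classified correctly.
sample_matrix = [[45, 5],
                 [10, 40]]
kappa = cohen_kappa(sample_matrix)
print(round(kappa, 3))
```

A kappa of 1 means perfect agreement, while 0 means agreement no better than chance.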

Methods

__init__(classifier[, param_grid, n_jobs, …])

SuperLearner, short for Supervised Learning, makes it easier to learn a model from an array or a raster using a scikit-learn algorithm.

customize_array(xFunction, **kwargs)

fit(X, y[, group, standardize, cv, scoring, …])

Fit model from array.

get_stats_from_cv([confusion_matrix, kappa, …])

Extract statistics from the Cross-Validation.

load_model(path)

Load model previously saved with SuperLearner.save_model(path).

predict_array(X)

Predict label from array.

predict_confidence_per_class(X)

Predict confidence for each class.

predict_higher_confidence(X)

Get confidence of the predicted label.

predict_image(in_image, out_image[, …])

Predict label from raster using a previously learned model.

save_cm_from_cv(savePath[, prefix, header, …])

Save each confusion matrix (csv format) from cross-validation.

save_model(path)

Save the model (for example ‘myModel.npz’) so it can be loaded later via SuperLearner.load_model(path).

standardize_array([X])

Scale X data using StandardScaler from sklearn.
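
To make the scaling step concrete, here is a standard-library sketch of the per-column transformation that StandardScaler applies: subtract the column mean, then divide by the population standard deviation. The function standardize_columns and the sample data are hypothetical; this is not how standardize_array is implemented, only an illustration of the arithmetic.

```python
from statistics import fmean, pstdev


def standardize_columns(rows):
    """Scale each column to zero mean and unit variance.

    Illustrative helper, not part of museotoolbox or scikit-learn.
    """
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column has zero deviation; divide by 1.0 instead
    # so the result is 0.0 rather than a division by zero.
    scales = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mean) / scale
         for value, mean, scale in zip(row, means, scales)]
        for row in rows
    ]


# Made-up data: the second column is constant.
X = [[1.0, 10.0],
     [2.0, 10.0],
     [3.0, 10.0]]
X_scaled = standardize_columns(X)
for row in X_scaled:
    print(row)
```

Standardizing is mostly useful for scale-sensitive classifiers; tree-based models such as RandomForestClassifier are largely unaffected by it.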