Learn with Random-Forest and compare Cross-Validation methods

This example shows how to run a classification with several cross-validation methods and compare the Kappa scores they produce.

Import libraries

from museotoolbox.ai import SuperLearner
from museotoolbox import cross_validation
from museotoolbox.processing import extract_ROI
from museotoolbox import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold

Load HistoricalMap dataset

raster,vector = datasets.load_historical_data(low_res=True)
field = 'Class'
group = 'uniquefid'
X,y,g = extract_ROI(raster,vector,field,group)
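
The third array, g, is read from the 'uniquefid' field, so it presumably holds one polygon identifier per sample. Group-aware cross-validation relies on such identifiers to keep all samples of one polygon on the same side of a split. The principle can be sketched in plain Python as follows; the helper `leave_one_group_out` and the toy group ids are hypothetical and only illustrate the idea, they are not museotoolbox's implementation.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, valid_idx), holding out one group at a time."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, grp in enumerate(groups) if grp != held_out]
        valid_idx = [i for i, grp in enumerate(groups) if grp == held_out]
        yield held_out, train_idx, valid_idx


# Toy group ids: six samples drawn from three polygons (made-up values).
toy_groups = [1, 1, 2, 2, 3, 3]

splits = list(leave_one_group_out(toy_groups))
for held_out, train_idx, valid_idx in splits:
    print(held_out, train_idx, valid_idx)
```

In every split, the group ids found in the training indices and in the validation indices are disjoint, which is the property a random split does not guarantee.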

Initialize Random-Forest

classifier = RandomForestClassifier(random_state=12,n_jobs=1)

Create a list of cross-validation methods

CVs = [cross_validation.RandomStratifiedKFold(n_splits=2),
       cross_validation.LeavePSubGroupOut(valid_size=0.5),
       cross_validation.LeaveOneSubGroupOut(),
       StratifiedKFold(n_splits=2,shuffle=True) #from sklearn
       ]

kappas = []

for cv in CVs:
    # Fit the classifier, tuning n_estimators, with the current cross-validation
    SL = SuperLearner(classifier=classifier, param_grid=dict(n_estimators=[50, 100]), n_jobs=1)
    SL.fit(X, y, group=g, cv=cv)
    print('Kappa for ' + type(cv).__name__)
    cvKappa = []

    # Collect the Kappa of each fold
    for stats in SL.get_stats_from_cv(confusion_matrix=False, kappa=True):
        print(stats['kappa'])
        cvKappa.append(stats['kappa'])

    kappas.append(cvKappa)

    print(20 * '=')

Out:

Kappa for RandomStratifiedKFold
0.9177889428851054
0.8989253671111543
====================
Kappa for LeavePSubGroupOut
0.7948119033434562
0.7078871023125154
====================
Kappa for LeaveOneSubGroupOut
0.8970485707645829
0.7571515766489976
====================
Kappa for StratifiedKFold
0.9105668061997964
0.8981956129176757
====================
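
For readers unfamiliar with the statistic printed above: Cohen's Kappa compares the observed agreement between reference and predicted labels with the agreement expected by chance. The following is a minimal sketch of that formula in plain Python, assuming a square confusion matrix with reference classes in rows; the function `cohen_kappa` and the toy matrix are hypothetical and are not how museotoolbox computes the value.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: reference, columns: prediction)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of the marginal totals, summed over classes
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    # Undefined (division by zero) when chance agreement is exactly 1
    return (observed - expected) / (1 - expected)


# Toy two-class confusion matrix (made-up counts).
toy_confusion = [[45, 5], [10, 40]]
kappa = cohen_kappa(toy_confusion)
print(round(kappa, 3))
```

With these toy counts the observed agreement is 0.85 and the chance agreement is 0.5, so the Kappa is 0.7.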

Plot example

from matplotlib import pyplot as plt
plt.title('Kappa according to Cross-validation methods')
plt.boxplot(kappas,labels=[str(type(i).__name__) for i in CVs], patch_artist=True)
plt.grid()
plt.ylabel('Kappa')
plt.xticks(rotation=15)
plt.show()
[Figure: box plot of Kappa according to cross-validation methods]
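
With only two folds per method, a numeric summary can be easier to read than a box plot. The sketch below recomputes the mean and range per method from the Kappa values printed in the output above; the dictionary `kappa_by_cv` is a hand-copied stand-in for the `kappas` list, introduced here only so the snippet runs on its own.

```python
from statistics import mean

# Kappa values copied from the printed output (two folds per method).
kappa_by_cv = {
    'RandomStratifiedKFold': [0.9177889428851054, 0.8989253671111543],
    'LeavePSubGroupOut': [0.7948119033434562, 0.7078871023125154],
    'LeaveOneSubGroupOut': [0.8970485707645829, 0.7571515766489976],
    'StratifiedKFold': [0.9105668061997964, 0.8981956129176757],
}

# For each method: (mean Kappa, difference between best and worst fold)
summary = {name: (mean(values), max(values) - min(values))
           for name, values in kappa_by_cv.items()}

for name, (avg, spread) in summary.items():
    print('{:<22} mean={:.3f} range={:.3f}'.format(name, avg, spread))
```

In this run the two group-based methods score a lower mean Kappa than the two random stratified ones. One plausible reading is that random splits place samples from the same polygon in both training and validation sets, which tends to make the scores optimistic.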
