Stratified-K-Fold

This example shows how to perform random stratified sampling in which each fold receives 50% of the samples of each class.

Import libraries

from museotoolbox.cross_validation import RandomStratifiedKFold
from museotoolbox import datasets,processing

Load HistoricalMap dataset

raster,vector = datasets.load_historical_data(low_res=True)
field = 'Class'
y = processing.read_vector_values(vector,field)

Create the cross-validation (CV)

SKF = RandomStratifiedKFold(n_splits=2,n_repeats=2,
                random_state=12,verbose=False)
for tr,vl in SKF.split(X=None,y=y):
    print(tr,vl)

Out:

[ 2  3  8  6  9 15 16 12 13] [ 0  7  1  4  5 14 10 11]
[ 0  1  7  4  5 14 15 10 11] [ 3  2  8  9  6 16 12 13]
[ 0  3  7  4  9 14 16 12 13] [ 8  1  2  5  6 15 10 11]
[ 1  2  8  5  6 14 15 10 11] [ 7  3  0  9  4 16 12 13]
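To make the idea behind the index pairs above concrete, here is a minimal NumPy sketch of a random stratified split. It is not museotoolbox's implementation: the function name `random_stratified_folds`, the `seed` parameter and the toy label array are assumptions made for illustration, and fold sizes may differ from those printed by `RandomStratifiedKFold`.

```python
import numpy as np


def random_stratified_folds(y, n_splits=2, seed=12):
    """Yield (train, valid) index arrays, splitting each class separately.

    Hypothetical helper for illustration only, not the museotoolbox API.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    folds = [[] for _ in range(n_splits)]
    for label in np.unique(y):
        # Shuffle the indices of this class, then cut them into n_splits chunks
        idx = rng.permutation(np.flatnonzero(y == label))
        for k, chunk in enumerate(np.array_split(idx, n_splits)):
            folds[k].extend(chunk.tolist())
    for k in range(n_splits):
        # Fold k is used for validation, the remaining folds for training
        valid = np.array(folds[k], dtype=int)
        train = np.array(
            [i for j in range(n_splits) if j != k for i in folds[j]], dtype=int
        )
        yield train, valid


# Toy labels (assumed for the example): 17 samples spread over 5 classes
toy_y = np.array([1] * 6 + [2] * 4 + [3] * 3 + [4] * 2 + [5] * 2)
toy_splits = list(random_stratified_folds(toy_y, n_splits=2, seed=12))
for train, valid in toy_splits:
    print(train, valid)
```

Because each class is shuffled and cut on its own, every class with at least `n_splits` samples appears in both the training and the validation part of every fold.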

Note

The split is computed as each fold is generated.

# Show the labels in each fold

for tr,vl in SKF.split(X=None,y=y):
    print(y[tr],y[vl])

Out:

[1 1 1 2 2 3 3 4 5] [1 1 1 2 2 3 4 5]
[1 1 1 2 2 3 3 4 5] [1 1 1 2 2 3 4 5]
[1 1 1 2 2 3 3 4 5] [1 1 1 2 2 3 4 5]
[1 1 1 2 2 3 3 4 5] [1 1 1 2 2 3 4 5]
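The label arrays above can be checked for balance by counting the samples per class on each side of the split. The sketch below copies one printed pair of label arrays by hand; the helper `class_counts` is a name introduced here for illustration, not part of museotoolbox.

```python
import numpy as np

# Labels copied from one line of the output above
train_labels = np.array([1, 1, 1, 2, 2, 3, 3, 4, 5])
valid_labels = np.array([1, 1, 1, 2, 2, 3, 4, 5])


def class_counts(labels):
    """Return a {class: number of samples} dictionary (illustrative helper)."""
    classes, counts = np.unique(labels, return_counts=True)
    return dict(zip(classes.tolist(), counts.tolist()))


train_counts = class_counts(train_labels)
valid_counts = class_counts(valid_labels)

for label in sorted(train_counts):
    n_train = train_counts[label]
    n_valid = valid_counts[label]
    share = n_train / (n_train + n_valid)
    print(f"class {label}: train={n_train}, valid={n_valid}, train share={share:.2f}")
```

Classes with an even number of polygons are split exactly in half. Class 3 has three polygons, so it can only be split two against one.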

Note

The first example is made with polygons only. When learning or predicting, all pixels are taken into account. To generate the full X and y arrays, extract the samples from the ROI:

X,y=processing.extract_ROI(raster,vector,field)

for tr,vl in SKF.split(X,y):
    print(tr,vl)
    print(tr.shape,vl.shape)

Out:

[   0    1    2 ... 2961 3160 3161] [ 999  398 2667 ... 2843 2842 3023]
(1583,) (1579,)
[   3    4    5 ... 2960 3023 3160] [1093 2607 2672 ... 2834  715 3161]
(1583,) (1579,)
[   1    2    4 ... 2961 3023 3161] [1477   51 1805 ... 2883 2833 3160]
(1583,) (1579,)
[   0    3    6 ... 2960 3023 3160] [2331 2317  999 ...  391  508 3161]
(1583,) (1579,)
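A quick sanity check on folds like these is to confirm that the training and validation indices never overlap and that together they cover every sample. The sketch below uses a hypothetical helper, `check_folds`, and a tiny hand-written split rather than the real dataset.

```python
import numpy as np


def check_folds(splits, n_samples):
    """Return (train size, valid size, overlap, covers all samples) per fold.

    Illustrative helper, not part of museotoolbox.
    """
    report = []
    for train, valid in splits:
        train, valid = np.asarray(train), np.asarray(valid)
        overlap = np.intersect1d(train, valid).size
        covers_all = np.union1d(train, valid).size == n_samples
        report.append((train.size, valid.size, overlap, covers_all))
    return report


# Hand-written toy folds over 5 samples (assumed for the example)
toy_folds = [
    (np.array([0, 1, 2]), np.array([3, 4])),
    (np.array([3, 4]), np.array([0, 1, 2])),
]
report = check_folds(toy_folds, n_samples=5)
print(report)
```

On the real data, the same helper could presumably be called as `check_folds(SKF.split(X, y), n_samples=len(y))`. In the output above, 1583 + 1579 = 3162, which matches the largest printed index (3161) plus one.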

Plot example

from __drawCVmethods import plotMethod
plotMethod('SKF-pixel')

