RandomStratifiedKFold

class museotoolbox.cross_validation.RandomStratifiedKFold(n_splits=2, n_repeats=False, valid_size=False, random_state=False, verbose=False)

Generate a cross-validation with fully random selection and stratified K-Fold (the same percentage of samples per class in each fold).

Parameters
  • n_splits (int, optional (default=2)) – Number of splits. With 2, each class is divided 50% for training and 50% for validation.

  • n_repeats (integer or False, optional (default=False)) – Number of times the splits are repeated. If False, the n_splits folds are generated only once.

  • valid_size (int or False, optional (default=False)) – If False, valid size is 1 / n_splits.

  • random_state (integer or None, optional (default=False)) – If int, random_state is the seed used by the random number generator. If None, the random number generator is created with time.time().

  • verbose (integer or False, optional (default=False)) – Controls the verbosity: the higher the value, the more detailed the messages.
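To make the roles of n_splits and random_state concrete, here is a minimal pure-Python sketch of random stratified K-fold splitting. It is an illustration of the general technique only, not museotoolbox's implementation: the function name `random_stratified_kfold` and its `seed` argument are hypothetical, and n_repeats and valid_size are not modelled.

```python
import random
from collections import defaultdict


def random_stratified_kfold(y, n_splits=2, seed=None):
    """Yield (train, valid) index lists that keep class proportions per fold.

    Hypothetical helper for illustration; not part of museotoolbox.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)

    # Shuffle each class independently, then deal its indices
    # round-robin into n_splits chunks.
    chunks = [[] for _ in range(n_splits)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            chunks[pos % n_splits].append(idx)

    # Each chunk serves exactly once as the validation set.
    for k in range(n_splits):
        valid = sorted(chunks[k])
        train = sorted(
            i for j, chunk in enumerate(chunks) if j != k for i in chunk
        )
        yield train, valid


# Toy labels: six samples of class 1, four samples of class 2.
y = [1] * 6 + [2] * 4
folds = list(random_stratified_kfold(y, n_splits=2, seed=12))
for train, valid in folds:
    print(train, valid)
```

With n_splits=2, each validation fold in this sketch holds half of every class, and reusing the same seed reproduces the same folds.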

Example

>>> from museotoolbox.cross_validation import RandomStratifiedKFold
>>> from museotoolbox import datasets
>>> X,y = datasets.load_historical_data(return_X_y=True)
>>> RSK = RandomStratifiedKFold(n_splits=2,random_state=12,verbose=False)
>>> for tr,vl in RSK.split(X=X,y=y):
...     print(tr,vl)
[ 1600  1601  1605 ...,  9509  9561 10322] [ 3632  1988 11480 ..., 10321  9457  9508]
[ 1599  1602  1603 ...,  9508  9560 10321] [ 3948 10928  3490 ..., 10322  9458  9561]
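One way to check that folds such as those printed above are stratified is to compare the class shares of the training and validation indices. The sketch below assumes only that each fold is a pair of index sequences, as the example output suggests. The helper `class_shares` and the toy folds are hypothetical stand-ins, so the snippet runs without museotoolbox.

```python
from collections import Counter


def class_shares(indices, y):
    """Return {label: fraction of `indices` carrying that label}."""
    counts = Counter(y[i] for i in indices)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Toy labels and hand-written folds standing in for the output of split().
y = ["a", "a", "a", "a", "b", "b"]
toy_folds = [([0, 1, 4], [2, 3, 5]), ([2, 3, 5], [0, 1, 4])]

for train, valid in toy_folds:
    # In a stratified split, both sides show (nearly) the same shares.
    print(class_shares(train, y), class_shares(valid, y))
```

On real data the shares will usually match only approximately, because class counts rarely divide evenly by n_splits.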

Manage cross-validation methods to generate the train/valid sample pairs.

split(X,y,g) : Function.

Returns an in-memory cross-validation that can be used directly in scikit-learn.

save_to_vector() : Needs a default output name (str).

Saves as many vector files (train/valid) as the cross-validation method outputs.

get_supported_extensions() : Function.

Lists the vector file extensions supported by save_to_vector.

reinitialize() : Function.

To regenerate the cross-validation, reinitialize it first.

Methods

__init__([n_splits, n_repeats, valid_size, …])

Manage cross-validation methods to generate the train/valid sample pairs.

get_n_splits([X, y, groups])

Returns the number of splitting iterations in the cross-validator.

get_supported_extensions()

reinitialize()

save_to_vector(vector, field[, group, …])

Save each fold of the cross-validation to vector files.

split(X, y[, groups])

Split the vector/array according to y and groups.