PK 1PV/ / ai/learnWithRFandCompareCV.ipynb{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\nLearn with Random-Forest and compare Cross-Validation methods\n===============================================================\n\nThis example shows how to run a classification and compare different cross-validation methods.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import libraries\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from museotoolbox.ai import SuperLearner\nfrom museotoolbox import cross_validation\nfrom museotoolbox.processing import extract_ROI\nfrom museotoolbox import datasets\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.model_selection import StratifiedKFold" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load HistoricalMap dataset\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "raster,vector = datasets.load_historical_data(low_res=True)\nfield = 'Class'\ngroup = 'uniquefid'\nX,y,g = extract_ROI(raster,vector,field,group)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initialize Random-Forest\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "classifier = RandomForestClassifier(random_state=12,n_jobs=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create list of different CV\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "CVs = [cross_validation.RandomStratifiedKFold(n_splits=2),\n 
 cross_validation.LeavePSubGroupOut(valid_size=0.5),\n cross_validation.LeaveOneSubGroupOut(),\n StratifiedKFold(n_splits=2,shuffle=True) # from sklearn\n ]\n\nkappas=[]\n\nfor cv in CVs:\n    SL = SuperLearner(classifier=classifier,param_grid=dict(n_estimators=[50,100]),n_jobs=1)\n    SL.fit(X,y,group=g,cv=cv)\n    print('Kappa for '+str(type(cv).__name__))\n    cvKappa = []\n\n    for stats in SL.get_stats_from_cv(confusion_matrix=False,kappa=True):\n        print(stats['kappa'])\n        cvKappa.append(stats['kappa'])\n\n    kappas.append(cvKappa)\n\n    print(20*'=')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot example\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from matplotlib import pyplot as plt\nplt.title('Kappa according to Cross-validation methods')\nplt.boxplot(kappas,labels=[str(type(i).__name__) for i in CVs], patch_artist=True)\nplt.grid()\nplt.ylabel('Kappa')\nplt.xticks(rotation=15)\nplt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 0 }PK 1P.f0 0 ai/learnWithRFandRS50.ipynb{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\nLearn with Random-Forest and Random Sampling 50% (RS50)\n========================================================\n\nThis example shows how to perform a random sampling that keeps \n50% of each class.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import libraries\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, 
"metadata": { "collapsed": false }, "outputs": [], "source": [ "from museotoolbox.ai import SuperLearner\nfrom museotoolbox.cross_validation import RandomStratifiedKFold\nfrom museotoolbox.processing import extract_ROI\nfrom museotoolbox import datasets\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn import metrics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load HistoricalMap dataset\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "raster,vector = datasets.load_historical_data(low_res=True)\nfield = 'Class'\nX,y = extract_ROI(raster,vector,field)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create CV\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SKF = RandomStratifiedKFold(n_splits=2,\n random_state=12,verbose=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initialize Random-Forest and metrics\n--------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "classifier = RandomForestClassifier(random_state=12,n_jobs=1)\n\n# \nkappa = metrics.make_scorer(metrics.cohen_kappa_score)\nf1_mean = metrics.make_scorer(metrics.f1_score,average='micro')\nscoring = dict(kappa=kappa,f1_mean=f1_mean,accuracy='accuracy')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start learning\n---------------------------\nsklearn will compute different metrics, but will keep best results from kappa (refit='kappa')\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SL = SuperLearner(classifier=classifier,param_grid = dict(n_estimators=[10]),n_jobs=1,verbose=1)\n\nSL.fit(X,y,cv=SKF,scoring=kappa)\n\n\n# 
=============================================================================\n# ##############################################################################\n# # Read the model\n# # -------------------\n# print(SL.model)\n# print(SL.model.cv_results_)\n# print(SL.model.best_score_)\n# \n# ##############################################################################\n# # Get F1 for every class from best params\n# # -----------------------------------------------\n# \n# for stats in SL.get_stats_from_cv(confusion_matrix=False,F1=True):\n# print(stats['F1'])\n# \n# ##############################################################################\n# # Get each confusion matrix from folds\n# # -----------------------------------------------\n# \n# for stats in SL.get_stats_from_cv(confusion_matrix=True):\n# print(stats['confusion_matrix'])\n# \n# ##############################################################################\n# # Save each confusion matrix from folds\n# # -----------------------------------------------\n# \n# SL.save_cm_from_cv('/tmp/testMTB/',prefix='RS50_')\n# \n# =============================================================================" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Predict map\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SL.predict_image(raster,'/tmp/classification.tif',\n higher_confidence='/tmp/confidence.tif',\n confidence_per_class='/tmp/confidencePerClass.tif')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot example\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from matplotlib import pyplot as plt\nfrom osgeo import gdal\nsrc=gdal.Open('/tmp/classification.tif')\nplt.imshow(src.GetRasterBand(1).ReadAsArray(),cmap=plt.get_cmap('tab20'))\nplt.axis('off')\nplt.show()" ] } ], "metadata": { "kernelspec": { 
"display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 0 }PK 1PgL+ + ai/SFFS.ipynb{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\nSequential Forward Feature Selection (SFFS)\n========================================================\n\nThis example shows how to select the best features with a Sequential Forward Feature Selection (SFFS), \nusing a Random-Forest classifier.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import libraries\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from museotoolbox.ai import SequentialFeatureSelection\nfrom museotoolbox.cross_validation import LeavePSubGroupOut\nfrom museotoolbox import datasets\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn import metrics\nimport numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load HistoricalMap dataset\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "X,y,g = datasets.load_historical_data(return_X_y_g=True,low_res=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create CV\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "LSGO = LeavePSubGroupOut(valid_size=0.8,n_repeats=2,\n random_state=12,verbose=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initialize Random-Forest and 
metrics\n--------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "classifier = RandomForestClassifier(random_state=12,n_jobs=1)\n\nf1 = metrics.make_scorer(metrics.f1_score)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set and fit the Sequential Feature Selection\n---------------------------------------------------------------\n\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SFFS = SequentialFeatureSelection(classifier=classifier,param_grid=dict(n_estimators=[10,20]),verbose=False)\n\nSFFS.fit(X.astype(float),y,g,cv=LSGO,max_features=3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Show best features and score\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "print('Best features are : '+str(SFFS.best_features_))\nprint('F1 are : '+str(SFFS.best_scores_))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Predict the classification from the best combination of features\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SFFS.predict_best_combination(datasets.load_historical_data()[0],'/tmp/SFFS/best_classification.tif')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot example\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from matplotlib import pyplot as plt\nplt.plot(np.arange(1,len(SFFS.best_scores_)+1),SFFS.best_scores_)\nplt.xlabel('Number of features')\nplt.xticks(np.arange(1,len(SFFS.best_scores_)+1))\nplt.ylabel('F1')\nplt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 
}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 0 }PK 1PhHL L ai/learnWithCustomRaster.ipynb{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\nLearn algorithm and customize your input raster without writing it on disk\n=============================================================================\n\nThis example shows how to customize your raster (NDVI, smoothed signal...) in the \nlearning process to avoid generating a new raster.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import libraries\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from museotoolbox.ai import SuperLearner\nfrom museotoolbox.processing import extract_ROI\nfrom museotoolbox import datasets\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn import metrics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load HistoricalMap dataset\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "raster,vector = datasets.load_historical_data(low_res=True)\nfield = 'Class'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initialize Random-Forest and metrics\n--------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "classifier = RandomForestClassifier(random_state=12,n_jobs=1)\n\nkappa = metrics.make_scorer(metrics.cohen_kappa_score)\nf1_mean = metrics.make_scorer(metrics.f1_score,average='micro')\nscoring = 
dict(kappa=kappa,f1_mean=f1_mean,accuracy='accuracy')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start learning\n---------------------------\nsklearn will compute different metrics, but will keep the best results according to the metric given to refit (here refit='f1_mean')\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SL = SuperLearner(classifier=classifier,param_grid=dict(n_estimators=[10]),n_jobs=1,verbose=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create or use custom function\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def reduceBands(X,bandToKeep=[0,2]):\n    # keep only the bands listed in bandToKeep (by default, indices 0 and 2)\n    X=X[:,bandToKeep].reshape(-1,len(bandToKeep))\n    return X\n\n# register this function in the SuperLearner\nSL.customize_array(reduceBands)\n\n# if you learn from vector, refit according to the f1_mean\nX,y = extract_ROI(raster,vector,field)\nSL.fit(X,y,cv=2,scoring=scoring,refit='f1_mean')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read the model\n-------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "print(SL.model)\nprint(SL.model.cv_results_)\nprint(SL.model.best_score_)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get F1 for every class from best params\n-----------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "for stats 
in SL.get_stats_from_cv(confusion_matrix=True):\n print(stats['confusion_matrix'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Save each confusion matrix from folds\n-----------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SL.save_cm_from_cv('/tmp/testMTB/',prefix='RS50_')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Predict map\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SL.predict_image(raster,'/tmp/classification.tif',\n higher_confidence='/tmp/confidence.tif',\n confidence_per_class='/tmp/confidencePerClass.tif')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot example\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from matplotlib import pyplot as plt\nfrom osgeo import gdal\nsrc=gdal.Open('/tmp/classification.tif')\nplt.imshow(src.GetRasterBand(1).ReadAsArray(),cmap=plt.get_cmap('tab20'))\nplt.axis('off')\nplt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 0 }PK 1Px charts/plotConfusionF1.ipynb{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\nPlot confusion matrix from Cross-Validation with F1\n========================================================\n\nPlot confusion matrix from Cross-Validation, with F1 as subplot.\n" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "Import libraries\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from museotoolbox.ai import SuperLearner\nfrom museotoolbox.cross_validation import RandomStratifiedKFold\nfrom museotoolbox.charts import PlotConfusionMatrix\nfrom museotoolbox import datasets\nfrom sklearn.ensemble import RandomForestClassifier" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load HistoricalMap dataset\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "X,y = datasets.load_historical_data(low_res=True,return_X_y=True)\nfield = 'Class'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create CV\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "RSKF = RandomStratifiedKFold(n_splits=2,\n random_state=12,verbose=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initialize Random-Forest\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "classifier = RandomForestClassifier()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start learning\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SL = SuperLearner(classifier=classifier,param_grid=dict(n_estimators=[10,50]))\nSL.fit(X,y,cv=RSKF)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get kappa from each fold\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "for stats in 
SL.get_stats_from_cv(confusion_matrix=False,kappa=True):\n print(stats['kappa'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get each confusion matrix from folds\n-----------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "cms = []\nfor stats in SL.get_stats_from_cv(confusion_matrix=True):\n cms.append(stats['confusion_matrix'])\n print(stats['confusion_matrix'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot confusion matrix\n-----------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\nmeanCM = np.mean(cms,axis=0).astype(np.int16)\npltCM = PlotConfusionMatrix(meanCM.T) # Transpose so that Y = prediction and X = truth\npltCM.add_text()\npltCM.add_f1()\npltCM.color_diagonal()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 0 }PK 1PZ\ charts/plotConfusion.ipynb{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\nPlot confusion matrix\n========================================================\n\nPlot confusion matrix from Cross-Validation.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import libraries\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from museotoolbox.ai 
import SuperLearner\nfrom museotoolbox.cross_validation import RandomStratifiedKFold\nfrom museotoolbox.charts import PlotConfusionMatrix\nfrom museotoolbox import datasets\nfrom sklearn.ensemble import RandomForestClassifier" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load HistoricalMap dataset\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "X,y = datasets.load_historical_data(low_res=True,return_X_y=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create CV\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "RSKF = RandomStratifiedKFold(n_splits=2,\n random_state=12,verbose=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initialize Random-Forest\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "classifier = RandomForestClassifier()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start learning\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SL = SuperLearner(classifier=classifier,param_grid=dict(n_estimators=[10,50]))\nSL.fit(X,y,cv=RSKF)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get kappa from each fold\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "for stats in SL.get_stats_from_cv(confusion_matrix=False,kappa=True):\n print(stats['kappa'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get each confusion matrix from folds\n-----------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": 
null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "cms = []\nfor stats in SL.get_stats_from_cv(confusion_matrix=True):\n cms.append(stats['confusion_matrix'])\n print(stats['confusion_matrix'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot confusion matrix\n-----------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\nmeanCM = np.mean(cms,axis=0).astype(np.int16)\npltCM = PlotConfusionMatrix(meanCM.T) # Transpose so that Y = prediction and X = truth\npltCM.add_text()\npltCM.color_diagonal()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 0 }PK 1P|u3S charts/plotConfusionAcc.ipynb{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\nPlot confusion matrix with User/Producer accuracy\n========================================================\n\nPlot confusion matrix from Cross-Validation, with accuracy (user/producer) as subplot.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import libraries\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from museotoolbox.ai import SuperLearner\nfrom museotoolbox.cross_validation import RandomStratifiedKFold\nfrom museotoolbox.charts import PlotConfusionMatrix\nfrom museotoolbox import datasets\nfrom sklearn.ensemble import RandomForestClassifier" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "Load HistoricalMap dataset\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "X,y = datasets.load_historical_data(low_res=True,return_X_y=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create CV\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "RSKF = RandomStratifiedKFold(n_splits=2,\n random_state=12,verbose=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initialize Random-Forest\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "classifier = RandomForestClassifier()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start learning\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SL = SuperLearner(classifier=classifier,param_grid=dict(n_estimators=[10,100]))\nSL.fit(X,y,cv=RSKF)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get kappa from each fold\n---------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "for stats in SL.get_stats_from_cv(confusion_matrix=False,kappa=True):\n print(stats['kappa'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get each confusion matrix from folds\n-----------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "cms = []\nfor stats in SL.get_stats_from_cv(confusion_matrix=True):\n cms.append(stats['confusion_matrix'])\n print(stats['confusion_matrix'])" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "Plot confusion matrix\n-----------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\nfrom matplotlib.pyplot import cm as colorMap\n\n# Note: a bug in Sphinx prevents the whole plot from being displayed.\n\nlabels = ['Forest','Agriculture','Bare soil','Water','Building']\nmeanCM = np.mean(cms,axis=0).astype(np.int16)\npltCM = PlotConfusionMatrix(meanCM.T) # Transpose so that Y = prediction and X = truth\npltCM.add_text()\npltCM.add_x_labels(labels,rotation=90)\npltCM.add_y_labels(labels)\npltCM.color_diagonal(diag_color=colorMap.Purples,matrix_color=colorMap.Reds)\npltCM.add_accuracy()\npltCM.add_f1()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 0 }PK 2PO馺 + cross_validation/SpatialLeaveAsideOut.ipynb{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\nSpatial Leave-Aside-Out (SLAO)\n======================================================\n\nThis example shows how to perform a Spatial Leave-Aside-Out cross-validation.\n\nSee https://doi.org/10.1016/j.foreco.2013.07.059\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import libraries\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from museotoolbox.cross_validation import SpatialLeaveAsideOut\nfrom museotoolbox import datasets,processing" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "Load HistoricalMap dataset\n-------------------------------------------\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "raster,vector = datasets.load_historical_data()\nfield = 'Class'\nX,y = processing.extract_ROI(raster,vector,field)\ndistance_matrix = processing.get_distance_matrix(raster,vector)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create CV\n-------------------------------------------\nn_splits will be the number of the least populated class\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "SLOPO = SpatialLeaveAsideOut(valid_size=1/3,\n distance_matrix=distance_matrix,random_state=4)\n\nprint(SLOPO.get_n_splits(X,y))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
Split is made to generate each fold
The first one is made with polygons only.\n When learning/predicting, all pixels will be taken into account.\n To generate full X and y labels, extract samples from the ROI