.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/applications/plot_face_recognition.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_applications_plot_face_recognition.py>`
        to download the full example code or to run this example in your browser via JupyterLite or Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_applications_plot_face_recognition.py:


===================================================
Faces recognition example using eigenfaces and SVMs
===================================================

The dataset used in this example is a preprocessed excerpt of the
"Labeled Faces in the Wild", aka LFW_:

  http://vis-www.cs.umass.edu/lfw/lfw-funneled.tgz (233MB)

.. _LFW: http://vis-www.cs.umass.edu/lfw/

.. GENERATED FROM PYTHON SOURCE LINES 15-29

.. code-block:: default

    from time import time
    import matplotlib.pyplot as plt

    from sklearn.model_selection import train_test_split
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.datasets import fetch_lfw_people
    from sklearn.metrics import classification_report
    from sklearn.metrics import ConfusionMatrixDisplay
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.svm import SVC
    from scipy.stats import loguniform


.. GENERATED FROM PYTHON SOURCE LINES 30-31

Download the data, if not already on disk, and load it as NumPy arrays.

.. GENERATED FROM PYTHON SOURCE LINES 31-53

.. code-block:: default

    lfw_people = fetch_lfw_people(min_faces_per_person=70, resize=0.4)

    # introspect the images arrays to find the shapes (for plotting)
    n_samples, h, w = lfw_people.images.shape

    # for machine learning we use the 2D data array directly (as relative pixel
    # positions info is ignored by this model)
    X = lfw_people.data
    n_features = X.shape[1]

    # the label to predict is the id of the person
    y = lfw_people.target
    target_names = lfw_people.target_names
    n_classes = target_names.shape[0]

    print("Total dataset size:")
    print("n_samples: %d" % n_samples)
    print("n_features: %d" % n_features)
    print("n_classes: %d" % n_classes)


.. GENERATED FROM PYTHON SOURCE LINES 54-55

Split into a training set and a test set, keeping 25% of the data for testing.

.. GENERATED FROM PYTHON SOURCE LINES 55-64

.. code-block:: default

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42
    )

    # fit the scaler on the training set only, then apply it to both sets
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)


.. GENERATED FROM PYTHON SOURCE LINES 65-67

Compute a PCA (eigenfaces) on the face dataset (treated as an unlabeled
dataset): unsupervised feature extraction / dimensionality reduction.

.. GENERATED FROM PYTHON SOURCE LINES 67-86

.. code-block:: default

    n_components = 150

    print(
        "Extracting the top %d eigenfaces from %d faces" % (n_components, X_train.shape[0])
    )
    t0 = time()
    pca = PCA(n_components=n_components, svd_solver="randomized", whiten=True).fit(X_train)
    print("done in %0.3fs" % (time() - t0))

    eigenfaces = pca.components_.reshape((n_components, h, w))

    print("Projecting the input data on the eigenfaces orthonormal basis")
    t0 = time()
    X_train_pca = pca.transform(X_train)
    X_test_pca = pca.transform(X_test)
    print("done in %0.3fs" % (time() - t0))


.. GENERATED FROM PYTHON SOURCE LINES 87-88

Train an SVM classification model.

.. GENERATED FROM PYTHON SOURCE LINES 88-104

.. code-block:: default

    print("Fitting the classifier to the training set")
    t0 = time()
    # distributions to sample from (despite the name, this is not a fixed grid)
    param_grid = {
        "C": loguniform(1e3, 1e5),
        "gamma": loguniform(1e-4, 1e-1),
    }
    clf = RandomizedSearchCV(
        SVC(kernel="rbf", class_weight="balanced"), param_grid, n_iter=10
    )
    clf = clf.fit(X_train_pca, y_train)
    print("done in %0.3fs" % (time() - t0))
    print("Best estimator found by randomized search:")
    print(clf.best_estimator_)


.. GENERATED FROM PYTHON SOURCE LINES 105-106

Quantitative evaluation of the model quality on the test set.

.. GENERATED FROM PYTHON SOURCE LINES 106-120

.. code-block:: default

    print("Predicting people's names on the test set")
    t0 = time()
    y_pred = clf.predict(X_test_pca)
    print("done in %0.3fs" % (time() - t0))

    print(classification_report(y_test, y_pred, target_names=target_names))
    ConfusionMatrixDisplay.from_estimator(
        clf, X_test_pca, y_test, display_labels=target_names, xticks_rotation="vertical"
    )
    plt.tight_layout()
    plt.show()


.. GENERATED FROM PYTHON SOURCE LINES 121-122

Qualitative evaluation of the predictions using matplotlib.

.. GENERATED FROM PYTHON SOURCE LINES 122-136

.. code-block:: default

    def plot_gallery(images, titles, h, w, n_row=3, n_col=4):
        """Helper function to plot a gallery of portraits"""
        plt.figure(figsize=(1.8 * n_col, 2.4 * n_row))
        plt.subplots_adjust(bottom=0, left=0.01, right=0.99, top=0.90, hspace=0.35)
        for i in range(n_row * n_col):
            plt.subplot(n_row, n_col, i + 1)
            plt.imshow(images[i].reshape((h, w)), cmap=plt.cm.gray)
            plt.title(titles[i], size=12)
            plt.xticks(())
            plt.yticks(())


.. GENERATED FROM PYTHON SOURCE LINES 137-138

Plot the result of the prediction on a portion of the test set.

.. GENERATED FROM PYTHON SOURCE LINES 138-151

.. code-block:: default

    def title(y_pred, y_test, target_names, i):
        # keep only the last word of each name (the surname) for a short title
        pred_name = target_names[y_pred[i]].rsplit(" ", 1)[-1]
        true_name = target_names[y_test[i]].rsplit(" ", 1)[-1]
        return "predicted: %s\ntrue: %s" % (pred_name, true_name)


    prediction_titles = [
        title(y_pred, y_test, target_names, i) for i in range(y_pred.shape[0])
    ]

    plot_gallery(X_test, prediction_titles, h, w)


.. GENERATED FROM PYTHON SOURCE LINES 152-153

Plot the gallery of the most significant eigenfaces.

.. GENERATED FROM PYTHON SOURCE LINES 153-159

.. code-block:: default

    eigenface_titles = ["eigenface %d" % i for i in range(eigenfaces.shape[0])]
    plot_gallery(eigenfaces, eigenface_titles, h, w)

    plt.show()


.. GENERATED FROM PYTHON SOURCE LINES 160-164

The face recognition problem would be solved much more effectively by training
convolutional neural networks, but this family of models is outside the scope
of the scikit-learn library. Interested readers should instead try PyTorch or
TensorFlow to implement such models.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 0.000 seconds)


.. _sphx_glr_download_auto_examples_applications_plot_face_recognition.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/scikit-learn/scikit-learn/main?urlpath=lab/tree/notebooks/auto_examples/applications/plot_face_recognition.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: lite-badge

      .. image:: images/jupyterlite_badge_logo.svg
        :target: ../../lite/lab/?path=auto_examples/applications/plot_face_recognition.ipynb
        :alt: Launch JupyterLite
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_face_recognition.py <plot_face_recognition.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_face_recognition.ipynb <plot_face_recognition.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
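
The randomized search in this example draws ``C`` and ``gamma`` from
``loguniform`` distributions. A log-uniform variable is one whose logarithm is
uniformly distributed, so each decade of the range is about equally likely to
be sampled. The sketch below illustrates that idea with the standard library
only. ``sample_loguniform`` is a hypothetical helper written for this
illustration and is not how ``scipy.stats.loguniform`` is implemented.

.. code-block:: python

    import math
    import random


    def sample_loguniform(low, high, rng):
        """Draw one value whose logarithm is uniform on [log(low), log(high)]."""
        return math.exp(rng.uniform(math.log(low), math.log(high)))


    rng = random.Random(0)
    samples = [sample_loguniform(1e3, 1e5, rng) for _ in range(10_000)]

    # Roughly half of the draws should fall in each decade:
    # [1e3, 1e4) and [1e4, 1e5].
    fraction_low_decade = sum(s < 1e4 for s in samples) / len(samples)
    print("fraction below 1e4: %0.3f" % fraction_low_decade)

A plain uniform distribution on the same interval would place about 90% of the
draws above ``1e4``, which is why a log-uniform prior is a common choice for
scale parameters such as ``C`` and ``gamma``.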
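
The scaling, PCA and SVM steps of this example can also be chained in a single
``Pipeline``, so that the random search refits the scaler and the PCA inside
each cross-validation fold. The following is a minimal sketch under stated
assumptions and is not part of the original example: it uses small synthetic
data from ``make_classification`` in place of the LFW faces (to avoid the
download), and the sizes, ``n_components=10``, ``n_iter=5`` and ``cv=3`` are
arbitrary illustrative values.

.. code-block:: python

    from scipy.stats import loguniform
    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.model_selection import RandomizedSearchCV, train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Synthetic stand-in for the face data (assumption: not the LFW dataset).
    X, y = make_classification(
        n_samples=300, n_features=50, n_informative=10, n_classes=3, random_state=0
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42
    )

    pipe = Pipeline(
        [
            ("scaler", StandardScaler()),
            (
                "pca",
                PCA(n_components=10, svd_solver="randomized", whiten=True, random_state=0),
            ),
            ("svc", SVC(kernel="rbf", class_weight="balanced")),
        ]
    )

    # Parameters of a pipeline step are addressed as "<step name>__<parameter>".
    param_distributions = {
        "svc__C": loguniform(1e3, 1e5),
        "svc__gamma": loguniform(1e-4, 1e-1),
    }
    search = RandomizedSearchCV(
        pipe, param_distributions, n_iter=5, cv=3, random_state=0
    )
    search.fit(X_train, y_train)

    y_pred = search.predict(X_test)
    print(search.best_params_)
    print("test accuracy: %0.3f" % search.score(X_test, y_test))

With this layout the fitted ``search`` object accepts raw feature arrays
directly, so no separate ``transform`` calls are needed before ``predict``.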