\documentclass[SKL-MASTER.tex]{subfiles}
\begin{document}
\Large
\section*{Building Models with Distance Metrics}
This chapter will cover the following topics:
\begin{itemize}
\item Using KMeans to cluster data
\item Optimizing the number of centroids
\item Assessing cluster correctness
\item Using MiniBatch KMeans to handle more data
\item Quantizing an image with KMeans clustering
\item Finding the closest objects in the feature space
\item Probabilistic clustering with Gaussian Mixture Models
\item Using KMeans for outlier detection
\item Using k-NN for regression
\end{itemize}
\newpage
\subsection*{Introduction}
% %In this chapter, we'll cover clustering.
Clustering is usually grouped with the unsupervised
techniques, which assume that we do not know the outcome variable. This leads
to ambiguity in outcomes and objectives in practice, but clustering can nevertheless be useful.
As we'll see, we can use clustering to ``localize'' our estimates in a supervised setting. This is
perhaps why clustering is so effective: it can handle a wide range of situations, and often
the results are, for lack of a better term, ``sane''.
%========================================================%
%% - Building Models with Distance Metrics
%% - Page 86
We'll walk through a wide variety of applications in this chapter, from image processing to
regression and outlier detection. Through these applications, we'll see that clustering can
often be viewed through a probabilistic or an optimization lens, and that different interpretations lead
to different trade-offs. We'll walk through how to fit the models so that you have the
tools to try out many of them when faced with a clustering problem.
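As a taste of what follows, here is a minimal sketch of fitting KMeans with scikit-learn; the dataset and parameter values are illustrative choices, not prescriptions:

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Synthetic data with three well-separated groups (illustrative values).
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# Fit KMeans and assign each point to its nearest centroid.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)

# One centroid per cluster, in the original feature space.
print(km.cluster_centers_.shape)
print(len(set(labels)))
```

Later recipes refine each step shown here: choosing `n_clusters`, assessing the resulting labels, and scaling the fit to larger datasets.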
\end{document}