Further Topics¶
Optimizing GPU-Usage¶
If a GPU is used (kmeanstf does work also without one, just slower) certain optimizations are possible depending on the size of the GPU memory. These are described below.
Both KMeansTF
and TunnelKMeansTF
have a parameter max_mem
.
This parameter is used to decide if for a given k-means problem characterized by data set X[n,d] and number of
centroids (parameter cluster_centers, a.k.a. k) the central distance computation (distance of all centroids to all data points) can be
performed in one step or has to be done with subsets of the data. For the default data type tf.float32
(4 bytes) this distance computation requires a matrix of size
Therefore, if
the data set is split into sufficiently many parts that the above matrix for each part has at most size max_mem
and
the distances for all parts are computed separately.
In case of a
GPU memory error (which may occur depending on the combination of max_mem
value and actual size of the GPU memory) the value of max_mem
is
reduced by a specific fraction (currently 10%) until a suitable value is found.
This default value (1_800_000_000) for max_mem
has been empirically selected for a particular graphics card (NVidia GTX-1060 6GB).
It is considerably lower than the available GPU memory (6GB in this case) since also for other computations
some GPU-memory is required.
For a much larger or smaller GPU-memory it might make sense to provide a non-default value for the max_mem
parameter. By setting the parameter verbose
to 1 for
KMeansTF
or TunnelKMeansTF
one can observe messages about splitting the data set or about GPU-errors and can find a suitable value for the graphics card at hand.
Differences between kmeanstf and scikit-learns’s KMeans class¶
The following performance-related functionalities of scikit-learn’s
KMeans class
were not implemented in KMeansTF
and TunnelKMeansTF
since kmeanstf provides data parallelism via GPU usage:
- elkan algorithm (parameter algorithm, value ‘elkan’)
- support for job-level parallelism (parameter n_jobs)
- precomputing distances (parameters precompute_distances, copy_x)