Code covered by the BSD License

4.0 | 11 ratings | 72 Downloads (last 30 days) | File Size: 2.02 KB | File ID: #19344

# Efficient K-Means Clustering using JIT

by Yi Cao

27 Mar 2008

A simple but fast tool for K-means clustering

File Information
Description

This is a tool for K-means clustering. After trying several different implementations, I concluded that using simple loops for the distance calculation and comparison is the most efficient and accurate approach, thanks to the JIT acceleration in MATLAB.

The code is very simple and well documented, so it is suitable for beginners learning the k-means clustering algorithm.

Numerical comparisons show that this tool can be several times faster than kmeans in the Statistics Toolbox.
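The loop-based approach described above can be illustrated with a minimal sketch. This is not the submission's actual code: the function name `kmeans_loop_sketch`, the iteration cap `maxIter`, and the choice to leave an empty cluster's centroid unchanged are all assumptions made for illustration, and the data are assumed to be finite.

```matlab
function [c, idx] = kmeans_loop_sketch(X, c)
% Hypothetical illustration of loop-based k-means (not the submission's code).
% X : n-by-d data matrix, one point per row
% c : k-by-d initial centroids (returned updated)
% idx : n-by-1 cluster index of each point
n = size(X, 1);
k = size(c, 1);
idx = zeros(n, 1);
maxIter = 100;                        % assumed safety cap on iterations
for iter = 1:maxIter
    changed = false;
    % Assignment step: plain loops over points and centroids
    for i = 1:n
        best = inf;
        bestj = 0;
        for j = 1:k
            delta = X(i, :) - c(j, :);
            dist = delta * delta';    % squared Euclidean distance
            if dist < best
                best = dist;
                bestj = j;
            end
        end
        if idx(i) ~= bestj
            idx(i) = bestj;
            changed = true;
        end
    end
    if ~changed
        break;                        % assignments are stable, so stop
    end
    % Update step: each centroid becomes the mean of its members
    for j = 1:k
        members = (idx == j);
        if any(members)               % an empty cluster keeps its old centroid
            c(j, :) = mean(X(members, :), 1);
        end
    end
end
end
```

A call such as `[c, idx] = kmeans_loop_sketch(X, X(1:3, :))` would use the first three rows of `X` as initial centroids. Because the sketch works on one row at a time, it handles data of any dimension, including 3-D points.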

Acknowledgements

This file inspired Patch Color Selector.

MATLAB release: MATLAB 7.5 (R2007b)

### Ningamma Husenappa (22 Feb 2015)

Please provide the code.

### Junqi WANG (24 Nov 2014)

### Nikolay S. (12 Jul 2012)

### leila (03 Jul 2012)

Does the code support 3d data?

Comment only
### S.Karthi (29 Feb 2012)

### Maxime (13 May 2011)

Although it is not a perfect fix for the issue described in my earlier comment (clusters going missing, with their centroids replaced by NaN), adding the following two lines after the centroid update solved the problem in my case:

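% Find centroids that became NaN (clusters left with no points)
idnan = find(isnan(c(:,1)));
% Reseed each of them with a randomly chosen data point (assuming n is the number of points in X)
c(idnan,:) = X(randi(n,length(idnan),1),:);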

Comment only
### Maxime (13 May 2011)

Pretty fast indeed!

However, the requested number of clusters is sometimes not respected: the algorithm yields fewer clusters and replaces the remaining centroids with NaN. This can be inconvenient.

### Tim Benham (06 Apr 2011)

The function fails to terminate on some inputs. For an example, see http://snipt.org/wpkI

### Nandha (17 Aug 2010)

### Edgar Kraft (08 Jul 2009)

The code is very nice and well documented. In some cases, however, the clusters are not properly identified if no initial centroid vectors are provided. This could be improved by automatically trying a small number of different random initial guesses and choosing the configuration that yields the smallest sum of distances between points and centroids.
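The restart strategy suggested here could be sketched roughly as follows. This is only an illustration, not part of the submission: the wrapper name `kmeans_restarts_sketch`, the parameter `nTrials`, and the function handle `clusterFcn` are hypothetical. `clusterFcn` is assumed to have the form `[c, idx] = clusterFcn(X, c0)`, returning k-by-d centroids and a valid cluster index (1 to k) for every point, and `k` is assumed to be no larger than the number of points.

```matlab
function [bestC, bestIdx, bestCost] = kmeans_restarts_sketch(X, k, nTrials, clusterFcn)
% Hypothetical wrapper: run a clustering function from several random
% initial guesses and keep the result with the smallest total distance.
% clusterFcn is assumed to be a handle with the form
%   [c, idx] = clusterFcn(X, c0)
n = size(X, 1);
bestCost = inf;
bestC = [];
bestIdx = [];
for t = 1:nTrials
    p = randperm(n);
    c0 = X(p(1:k), :);                      % k distinct data points as the initial guess
    [c, idx] = clusterFcn(X, c0);
    delta = X - c(idx, :);                  % offset of each point from its own centroid
    cost = sum(sqrt(sum(delta.^2, 2)));     % sum of point-to-centroid distances
    if cost < bestCost                      % a NaN cost never wins this comparison
        bestCost = cost;
        bestC = c;
        bestIdx = idx;
    end
end
end
```

Because a run whose centroids contain NaN produces a NaN cost, such a run is never selected, so this wrapper would also sidestep the missing-cluster results whenever at least one trial succeeds.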

### V. Poor (05 Apr 2009)

### Mo Chen (16 Mar 2009)

### nicola rebagliati (18 May 2008)

This works, and examples and comparisons are provided.