Cluster (spike_sort.cluster)

Module with clustering algorithms.

Utility functions

Spike sorting is usually done with the cluster() function, which takes the name of one of the clustering methods (as a string) as its first argument.

Other functions help to manipulate the results:

cluster(method, features, *args, **kwargs) Automatically cluster spikes using the selected clustering method
split_cells(spt_dict, idx[, which]) Return the spike times belonging to the cluster and the rest
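
The idea of selecting a clustering method by its name can be sketched as a lookup from strings to functions. This is only an illustration of the pattern, not the module's actual code: the names cluster_by_name, METHODS, all_in_one and threshold are hypothetical, and the features structure is assumed to be a dict with a 'data' array, as in the example in the Reference section.

```python
import numpy as np


def all_in_one(features):
    """Hypothetical method: put every spike in cluster 0."""
    return np.zeros(len(features['data']), dtype=int)


def threshold(features, cut=0.5):
    """Hypothetical method: label spikes by whether the first feature exceeds cut."""
    return (features['data'][:, 0] > cut).astype(int)


# Hypothetical registry mapping method names to functions
METHODS = {'all_in_one': all_in_one, 'threshold': threshold}


def cluster_by_name(method, features, *args, **kwargs):
    """Look up a method by name and forward the remaining arguments to it."""
    try:
        func = METHODS[method]
    except KeyError:
        raise ValueError("unknown clustering method: %r" % (method,))
    return func(features, *args, **kwargs)


features = {'data': np.array([[0.1, 0.0], [0.9, 1.0], [0.2, 0.3]])}
labels = cluster_by_name('threshold', features, cut=0.5)
```

Extra positional and keyword arguments are passed through unchanged, which mirrors the *args, **kwargs in the cluster() signature.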

Clustering methods

Several different clustering methods are defined in the module. Each method should take at least one argument – the features structure.

k_means_plus(*args, **kwargs) k means with smart initialization.
gmm(data[, k, cvtype]) Cluster based on Gaussian mixture models
manual(data[, n_spikes]) Sort spikes manually by cluster cutting
none(data) Do nothing
k_means(features[, K]) Perform K means clustering
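
"Smart initialization" for k-means is commonly understood as k-means++ seeding, in which each new centre is drawn with probability proportional to its squared distance from the nearest centre already chosen. Whether k_means_plus uses exactly this scheme is an assumption; the sketch below only illustrates the seeding idea, and kmeans_pp_init is a hypothetical name.

```python
import numpy as np


def kmeans_pp_init(data, k, rng):
    """Choose k initial centres from the rows of data by k-means++ seeding.

    Assumes data has shape (n, m) and contains at least k distinct points.
    """
    n = data.shape[0]
    # The first centre is a uniformly random data point
    centres = [data[rng.integers(n)]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen centre
        diffs = data[:, None, :] - np.array(centres)[None, :, :]
        d2 = (diffs ** 2).sum(axis=2).min(axis=1)
        # Points far from all current centres are more likely to be picked
        probs = d2 / d2.sum()
        centres.append(data[rng.choice(n, p=probs)])
    return np.array(centres)


rng = np.random.default_rng(0)
data = np.array([[0., 0.], [0., 1.], [0., 0.9], [0.1, 0.]])
centres = kmeans_pp_init(data, 2, rng)
```

Because an already chosen point has zero distance to itself, it cannot be drawn again, so the returned centres are distinct data points.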

Reference

spike_sort.core.cluster.cluster(method, features, *args, **kwargs)

Automatically cluster spikes using the selected clustering method

Parameters:

method : string

name of the clustering method to use

features : dict

spike features data structure

n_clusters : int

number of clusters to identify

args, kwargs :

optional arguments that are passed to the clustering algorithm

Returns:

labels : array

array of cluster (unit) labels, one for each spike

Examples

Create a sample feature dataset and use k-means clustering to find groups of spikes (units)

>>> import spike_sort
>>> import numpy as np
>>> np.random.seed(1234)  # k_means uses random initialization
>>> features = {'data':np.array([[0.,0.],
...                              [0, 1.],
...                              [0, 0.9],
...                              [0.1,0]])}
>>> labels = spike_sort.cluster.cluster('k_means', features, 2)
>>> print(labels)
[0 1 1 0]

spike_sort.core.cluster.dist_euclidean(spike_waves1, spike_waves2=None)

Given spike_waves, calculate the pairwise Euclidean distances between them
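
A pairwise Euclidean distance computation can be sketched with NumPy broadcasting. The layout of the library's spike_waves structure is not described here, so this sketch assumes plain 2-D arrays with one waveform per row; pairwise_euclidean is a hypothetical name, not the library function.

```python
import numpy as np


def pairwise_euclidean(a, b=None):
    """Return the matrix of Euclidean distances between rows of a and rows of b.

    If b is omitted, distances are computed among the rows of a.
    Assumes each row is one waveform (an assumption of this sketch).
    """
    a = np.asarray(a, dtype=float)
    b = a if b is None else np.asarray(b, dtype=float)
    # diff[i, j, :] is the difference between row i of a and row j of b
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


waves = np.array([[0., 0.], [3., 4.], [0., 1.]])
dist = pairwise_euclidean(waves)
```

For a single set of waveforms the result is a symmetric matrix with zeros on the diagonal.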

spike_sort.core.cluster.gmm(data, k=2, cvtype='full')

Cluster based on Gaussian mixture models

Parameters:

data : dict

features structure

k : int

number of clusters

Returns:

cl : int array

cluster indices

Notes

This function requires scikit-learn
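
For orientation, Gaussian mixture clustering with a current scikit-learn release might look like the sketch below. It calls scikit-learn directly rather than gmm(), and it assumes that k corresponds to n_components and cvtype to covariance_type; the synthetic data are made up for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Two well-separated synthetic groups of 2-D feature vectors (illustrative data)
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.3, size=(50, 2))
features = {'data': np.vstack([blob_a, blob_b])}

# Fit a two-component mixture with full covariance matrices
# and assign each point to its most likely component
model = GaussianMixture(n_components=2, covariance_type='full', random_state=0)
cl = model.fit_predict(features['data'])
```

The component numbering is arbitrary, so only the grouping of points, not the label values, is meaningful.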

spike_sort.core.cluster.k_means(features, K=2)

Perform K means clustering

Parameters:

features : dict

features structure; its data vectors form an (n, m) array, where n is the number of datapoints and m is the number of variables

K : int

number of distinct clusters to identify

Returns:

partition : array

vector of cluster labels (ints) for each datapoint from data
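
The basic k-means procedure (Lloyd's algorithm) alternates between assigning each point to its nearest centre and moving each centre to the mean of its points. The sketch below illustrates that loop on a plain (n, m) array; it is not the library's implementation, and lloyd_k_means, n_iter and seed are hypothetical names.

```python
import numpy as np


def lloyd_k_means(data, k, n_iter=100, seed=0):
    """Return a cluster label for each row of data using Lloyd's algorithm.

    Assumes n_iter >= 1 and that data has at least k rows.
    """
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random
    centres = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest centre by squared Euclidean distance
        d2 = ((data[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: move each centre to the mean of its points;
        # a centre that lost all its points stays where it was
        new_centres = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels


data = np.array([[0., 0.], [0., 1.], [0., 0.9], [0.1, 0.]])
labels = lloyd_k_means(data, 2)
```

Which cluster receives label 0 depends on the random start, so results should be compared by grouping rather than by label value.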

spike_sort.core.cluster.manual(data, n_spikes='all', *args, **kwargs)

Sort spikes manually by cluster cutting

Opens a new window in which you can draw a cluster of arbitrary shape.

Notes

Only the first two features are plotted

spike_sort.core.cluster.none(data)

Do nothing

spike_sort.core.cluster.split_cells(spt_dict, idx, which='all')

Return the spike times belonging to the cluster and the rest
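
Splitting spike times by cluster label amounts to boolean masking. The sketch below assumes that the spike times are stored as an array under the 'data' key of a dict and that idx holds one label per spike; both are assumptions, since the structures are not documented here, and split_by_label is a hypothetical name whose return format may differ from that of split_cells.

```python
import numpy as np


def split_by_label(spt_dict, labels, which):
    """Return (selected, rest): spike times in cluster `which` and all others."""
    spt = np.asarray(spt_dict['data'])
    mask = np.asarray(labels) == which
    return {'data': spt[mask]}, {'data': spt[~mask]}


spt_dict = {'data': np.array([10.0, 12.5, 30.1, 47.8])}
labels = np.array([0, 1, 1, 0])
unit, rest = split_by_label(spt_dict, labels, which=1)
```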