causallib.contrib.faissknn.FaissNearestNeighbors

class FaissNearestNeighbors(metric='mahalanobis', index_type='flatl2', n_cells=100, n_probes=10)

Nearest-neighbors object that uses the faiss library for speed.

Implements the same fit and kneighbors interface as sklearn's NearestNeighbors but runs 5-10x faster. Uses the faiss library (facebookresearch/faiss). Tested with faiss version 1.7.0. If faiss-gpu is installed from PyPI, GPU acceleration is used when available.

Parameters:
  • metric (str) – Distance metric for finding nearest neighbors (default: “mahalanobis”)

  • index_type (str) – Index type within faiss to use (supported: “flatl2” and “ivfflat”, default: “flatl2”)

  • n_cells (int) – Number of Voronoi cells the index is partitioned into (only used for “ivfflat”, default: 100)

  • n_probes (int) – Number of Voronoi cells searched per query (only used for “ivfflat”, default: 10)

Attributes (after running fit):

index_: the faiss index fit from the data. For details about faiss indices, see the faiss documentation at facebookresearch/faiss.
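The roles of n_cells and n_probes can be illustrated with a toy inverted-file search in plain NumPy. This is a minimal sketch of the general idea, not the faiss or causallib implementation: the helpers sq_dists, build_cells and ivf_search are hypothetical names, and the crude k-means used to form the cells is an assumption made for illustration.

```python
import numpy as np

def sq_dists(A, B):
    """Squared Euclidean distances between rows of A and rows of B."""
    return ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)

def build_cells(X, n_cells, n_iter=10, seed=0):
    """Crude k-means: returns cell centroids and the cell index of each row of X."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=n_cells, replace=False)].copy()
    for _ in range(n_iter):
        assign = sq_dists(X, centroids).argmin(axis=1)
        for c in range(n_cells):
            members = X[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    assign = sq_dists(X, centroids).argmin(axis=1)
    return centroids, assign

def ivf_search(query, X, centroids, assign, n_probes):
    """Nearest neighbor of one query, scanning only the n_probes closest cells."""
    probed = sq_dists(query[None, :], centroids)[0].argsort()[:n_probes]
    candidates = np.flatnonzero(np.isin(assign, probed))
    if candidates.size == 0:  # all probed cells happen to be empty
        return -1, np.inf
    d = sq_dists(query[None, :], X[candidates])[0]
    best = d.argmin()
    return candidates[best], d[best]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
centroids, assign = build_cells(X, n_cells=8)
query = rng.normal(size=2)

exact_idx = int(sq_dists(query[None, :], X)[0].argmin())
full_idx, full_d = ivf_search(query, X, centroids, assign, n_probes=8)
fast_idx, fast_d = ivf_search(query, X, centroids, assign, n_probes=2)

assert full_idx == exact_idx  # probing every cell is an exhaustive search
assert fast_d >= full_d       # probing fewer cells can only miss, never improve
```

Probing fewer cells scans fewer candidates per query, at the cost of possibly missing the true nearest neighbor when it lies in a cell that was not probed.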

__init__(metric='mahalanobis', index_type='flatl2', n_cells=100, n_probes=10)

fit(X)

Create the faiss index and train it on the data.

Parameters:

X (numpy.ndarray) – Array of shape (N, M) containing N samples with M columns each

Returns:

Fitted object

Return type:

self
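One way a Mahalanobis metric can be served by an L2 index such as “flatl2” is to linearly transform the data so that Euclidean distance in the transformed space equals Mahalanobis distance in the original space. The sketch below checks that identity in NumPy. Whether fit does exactly this internally is an assumption, and the helper whiten is a hypothetical name.

```python
import numpy as np

def whiten(X, cov):
    """Transform rows of X so Euclidean distance equals Mahalanobis distance under cov."""
    # Factor the inverse covariance as L @ L.T (Cholesky), then map x -> x @ L.
    L = np.linalg.cholesky(np.linalg.inv(cov))
    return X @ L

rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
X = rng.normal(size=(200, 3)) @ mixing  # correlated columns
cov = np.cov(X, rowvar=False)

# Mahalanobis distance between the first two samples, computed directly.
diff = X[0] - X[1]
mahalanobis = np.sqrt(diff @ np.linalg.inv(cov) @ diff)

# The same distance as a plain Euclidean norm after whitening.
Z = whiten(X, cov)
euclid_whitened = np.linalg.norm(Z[0] - Z[1])

assert np.isclose(mahalanobis, euclid_whitened)
```

Under this approach, query samples would need the same transform, using the covariance estimated from the fit data, before they are searched against the index.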

kneighbors(X, n_neighbors=1)

Find the k nearest neighbors of each sample in X.

Parameters:
  • X (numpy.ndarray) – Array of shape (N, M) containing the query samples. M must match the number of columns in the data passed to fit.

  • n_neighbors (int, optional) – Number of neighbors to find for each sample. Defaults to 1.

Returns:

Two numpy.ndarray objects of shape (N, n_neighbors) containing the distances and indices of the closest neighbors.

Return type:

(distances, indices)
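The return contract described above can be illustrated with a brute-force reference in NumPy. This is a minimal sketch that assumes plain Euclidean distance and nearest-first ordering of each row. brute_force_kneighbors is a hypothetical helper, not part of causallib, and the distances reported by the faiss-backed class may follow a different convention.

```python
import numpy as np

def brute_force_kneighbors(X_fit, X_query, n_neighbors=1):
    """Return (distances, indices), each of shape (N, n_neighbors), nearest first."""
    diffs = X_query[:, None, :] - X_fit[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))             # shape (N, number of fit rows)
    indices = np.argsort(dist, axis=1)[:, :n_neighbors]  # positions in X_fit
    distances = np.take_along_axis(dist, indices, axis=1)
    return distances, indices

rng = np.random.default_rng(0)
X_fit = rng.normal(size=(50, 4))
X_query = rng.normal(size=(6, 4))

distances, indices = brute_force_kneighbors(X_fit, X_query, n_neighbors=3)
print(distances.shape, indices.shape)  # (6, 3) (6, 3)
```

Each row of indices refers to rows of the data passed at fit time, so X_fit[indices[i]] gives the neighbors of X_query[i]. With causallib and faiss installed, the corresponding call on this class would be FaissNearestNeighbors().fit(X_fit).kneighbors(X_query, n_neighbors=3).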

set_params(**parameters)
get_params(deep=True)