Ripser.py API Guide

ripser.ripser(X, maxdim=1, thresh=inf, coeff=2, distance_matrix=False, do_cocycles=False, metric='euclidean')
Compute persistence diagrams for the data array X. If X is not a distance matrix, it is converted to one using the chosen metric.
Parameters:  X: ndarray (n_samples, n_features)
A numpy array of either data or distance matrix. Can also be a sparse distance matrix of type scipy.sparse
 maxdim: int, optional, default 1
Maximum homology dimension computed. Will compute all dimensions lower than and equal to this value. For 1, H_0 and H_1 will be computed.
 thresh: float, default infinity
Maximum distances considered when constructing filtration. If infinity, compute the entire filtration.
 coeff: int prime, default 2
Compute homology with coefficients in the prime field Z/pZ for p=coeff.
 distance_matrix: bool
Indicator that X is a distance matrix, if not we compute a distance matrix from X using the chosen metric.
 do_cocycles: bool
Whether to compute cocycles. If True, representative cocycles are computed and returned under the ‘cocycles’ key of the result dictionary.
 metric: string or callable
The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options specified in PAIRED_DISTANCES, including “euclidean”, “manhattan”, or “cosine”. Alternatively, if metric is a callable function, it is called on each pair of instances (rows) and the resulting value recorded. The callable should take two arrays from X as input and return a value indicating the distance between them.
Returns:  A dictionary holding all of the results of the computation
 {‘dgms’: list (size maxdim + 1) of ndarray (n_pairs, 2)
A list of persistence diagrams, one for each dimension from 0 up to and including maxdim. Each diagram is an ndarray of size (n_pairs, 2) with the first column representing the birth time and the second column representing the death time of each pair.
 ‘cocycles’: list (size maxdim + 1)
A list of representative cocycles in each dimension. The list in each dimension is parallel to the diagram in that dimension.
 ‘num_edges’: int
The number of edges added during the computation
 ‘dm’: ndarray (n_samples, n_samples)
The distance matrix used in the computation
 }
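As a rough illustration of the returned structure, the sketch below reduces each diagram to the pairs whose persistence (death minus birth) reaches a cutoff. The `result` dictionary is a hand-written stand-in built from plain Python lists, not real ripser output, and `persistent_pairs` is a hypothetical helper that is not part of the library.

```python
import math

# Hand-written stand-in for a result dictionary (NOT real ripser output).
# With maxdim=1 there is one diagram for H_0 and one for H_1.
result = {
    "dgms": [
        [(0.0, 0.3), (0.0, 0.5), (0.0, math.inf)],  # H_0 (birth, death) pairs
        [(0.9, 1.0), (0.6, 1.8)],                    # H_1 (birth, death) pairs
    ],
}


def persistent_pairs(diagram, min_persistence):
    """Keep the pairs whose lifetime (death - birth) is at least the cutoff."""
    return [(b, d) for b, d in diagram if d - b >= min_persistence]


# Map each homology dimension to its long-lived features.
kept = {dim: persistent_pairs(dgm, 0.4) for dim, dgm in enumerate(result["dgms"])}
```

Pairs with an infinite death time always survive the cutoff, since their lifetime is infinite.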
Examples
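    from ripser import ripser, plot_dgms
    from sklearn import datasets

    # Sample 110 points from two concentric circles; [0] keeps the coordinates.
    data = datasets.make_circles(n_samples=110)[0]
    # Compute the persistence diagrams and plot them.
    dgms = ripser(data)['dgms']
    plot_dgms(dgms)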
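To make the callable `metric` and `distance_matrix` options more concrete, the sketch below builds a full pairwise distance matrix by calling a metric on every pair of rows, which is the behaviour the parameter description outlines. It uses plain Python lists only; `manhattan` and `pairwise_matrix` are hypothetical names chosen for illustration and are not part of ripser.

```python
def manhattan(a, b):
    """Hypothetical callable metric: L1 distance between two rows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pairwise_matrix(rows, metric):
    """Call `metric` on each pair of rows and record the resulting value."""
    n = len(rows)
    dm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = float(metric(rows[i], rows[j]))
            dm[i][j] = dm[j][i] = d  # a distance matrix is symmetric
    return dm


points = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]
dm = pairwise_matrix(points, manhattan)
```

A matrix built this way could, after conversion to an ndarray, be passed as X with distance_matrix=True; passing the callable directly as `metric` should express the same intent with less code.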

ripser.plot_dgms(diagrams, plot_only=None, title=None, xy_range=None, labels=None, colormap='default', size=20, ax_color=array([0., 0., 0.]), colors=None, diagonal=True, lifetime=False, legend=True, show=False)
A helper function to plot persistence diagrams.
Parameters:  diagrams: ndarray (n_pairs, 2) or list of diagrams
A diagram or list of diagrams. If diagram is a list of diagrams, then plot all on the same plot using different colors.
 plot_only: list of numeric
If specified, an array of only the diagrams that should be plotted.
 title: string, default is None
If title is defined, add it as title of the plot.
 xy_range: list of numeric [xmin, xmax, ymin, ymax]
User provided range of axes. This is useful for comparing multiple persistence diagrams.
 labels: string or list of strings
Legend labels for each diagram. If none are specified, we use H_0, H_1, H_2,… by default.
 colormap: string, default is ‘default’
Any matplotlib style. Some options are ‘default’, ‘seaborn’, ‘sequential’. See all available styles with

    import matplotlib as mpl
    print(mpl.style.available)
 size: numeric, default is 20
Pixel size of each point plotted.
 ax_color: any valid matplotlib color type.
See [https://matplotlib.org/api/colors_api.html](https://matplotlib.org/api/colors_api.html) for complete API.
 diagonal: bool, default is True
Plot the diagonal x=y line.
 lifetime: bool, default is False. If True, diagonal is turned to False.
Plot the lifetime of each point instead of its birth and death. Essentially, visualize (x, y - x).
 legend: bool, default is True
If true, show the legend.
 show: bool, default is False
Call plt.show() after plotting. If you are using plot_dgms as part of a subplot, leave show=False and call plt.show() only once at the end.
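The `lifetime` option is described as plotting each point's lifetime in place of its death time. A minimal sketch of that coordinate change on hand-written pairs follows; `to_lifetime` is a hypothetical helper, not a ripser function, and how plot_dgms itself treats infinite death times is not stated here.

```python
import math


def to_lifetime(diagram):
    """Map each (birth, death) pair to (birth, death - birth)."""
    return [(birth, death - birth) for birth, death in diagram]


# Hand-written example pairs, not real ripser output.
diagram = [(0.5, 2.0), (1.0, 1.25), (0.0, math.inf)]
lifetimes = to_lifetime(diagram)
```

In these coordinates, points close to the horizontal axis are short-lived features, which is why the diagonal line is no longer meaningful.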

class ripser.Rips(maxdim=1, thresh=inf, coeff=2, do_cocycles=False, verbose=True)
A scikit-learn-style class wrapper for ripser and plot_dgms.
Parameters:  maxdim: int, optional, default 1
Maximum homology dimension computed. Will compute all dimensions lower than and equal to this value. For 1, H_0 and H_1 will be computed.
 thresh: float, default infinity
Maximum distances considered when constructing filtration. If infinity, compute the entire filtration.
 coeff: int prime, default 2
Compute homology with coefficients in the prime field Z/pZ for p=coeff.
 do_cocycles: bool
Whether to compute cocycles. If True, cocycles are computed and stored in the cocycles_ dictionary, a member variable of Rips.
Examples
    from ripser import Rips
    from sklearn import datasets

    # Sample 110 points from two concentric circles; [0] keeps the coordinates.
    data = datasets.make_circles(n_samples=110)[0]
    rips = Rips()
    rips.transform(data)
    rips.plot()
Attributes:  dgm_: list of ndarray, each shape (n_pairs, 2)
After transform, dgm_ contains computed persistence diagrams in each dimension

fit_transform(X, distance_matrix=False, metric='euclidean')
Compute persistence diagrams for the data array X and return the diagrams.
Parameters:  X: ndarray (n_samples, n_features)
A numpy array of either data or distance matrix.
 distance_matrix: bool
Indicator that X is a distance matrix, if not we compute a distance matrix from X using the chosen metric.
 metric: string or callable
The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options specified in PAIRED_DISTANCES, including “euclidean”, “manhattan”, or “cosine”. Alternatively, if metric is a callable function, it is called on each pair of instances (rows) and the resulting value recorded. The callable should take two arrays from X as input and return a value indicating the distance between them.
Returns:  dgms: list (size maxdim + 1) of ndarray (n_pairs, 2)
A list of persistence diagrams, one for each dimension from 0 up to and including maxdim. Each diagram is an ndarray of size (n_pairs, 2) with the first column representing the birth time and the second column representing the death time of each pair.
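To give a feel for what the dimension-0 diagram contains, the sketch below computes H_0 birth and death pairs of a Vietoris-Rips filtration directly from a small distance matrix using union-find. This is an independent illustration, not ripser's implementation: `h0_diagram` is a hypothetical name, the input is a plain list of lists, and the handling of `thresh` (components that never merge get an infinite death time) is an assumption about the convention.

```python
import math


def h0_diagram(dm, thresh=math.inf):
    """H_0 (birth, death) pairs from a symmetric distance matrix."""
    n = len(dm)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the component root, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of increasing distance.
    edges = sorted(
        (dm[i][j], i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if dm[i][j] <= thresh
    )

    pairs = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j      # two components merge ...
            pairs.append((0.0, dist))    # ... so one of them dies at `dist`

    # Every point is born at 0; components that never merge never die.
    pairs.extend((0.0, math.inf) for _ in range(n - len(pairs)))
    return pairs


dm = [[0.0, 1.0, 4.0],
      [1.0, 0.0, 2.0],
      [4.0, 2.0, 0.0]]
dgm0 = h0_diagram(dm)
dgm0_capped = h0_diagram(dm, thresh=1.5)
```

Lowering `thresh` leaves more components unmerged, so more pairs end with an infinite death time.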

plot(diagrams=None, plot_only=None, title=None, xy_range=None, labels=None, colormap='default', size=20, ax_color=array([0., 0., 0.]), colors=None, diagonal=True, lifetime=False, legend=True, show=True)
A helper function to plot persistence diagrams.
Parameters:  diagrams: ndarray (n_pairs, 2) or list of diagrams
A diagram or list of diagrams as returned from self.fit. If diagram is None, we use self.dgm_ for plotting. If diagram is a list of diagrams, then plot all on the same plot using different colors.
 plot_only: list of numeric
If specified, an array of only the diagrams that should be plotted.
 title: string, default is None
If title is defined, add it as title of the plot.
 xy_range: list of numeric [xmin, xmax, ymin, ymax]
User provided range of axes. This is useful for comparing multiple persistence diagrams.
 labels: string or list of strings
Legend labels for each diagram. If none are specified, we use H_0, H_1, H_2,… by default.
 colormap: string, default is ‘default’
Any matplotlib style. Some options are ‘default’, ‘seaborn’, ‘sequential’. See all available styles with

    import matplotlib as mpl
    print(mpl.style.available)
 size: numeric, default is 20
Pixel size of each point plotted.
 ax_color: any valid matplotlib color type.
See [https://matplotlib.org/api/colors_api.html](https://matplotlib.org/api/colors_api.html) for complete API.
 diagonal: bool, default is True
Plot the diagonal x=y line.
 lifetime: bool, default is False. If True, diagonal is turned to False.
Plot the lifetime of each point instead of its birth and death. Essentially, visualize (x, y - x).
 legend: bool, default is True
If true, show the legend.
 show: bool, default is True
Call plt.show() after plotting. If you are using self.plot() as part of a subplot, set show=False and call plt.show() only once at the end.
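Since `xy_range` is meant for comparing several persistence diagrams, one possible way to choose it is to take a common range over all finite birth and death values. The sketch below does that with plain Python lists; `shared_xy_range` and its `pad` argument are hypothetical and not part of ripser.

```python
import math


def shared_xy_range(diagrams, pad=0.05):
    """Return [xmin, xmax, ymin, ymax] covering all finite values."""
    finite = [
        value
        for diagram in diagrams
        for pair in diagram
        for value in pair
        if math.isfinite(value)  # infinite deaths cannot set an axis limit
    ]
    lo, hi = min(finite), max(finite)
    margin = (hi - lo) * pad  # small border so points do not sit on the frame
    return [lo - margin, hi + margin, lo - margin, hi + margin]


# Hand-written diagrams standing in for two separate computations.
dgm_a = [(0.0, 1.0), (0.0, math.inf)]
dgm_b = [(0.5, 2.0)]
xy_range = shared_xy_range([dgm_a, dgm_b], pad=0.0)
```

The resulting list could then be passed as `xy_range` to each plotting call so that all figures share the same axes.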