Exact segmentation: dynamic programming¶
Description¶
The method is implemented in ruptures.detection.Dynp.
Roughly speaking, it computes the cost of all subsequences of a given signal. The number of computed costs is of the order \(\mathcal{O}(Kn^2)\), where \(K\) is the number of change points and \(n\) the number of samples. This must then be multiplied by the computational cost of evaluating the approximation error on a single subsequence. Consequently, piecewise constant models are significantly faster than linear or autoregressive models.
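As a rough illustration of this effect, the sketch below times the same segmentation with two of the built-in cost models; all parameter values and the timing code are chosen for this example only.
import time

import ruptures as rpt

# illustrative signal: 500 samples, 3 dimensions, 3 change points
signal, bkps = rpt.pw_constant(500, 3, n_bkps=3, noise_std=5)

# time the same segmentation task with two different cost models
for model in ("l2", "rbf"):
    algo = rpt.Dynp(model=model, min_size=3, jump=5).fit(signal)
    tic = time.perf_counter()
    algo.predict(n_bkps=3)
    print(f"{model}: {time.perf_counter() - tic:.2f} s")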
The computational cost is drastically reduced when only a subsample of the possible change point indexes is considered. When calling ruptures.detection.Dynp.__init__(), the minimum distance between change points can be set through the keyword 'min_size'; through the parameter 'jump', only change point indexes that are multiples of a given value are considered.
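For instance, with jump=10 the change points are searched only among indexes that are multiples of 10, and min_size=20 forbids segments shorter than 20 samples; the sketch below (illustrative parameter values) spells this out.
import ruptures as rpt

# toy signal with 2 change points
signal, _ = rpt.pw_constant(200, 1, n_bkps=2, noise_std=1)

# only indexes that are multiples of 10 are admissible change points,
# and every segment must contain at least 20 samples
algo = rpt.Dynp(model="l2", min_size=20, jump=10).fit(signal)
my_bkps = algo.predict(n_bkps=2)

# detected indexes (the last value is simply the number of samples)
print(my_bkps)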
Usage¶
import numpy as np
import matplotlib.pyplot as plt
import ruptures as rpt
# creation of data
n, dim = 500, 3
n_bkps, sigma = 3, 5
signal, bkps = rpt.pw_constant(n, dim, n_bkps, noise_std=sigma)
# change point detection
model = "l1" # "l2", "rbf"
algo = rpt.Dynp(model=model, min_size=3, jump=5).fit(signal)
my_bkps = algo.predict(n_bkps=3)
# show results
rpt.show.display(signal, bkps, my_bkps, figsize=(10, 6))
plt.show()
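The returned list contains the estimated change point indexes: each value is the (excluded) end index of a regime, and the last value always equals the number of samples. Continuing the example above, the approximation error of the estimated partition can also be inspected through a cost object; this sketch assumes the sum_of_costs helper of ruptures.costs.CostL1 behaves as in recent versions.
from ruptures.costs import CostL1

# total approximation error of the estimated partition
cost = CostL1().fit(signal)
print(cost.sum_of_costs(my_bkps))
# for comparison, the error of the true partition
print(cost.sum_of_costs(bkps))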
Code explanation¶
class ruptures.detection.Dynp(model='l2', custom_cost=None, min_size=2, jump=5, params=None)[source]¶
Find optimal change points using dynamic programming. Given a segment model, it computes the best partition for which the sum of errors is minimum.
__init__(model='l2', custom_cost=None, min_size=2, jump=5, params=None)[source]¶
Creates a Dynp instance.
- Parameters
model (str, optional) – segment model, [“l1”, “l2”, “rbf”]. Not used if 'custom_cost' is not None.
custom_cost (BaseCost, optional) – custom cost function. Defaults to None (see the sketch below).
min_size (int, optional) – minimum segment length.
jump (int, optional) – subsample (one every jump points).
params (dict, optional) – a dictionary of parameters for the cost instance.
- Returns
self
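A custom cost is any object deriving from BaseCost that implements fit and error; the sketch below is purely illustrative (the MedianCost class is not part of ruptures).
import numpy as np
import ruptures as rpt
from ruptures.base import BaseCost


class MedianCost(BaseCost):
    """Illustrative cost: absolute deviation around the segment median."""

    model = ""    # custom costs leave the model name empty
    min_size = 2  # smallest admissible segment length

    def fit(self, signal):
        self.signal = signal
        return self

    def error(self, start, end):
        segment = self.signal[start:end]
        return np.abs(segment - np.median(segment, axis=0)).sum()


signal, _ = rpt.pw_constant(200, 1, n_bkps=2, noise_std=1)
algo = rpt.Dynp(custom_cost=MedianCost(), min_size=5, jump=5).fit(signal)
print(algo.predict(n_bkps=2))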
fit(signal)[source]¶
Create the cache associated with the signal.
Dynamic programming is a recurrence; intermediate results are cached to speed up computations. This method sets up the cache.
- Parameters
signal (array) – signal. Shape (n_samples, n_features) or (n_samples,).
- Returns
self
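Because the cache is attached to the fitted instance, several segmentations with different numbers of change points can be obtained from a single call to fit; intermediate results computed during one call to predict are expected to be reused by the next. A short sketch with illustrative parameters:
import ruptures as rpt

signal, _ = rpt.pw_constant(300, 2, n_bkps=4, noise_std=3)

# the cache is set up once
algo = rpt.Dynp(model="l2", min_size=5, jump=5).fit(signal)

# predict can then be called several times on the same fitted instance
for k in (2, 3, 4):
    print(k, algo.predict(n_bkps=k))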
fit_predict(signal, n_bkps)[source]¶
Fit to the signal and return the optimal breakpoints.
Helper method that calls fit and predict once.
- Parameters
signal (array) – signal. Shape (n_samples, n_features) or (n_samples,).
n_bkps (int) – number of breakpoints.
- Returns
sorted list of breakpoints
- Return type
list
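In other words, fit_predict is a convenience wrapper around fit followed by predict; the two calls below are expected to return the same breakpoints (illustrative parameters).
import ruptures as rpt

signal, _ = rpt.pw_constant(200, 1, n_bkps=3, noise_std=2)

bkps_a = rpt.Dynp(model="l2").fit(signal).predict(n_bkps=3)
bkps_b = rpt.Dynp(model="l2").fit_predict(signal, n_bkps=3)
print(bkps_a == bkps_b)  # expected: True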