import numpy as np
import dascore as dc
# Get an example patch and add artificial spikes
patch = dc.get_example_patch()
data = patch.data.copy()
data[10, 5] = 10  # Add a large spike
patch = patch.update(data=data)
# Apply hampel filter along time dimension with 0.2 unit window
filtered = patch.hampel_filter(time=0.2, threshold=3.5)
assert filtered.data.shape == patch.data.shape
# The spike should be reduced
assert abs(filtered.data[10, 5]) < abs(patch.data[10, 5])
# Apply filter along multiple dimensions using samples and
# default threshold.
filtered_2d = patch.hampel_filter(time=5, distance=5, samples=True)
assert filtered_2d.data.shape == patch.data.shape
# Use exact median calculations (slower, more accurate)
filtered_exact = patch.hampel_filter(
    time=5, distance=5, samples=True, approximate=False
)

hampel_filter
hampel_filter(
    patch: Patch,
    threshold: float = 10.0,
    samples: bool = False,
    approximate: bool = True,
    **kwargs,
) -> 'PatchType'
A Hampel filter implementation useful for removing spikes in data.
Parameters
| Parameter | Description |
|---|---|
| patch | Input patch. |
| threshold | Outlier threshold in MAD units. Default is 10.0. |
| samples | If True, values specified by kwargs are in samples not coordinate units. |
| approximate | If True, use fast approximation algorithms for improved performance. This applies 1D median filters sequentially along each dimension instead of a true 2D median filter, providing a 3-4x speedup. The approximation is usually good enough for spike removal purposes. |
| **kwargs | Used to specify the lengths of the filter in each dimension. Each selected dim must be evenly sampled and should represent a window with an odd number of samples. |
Warning
Selecting windows with many samples can be very slow. Keeping the window size in each dimension below 10 samples is recommended.
Returns
Patch with outliers replaced by local median.
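To make "MAD units" and "replaced by local median" concrete, here is a minimal 1D sketch of the general Hampel technique in plain Python. It is not DASCore's implementation: the function name `hampel_1d`, the unscaled MAD, and the edge handling (repeating the end values) are all assumptions made for illustration.

```python
from statistics import median


def hampel_1d(values, window=5, threshold=10.0):
    """Replace outliers in a 1D sequence with the local median.

    Illustrative sketch only (hypothetical helper, not DASCore code).
    Assumptions: the MAD is used unscaled, and edges are padded by
    repeating the first and last values.
    """
    if window % 2 == 0:
        raise ValueError("window must contain an odd number of samples")
    half = window // 2
    # Pad so that every sample has a full window centered on it.
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    out = []
    for i, value in enumerate(values):
        neighborhood = padded[i : i + window]
        local_median = median(neighborhood)
        # Median absolute deviation (MAD) of the window around its median.
        mad = median(abs(v - local_median) for v in neighborhood)
        # A sample is an outlier if it lies more than `threshold` MADs
        # away from the local median.
        is_outlier = abs(value - local_median) > threshold * mad
        out.append(local_median if is_outlier else value)
    return out


signal = [0.0, 0.1, -0.1, 0.2, 10.0, 0.1, -0.2, 0.0, 0.1]
cleaned = hampel_1d(signal, window=5, threshold=3.5)
# The spike at index 4 is replaced by the median of its window.
```

Note that when a window's MAD is zero, any sample that differs from the local median is flagged, so a lower threshold or a very flat neighborhood can modify samples that are not obvious spikes.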
When samples=False, even window lengths are bumped to the next odd value to ensure a clean median calculation. When samples=True, an even sample count raises a ParameterError.
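The rule above can be sketched as a small helper. This is only an illustration: `window_samples` is a hypothetical function, the conversion from coordinate units to samples by rounding `length / step` is an assumption, and `ValueError` stands in for the `ParameterError` the library raises.

```python
def window_samples(length, step=1.0, samples=False):
    """Return an odd window size in samples (hypothetical helper).

    Assumptions: coordinate-unit lengths are converted by rounding
    length / step, and ValueError stands in for ParameterError.
    """
    if samples:
        count = int(length)
        if count % 2 == 0:
            # An even sample count is rejected rather than adjusted.
            raise ValueError("window must contain an odd number of samples")
        return count
    count = max(1, round(length / step))
    # An even count is bumped to the next odd value.
    return count if count % 2 == 1 else count + 1


bumped = window_samples(2.0, step=0.5)       # 4 samples, bumped to 5
unchanged = window_samples(1.5, step=0.5)    # 3 samples, already odd
explicit = window_samples(5, samples=True)   # odd sample count, accepted
# window_samples(4, samples=True) would raise ValueError
```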
Edge Handling:
- Edge effects may differ slightly between modes because the padding strategy depends on the patch's dimensionality and on the approximate parameter.

Performance:
- approximate=True provides a 3-4x speedup over exact calculations.
- Installing the bottleneck package can further improve performance (~50%) in both approximate and exact modes.
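The difference between the two modes of the approximate parameter can be illustrated with a plain-Python sketch that compares a true 2D median with 1D medians applied along rows and then columns. All function names here are hypothetical, the clamped edge handling is an assumption, and the code does not reflect how DASCore implements either mode.

```python
from statistics import median


def _clamp(index, size):
    """Clamp an index into [0, size - 1] (assumed edge handling)."""
    return min(max(index, 0), size - 1)


def median_2d(grid, size=3):
    """Exact median over a size x size neighborhood of each sample."""
    rows, cols = len(grid), len(grid[0])
    half = size // 2
    offsets = range(-half, half + 1)
    return [
        [
            median(
                grid[_clamp(r + drow, rows)][_clamp(c + dcol, cols)]
                for drow in offsets
                for dcol in offsets
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def median_rows(grid, size=3):
    """1D median along each row."""
    cols = len(grid[0])
    half = size // 2
    offsets = range(-half, half + 1)
    return [
        [median(row[_clamp(c + d, cols)] for d in offsets) for c in range(cols)]
        for row in grid
    ]


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def median_separable(grid, size=3):
    """Approximate 2D median: 1D medians along rows, then along columns."""
    by_rows = median_rows(grid, size)
    return transpose(median_rows(transpose(by_rows), size))


# A flat 5 x 5 grid with a single spike in the center.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 10.0

exact = median_2d(grid)
approx = median_separable(grid)
```

For an isolated spike like this one, both versions remove it. On general data the two results can differ, because the separable version only ever looks at one dimension at a time; it visits `2 * size` values per sample instead of `size * size`.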