chunk

method of dascore.core.spool.DataFrameSpool

chunk(
    self,
    overlap: int | float | str | numpy.datetime64 | pandas._libs.tslibs.timestamps.Timestamp | None = None,
    keep_partial: bool = False,
    snap_coords: bool = True,
    tolerance: float = 1.5,
    conflict: Literal['drop', 'raise', 'keep_first'] = 'raise',
    **kwargs,
) -> 'Self'

Chunk the data in the spool along specified dimensions.

Parameters

Parameter Description
overlap The amount of overlap between each segment, starting with the end of
the first patch. Negative values can be used to create gaps.
keep_partial If True, keep segments which are smaller than the chunk size.
These often occur because of data gaps or at the end of chunks.
snap_coords If True, snap the coords on joined patches such that the spacing
remains constant.
tolerance The number of samples a block of data can be spaced and still be
considered contiguous.
conflict Indicates how to handle conflicts in attributes other than those
indicated by dim (e.g. tag, history, station, etc.). If "drop", simply
drop conflicting attributes, or attributes not shared by all models.
If "raise", raise an AttributeMergeError
(dascore.exceptions.AttributeMergeError) when issues are encountered.
If "keep_first", just keep the first value for each attribute.
kwargs Used to specify the dimension along which to chunk, e.g.,
time=10 chunks along the time axis in 10 second increments.

Examples

import dascore as dc
from dascore.units import s

spool = dc.get_example_spool("random_das")
# chunk into segments with a time duration of 10 seconds and 1 second of overlap
time_chunked = spool.chunk(time=10, overlap=1)
# merge contiguous patches along the time axis
time_merged = spool.chunk(time=...)
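
The remaining parameters combine in the same way. The snippet below is a
sketch rather than part of the library's documented examples; it reuses the
spool created above, and the last line assumes that a unit quantity from
dascore.units (the s imported above) is accepted for the chunk size.

# keep trailing segments that are shorter than the chunk size
partial_chunked = spool.chunk(time=10, keep_partial=True)
# a negative overlap leaves a 2 second gap between adjacent segments
gapped = spool.chunk(time=10, overlap=-2)
# merge along time, keeping the first value of any conflicting attribute
merged = spool.chunk(time=..., conflict="keep_first")
# assumption: a unit quantity may also be usable as the chunk size
unit_chunked = spool.chunk(time=10 * s)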