import dascore as dc
from dascore import print
pa1 = dc.get_example_patch("random_das")
pa2 = dc.get_example_patch("example_event_1")
Patch
A Patch manages data and its associated coordinates and metadata.
The Patch design was inspired by Xarray’s DataArray object.
Creating Patches
Patches can be created in a few ways.
Load an Example Patch
DASCore includes several example datasets. They are mostly used for simple demonstrations and testing.
Load a File
We first download a small example fiber file from a URL in the DASCore library (you need an internet connection). Next, we read it into a spool object then get the first (and only) patch. Spools are covered in more detail in the next section.
import dascore as dc
from dascore.utils.downloader import fetch
path = fetch("terra15_das_1_trimmed.hdf5")  # path to a datafile
pa = dc.spool(path)[0]
Create a Patch from Scratch
Patches can be created from scratch using:
- A data array
- Coordinates for labeling each axis
- Attributes (optional)
import numpy as np
import dascore as dc
from dascore.utils.time import to_timedelta64
# Create the patch data
array = np.random.random(size=(300, 2_000))
# Create attributes, or metadata
t1 = dc.to_datetime64("2017-09-18")
attrs = dict(
    d_distance=1,
    d_time=to_timedelta64(1 / 250),
    category="DAS",
    id="test_data1",
    time_min=t1,
    data_units="um/(m * s)"
)
# Create coordinates, labels for each axis in the array.
coords = dict(
    distance=np.arange(array.shape[0]) * attrs["d_distance"],
    time=t1 + np.arange(array.shape[1]) * attrs["d_time"],
)
# define dimensions (first label corresponds to data axis 0)
dims = ('distance', 'time')
pa = dc.Patch(data=array, coords=coords, attrs=attrs, dims=dims)
Patch Anatomy
Data
The data is simply an n-dimensional array which is accessed with the data attribute.
import dascore as dc
patch = dc.get_example_patch()
print(f"Data shape is {patch.data.shape}")
print(f"Data contents are\n{patch.data}")
Data shape is (300, 2000)
Data contents are
[[0.77770241 0.23754122 0.82427853 ... 0.36950848 0.07650396 0.23197621]
 [0.49689594 0.44224037 0.70329426 ... 0.12617754 0.11760625 0.78003741]
 [0.20681917 0.19516906 0.17434521 ... 0.84933595 0.36479426 0.80740811]
 ...
 [0.61877586 0.1053084 0.66896335 ... 0.621027 0.43559346 0.49975826]
 [0.75717115 0.25935121 0.09051709 ... 0.36099578 0.9365496 0.10351814]
 [0.15780837 0.29487104 0.58475197 ... 0.22898748 0.23950251 0.49439913]]
The data arrays should be treated as read-only. This means you shouldn’t modify them in place; make a copy first and modify the copy.
import numpy as np
patch.data[:10] = 12  # won't work
array = np.array(patch.data)  # this makes a copy
array[:10] = 12  # then this works
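To show what read-only means at the NumPy level, here is a minimal sketch that does not use DASCore at all. It assumes the array's writeable flag has been cleared, which is one way a read-only array can be produced and not necessarily how DASCore does it.

```python
import numpy as np

# A stand-in for patch data. Clearing the writeable flag is an assumption
# about how a read-only array might be produced.
data = np.zeros((4, 5))
data.setflags(write=False)

try:
    data[:2] = 12  # in-place assignment on a read-only array
    write_failed = False
except ValueError:
    write_failed = True

# np.array copies by default, and the copy is writeable.
copy = np.array(data)
copy[:2] = 12
```

The original array is left untouched, which is the behavior the copy-first pattern relies on.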
Coords
DASCore implements a special class, called a CoordinateManager, which manages dimension names, coordinate labels, etc. This class behaves like a dict, so coordinate arrays are easily accessed via their names.
import dascore as dc
patch = dc.get_example_patch()
coords = patch.coords
# get time array
time = coords['time']
# get distance array
distance = coords['distance']
Coords also have a useful string representation:
print(coords)
➤ Coordinates (distance: 300, time: 2000)
    * distance: CoordRange( min: 0 max: 299 step: 1 shape: (300,) dtype: int64 units: m )
    * time: CoordRange( min: 2017-09-18 max: 2017-09-18T00:00:07.996 step: 0.004s shape: (2000,) dtype: datetime64[ns] units: s )
You can read more about the coordinate manager in its doc page.
Attrs
The metadata stored in Patch.attrs is a Pydantic model which enforces some basic schema validation. You can print the schema info like this:
import dascore as dc
print(dc.PatchAttrs.__doc__)
There may also be other attributes added by specific fiber formats.
String Rep.
DASCore Patches have a useful string representation:
import dascore as dc
patch = dc.get_example_patch()
print(patch)
DASCore Patch ⚡
---------------
➤ Coordinates (distance: 300, time: 2000)
    * distance: CoordRange( min: 0 max: 299 step: 1 shape: (300,) dtype: int64 units: m )
    * time: CoordRange( min: 2017-09-18 max: 2017-09-18T00:00:07.996 step: 0.004s shape: (2000,) dtype: datetime64[ns] units: s )
➤ Data (float64)
   [[0.778 0.238 0.824 ... 0.37 0.077 0.232]
    [0.497 0.442 0.703 ... 0.126 0.118 0.78 ]
    [0.207 0.195 0.174 ... 0.849 0.365 0.807]
    ...
    [0.619 0.105 0.669 ... 0.621 0.436 0.5 ]
    [0.757 0.259 0.091 ... 0.361 0.937 0.104]
    [0.158 0.295 0.585 ... 0.229 0.24 0.494]]
➤ Attributes
    tag: random
    category: DAS
For various reasons, Patches should be treated as immutable, meaning they should not be modified in place, but rather new patches are created when something needs to be modified.
Selecting (trimming)
Patches are trimmed using the select method. select takes the coordinate name and a tuple of (lower_limit, upper_limit) as the values. Either limit can be None or ... indicating an open interval.
import numpy as np
import dascore as dc
patch = dc.get_example_patch()
attrs = patch.attrs
# select 1 sec after current start time to 1 sec before end time.
one_sec = dc.to_timedelta64(1)
select_tuple = (attrs.time_min + one_sec, attrs.time_max - one_sec)
new = patch.select(time=select_tuple)
# select only the first half of the distance channels.
distance_max = np.mean(patch.coords['distance'])
new = patch.select(distance=(..., distance_max))
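The limit semantics described above can be sketched with plain NumPy. The helper name select_mask is hypothetical, and the sketch assumes both limits are inclusive; it illustrates the idea of open and closed bounds and is not DASCore's implementation.

```python
import numpy as np


def _is_open(limit):
    """Return True if a limit means 'unbounded' (None or Ellipsis)."""
    return limit is None or limit is Ellipsis


def select_mask(values, limits):
    """Boolean mask of values inside (lower, upper), both ends inclusive."""
    lower, upper = limits
    mask = np.ones(len(values), dtype=bool)
    if not _is_open(lower):
        mask &= values >= lower
    if not _is_open(upper):
        mask &= values <= upper
    return mask


distance = np.arange(10)
inner = distance[select_mask(distance, (2, 5))]
open_start = distance[select_mask(distance, (..., 3))]
open_end = distance[select_mask(distance, (7, None))]
```

Treating None and ... identically keeps the call sites short, since either can stand in for "no limit on this side".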
The relative keyword trims coordinates using offsets instead of absolute values: positive values are measured from the start of the coordinate and negative values from its end.
import dascore as dc
from dascore.units import ft
patch = dc.get_example_patch()
# We can make the example above simpler with relative selection
new = patch.select(time=(1, -1), relative=True)
# select 2 seconds from end to 1 second from end
new = patch.select(time=(-2, -1), relative=True)
# select everything except the last 100 ft of distance channels
new = patch.select(distance=(..., -100 * ft), relative=True)
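One way to picture relative selection is as a translation of offsets into absolute limits before trimming. The sketch below uses a hypothetical helper, relative_to_absolute, on a plain sorted NumPy array; it only mirrors the positive-from-start, negative-from-end rule and makes no claim about how DASCore handles units or edge cases.

```python
import numpy as np


def relative_to_absolute(values, limits):
    """Convert relative limits to absolute ones for a sorted coordinate."""
    start, end = values[0], values[-1]
    out = []
    for limit in limits:
        if limit is None or limit is Ellipsis:
            out.append(limit)  # open bounds pass through unchanged
        elif limit >= 0:
            out.append(start + limit)  # offset forward from the start
        else:
            out.append(end + limit)  # negative: offset back from the end
    return tuple(out)


time = np.arange(0.0, 8.0, 0.5)  # 0.0, 0.5, ..., 7.5
abs_limits = relative_to_absolute(time, (1, -1))
open_limits = relative_to_absolute(time, (..., -2))
```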
iselect provides the same functionality, but for index-based trimming.
import dascore as dc
patch = dc.get_example_patch()
# Trim patch to only include first 10 time rows (or columns)
new = patch.iselect(time=(..., 10))
# get the last dimension column/row
new = patch.iselect(distance=-1)
Processing
The patch has several methods which are intended to be chained together via a fluent interface, meaning each method returns a new Patch instance.
import dascore as dc
pa = dc.get_example_patch()
out = (
    pa.decimate(time=8)  # decimate to reduce data volume by 8 along time dimension
    .detrend(dim='distance')  # detrend along distance dimension
    .pass_filter(time=(..., 10))  # apply a low-pass 10 Hz butterworth filter
)
The processing methods are located in the dascore.proc module. The patch processing tutorial provides more information about various processing routines.
Visualization
DASCore provides various visualization functions, available in the dascore.viz package or through the Patch.viz namespace. DASCore generally only implements simple, matplotlib-based visualizations, but other DASDAE packages will do more interesting visualizations.
import dascore as dc
patch = (
    dc.get_example_patch('example_event_1')
    .taper(time=0.05)
    .pass_filter(time=(None, 300))
)
patch.viz.waterfall(show=True, scale=0.2);
Modifying Patches
Because patches should be treated as immutable objects, you can’t just modify them with normal item assignment. There are a few methods that return new patches with modifications, however, that are functionally the same.
New
Often you may wish to modify one aspect of the patch. Patch.new is designed for this purpose:
import dascore as dc
pa = dc.get_example_patch()
# create a copy of patch with new data but coords and attrs stay the same
new = pa.new(data=pa.data * 10)
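The copy-with-changes pattern can be illustrated with a frozen dataclass from the standard library. MiniPatch is a hypothetical stand-in, not DASCore's Patch class; the sketch only shows why a new-style method is the natural companion to an immutable object.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace


@dataclass(frozen=True)
class MiniPatch:
    """Hypothetical stand-in for a patch; not DASCore's class."""

    data: tuple
    attrs: dict = field(default_factory=dict)

    def new(self, **kwargs):
        """Return a copy with the given fields replaced."""
        return replace(self, **kwargs)


original = MiniPatch(data=(1, 2, 3), attrs={"tag": "random"})
scaled = original.new(data=tuple(x * 10 for x in original.data))

try:
    original.data = (0, 0, 0)  # attribute assignment is blocked when frozen
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Fields that are not passed to new carry over unchanged, so only the modified piece needs to be supplied.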
Update Attrs
Patch.update_attrs is for making small changes to the patch attrs (metadata) while keeping the unaffected metadata (Patch.new would require you to replace the entirety of attrs).
import dascore as dc
pa = dc.get_example_patch()
# update existing attribute 'network' and create new attr 'new_attr'
pa1 = pa.update_attrs(**{'network': 'exp1', 'new_attr': 42})
Patch.update_attrs also tries to keep the patch attributes consistent. For example, changing the start, end, or sampling of a dimension should update the other attributes affected by the change.
import dascore as dc
pa = dc.get_example_patch()
# update start time should also shift endtime
pa1 = pa.update_attrs(time_min='2000-01-01')
print(pa.attrs['time_min'])
print(pa1.attrs['time_min'])
2017-09-18T00:00:00.000000000
2000-01-01T00:00:00.000000000
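The consistency rule for a shifted start time can be sketched with NumPy datetimes. The helper shift_time_min and the plain dict of attrs are hypothetical; the sketch assumes the duration is preserved when the start moves, which matches the description above but is not taken from DASCore's source.

```python
import numpy as np


def shift_time_min(attrs, new_time_min):
    """Return new attrs where time_max moves together with time_min.

    The duration (time_max - time_min) is preserved.
    """
    new_time_min = np.datetime64(new_time_min, "ns")
    duration = attrs["time_max"] - attrs["time_min"]
    return {
        **attrs,
        "time_min": new_time_min,
        "time_max": new_time_min + duration,
    }


attrs = {
    "time_min": np.datetime64("2017-09-18T00:00:00", "ns"),
    "time_max": np.datetime64("2017-09-18T00:00:07.996", "ns"),
}
updated = shift_time_min(attrs, "2000-01-01")
```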
Method Chaining
In most cases, you should use method chaining as part of a fluent interface when working with patches.
For example:
import dascore as dc
pa = (
    dc.get_example_patch()  # load the patch
    .pass_filter(time=(1, 10))  # apply bandpass filter
    .detrend(dim='time')  # detrend along time dimension
)
Similar to Pandas, Patch has a pipe method so non-patch methods can still be used in a method chain.
import dascore as dc
def func(patch, arg1=1):
    """Example non-patch method"""
    return patch.update_attrs(arg1=arg1)
pa = (
    dc.get_example_patch()
    .pass_filter(time=(..., 10))
    .detrend('time', 'linear')
    .pipe(func, arg1=3)
)
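For readers unfamiliar with the pattern, a pipe method is usually a thin wrapper that calls a function with the object as its first argument. The class Chainable and the function add_offset below are hypothetical; this is a sketch of the general idea, not DASCore's code.

```python
class Chainable:
    """Hypothetical minimal object with a pandas-style pipe method."""

    def __init__(self, value):
        self.value = value

    def scale(self, factor):
        # Each method returns a new instance, so calls can be chained.
        return Chainable(self.value * factor)

    def pipe(self, func, *args, **kwargs):
        # Call func with this object as the first argument.
        return func(self, *args, **kwargs)


def add_offset(obj, offset=0):
    """A plain function that is not a method of Chainable."""
    return Chainable(obj.value + offset)


result = Chainable(2).scale(5).pipe(add_offset, offset=3).scale(2)
```

Because pipe returns whatever the function returns, the chain continues only if that function also returns a chainable object.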
Adding Coordinates
It is common to have additional coordinates, such as latitude/longitude, attached to a particular dimension (e.g., distance). There are two ways to add coordinates to a patch:
Update Coordinates
The update_coords method will return a new patch with the coordinate added, if it didn’t exist in the original, or replaced, if it did.
import numpy as np
import dascore as dc
pa = dc.get_example_patch()
coords = pa.coords
dist = coords['distance']
time = coords['time']
# Add a single coordinate associated with distance dimension
lat = np.arange(0, len(dist)) * .001 - 109.857952
out_1 = pa.update_coords(latitude=('distance', lat))
# Add multiple coordinates associated with distance dimension
lon = np.arange(0, len(dist)) * .001 + 41.544654
out_2 = pa.update_coords(
    latitude=('distance', lat),
    longitude=('distance', lon),
)
# Add multi-dimensional coordinates
quality = np.ones_like(pa.data)
out_3 = pa.update_coords(
    quality=(pa.dims, quality)
)
Coords in Patch Initialization
Any number of coordinates can also be assigned when the patch is initialized. For coordinates other than those of the patch dimensions, the associated dimensions must be specified. For example:
import dascore as dc
import numpy as np
# create data for patch
rand = np.random.RandomState(13)
array = rand.random(size=(20, 100))
time1 = np.datetime64("2020-01-01")
# create patch attrs
attrs = dict(dx=1, d_time=1 / 250.0, category="DAS", id="test_data1")
time_deltas = dc.to_timedelta64(np.arange(array.shape[1]) * attrs["d_time"])
# create coordinate data
distance = np.arange(array.shape[0]) * attrs["dx"]
time = time1 + time_deltas
quality = np.ones_like(array)
latitude = np.arange(array.shape[0]) * .001 - 111.00
# create coord dict
coords = dict(
    distance=distance,
    time=time,
    latitude=("distance", latitude),  # Note distance is attached dimension
    quality=(("distance", "time"), quality),  # Two attached dimensions here
)
# Define dimensions of array and init Patch
dims = ("distance", "time")
out = dc.Patch(data=array, coords=coords, attrs=attrs, dims=dims)
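Attaching a coordinate to one or more dimensions implies a shape constraint: the coordinate needs one axis per attached dimension, each as long as the corresponding data axis. The hypothetical helper check_coord below sketches that constraint with NumPy; whether and how DASCore validates this is not shown in this page.

```python
import numpy as np


def check_coord(dims, shape, coord_dims, values):
    """Return True if values has one axis per attached dimension,
    each as long as that dimension of the data."""
    if isinstance(coord_dims, str):
        coord_dims = (coord_dims,)
    sizes = dict(zip(dims, shape))
    expected = tuple(sizes[d] for d in coord_dims)
    return np.shape(values) == expected


dims = ("distance", "time")
data = np.zeros((20, 100))
latitude = np.arange(20) * 0.001
quality = np.ones_like(data)

ok_lat = check_coord(dims, data.shape, "distance", latitude)
ok_quality = check_coord(dims, data.shape, ("distance", "time"), quality)
bad = check_coord(dims, data.shape, "time", latitude)  # 20 values, 100 samples
```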
Units
As mentioned in the units section of the concept page, DASCore provides first-class support for units. Here are a few examples:
Patch units
There are two methods for configuring the units associated with a Patch.
- Patch.set_units sets the units on a patch or its coordinates. Old units are simply overwritten without performing any conversions. The first argument sets the data units and the keywords set the coordinate units.
- Patch.convert_units converts the existing units of data or coordinates by appropriately transforming the data or coordinates arrays. If no units exist they will be set.
import dascore as dc
patch = dc.get_example_patch()
# Set data units and distance units; don't do any conversions
patch_set_units = patch.set_units("m/s", distance="ft")
# Convert data units and distance units; will modify data/coords
# to correctly do the conversion.
patch_conv_units = patch_set_units.convert_units("ft/s", distance='m')
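The difference between setting and converting units comes down to whether the numbers change. The sketch below uses two hypothetical helper functions on a plain NumPy array with a string label; it does not use Pint or DASCore and only illustrates the relabel-versus-rescale distinction.

```python
import numpy as np

FT_PER_M = 1 / 0.3048  # one foot is defined as exactly 0.3048 m


def set_units(values, units):
    """Relabel only: the numbers are untouched."""
    return values, units


def convert_m_to_ft(values, units):
    """Convert: the numbers are rescaled so the physical length is unchanged."""
    if units != "m":
        raise ValueError("this sketch only converts from meters")
    return values * FT_PER_M, "ft"


distance_m = np.array([0.0, 0.3048, 3.048])
relabeled, relabeled_units = set_units(distance_m, "ft")
converted, converted_units = convert_m_to_ft(distance_m, "m")
```

After relabeling, the array claims to be in feet but still holds the meter values; after converting, the values themselves are in feet.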
The data and coordinate units attributes are Pint Quantity objects, but they can be converted to strings with get_quantity_str.
import dascore as dc
from dascore.units import get_quantity_str
patch = dc.get_example_patch().set_units("m/s")
print(type(patch.attrs.data_units))
print(get_quantity_str(patch.attrs.data_units))
<class 'pint.Quantity'>
m / s
Units in processing functions
import dascore as dc
from dascore.units import m, ft
pa = dc.get_example_patch()
# sub-select a patch to only include distance from 10ft to 10m.
sub_selected = pa.select(distance=(10*ft, 10*m))
# filter patch for spatial wavelengths from 10m to 100m
dist_filtered = pa.pass_filter(distance=(10*m, 100*m))
See the documentation on Patch.select and Patch.pass_filter for more details.
Patch Operations
Patches implement common operators, which means that many ufunc-type operations can be applied directly on a patch with built-in Python operators.
In the case of scalars and numpy arrays, the operations are broadcast over the patch data. In the case of two patches, compatibility between the patches is first checked, the intersection of the coords and attrs is calculated, then the operator is applied to both patches’ data. Here are a few examples:
Patch operations with scalars
import numpy as np
import dascore as dc
patch = dc.get_example_patch()
out1 = patch / 10
assert np.allclose(patch.data / 10, out1.data)
out2 = patch ** 2.3
assert np.allclose(patch.data ** 2.3, out2.data)
out3 = patch - 3
assert np.allclose(patch.data - 3, out3.data)
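Operator support of this kind is typically built by forwarding Python's operator methods to the underlying array. The class ArrayBox below is a hypothetical container written for illustration; it skips the compatibility checks and metadata handling described above and is not DASCore's implementation.

```python
import numpy as np


class ArrayBox:
    """Hypothetical container that forwards operators to its data array."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def _apply(self, other, op):
        # Unwrap another ArrayBox; scalars and arrays broadcast as in numpy.
        other_data = other.data if isinstance(other, ArrayBox) else other
        return ArrayBox(op(self.data, other_data))

    def __add__(self, other):
        return self._apply(other, np.add)

    def __sub__(self, other):
        return self._apply(other, np.subtract)

    def __truediv__(self, other):
        return self._apply(other, np.divide)

    def __pow__(self, other):
        return self._apply(other, np.power)


box = ArrayBox([[1.0, 2.0], [3.0, 4.0]])
halved = box / 2
row_added = box + np.array([10.0, 20.0])  # broadcast along the last axis
doubled = box + box
```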
Units are also fully supported.
import dascore as dc
from dascore.units import m, s
patch = dc.get_example_patch().set_units("m/s")
# multiplying patches by a quantity with units updates the data_units attribute.
new = patch * 10 * m/s
print(f"units before operation {patch.attrs.data_units}")
print(f"units after operation {new.attrs.data_units}")
units before operation 1.0 m / s
units after operation 1.0 m ** 2 / s ** 2
Patch operations with numpy arrays
import numpy as np
import dascore as dc
patch = dc.get_example_patch()
ones = np.ones(patch.shape)
out1 = patch + ones
assert np.allclose(patch.data + ones, out1.data)
Units also work with numpy arrays.
import numpy as np
import dascore as dc
from dascore.units import furlongs
patch = dc.get_example_patch()
ones = np.ones(patch.shape) * furlongs
out1 = patch * ones
print(f"units before operation {patch.attrs.data_units}")
print(f"units after operation {out1.attrs.data_units}")
units before operation None
units after operation 1 fur
Patch operations with other patches
import numpy as np
import dascore as dc
from dascore.units import furlongs
patch = dc.get_example_patch()
# adding two patches together simply adds their data and checks/merges their
# coords and attrs.
out = patch + patch
assert np.allclose(patch.data * 2, out.data)
See merge_compatible_coords_attrs for more details on how attributes and coordinates are handled when performing operations on two patches.
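As a rough picture of what an intersection of attrs could mean, the sketch below keeps only the keys that both inputs share with equal values. The helper intersect_attrs and the "station" key are hypothetical, and the actual rules in merge_compatible_coords_attrs may differ.

```python
def intersect_attrs(first, second):
    """Keep only the keys present in both dicts with equal values."""
    return {
        key: value
        for key, value in first.items()
        if key in second and second[key] == value
    }


# "station" is a made-up key used only to show a conflicting value.
attrs_a = {"category": "DAS", "tag": "random", "station": "A"}
attrs_b = {"category": "DAS", "tag": "random", "station": "B"}
common = intersect_attrs(attrs_a, attrs_b)
```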