ndsampler.abstract_frames module
Fast access to subregions of images.
This implements the core convert-and-cache-as-cog logic, which enables us to read from subregions of images quickly.
Todo
- [X] Implement npy memmap backend
- [X] Implement gdal COG.TIFF backend
    - [X] Use as COG if input file is a COG
    - [X] Convert to COG if needed
- class ndsampler.abstract_frames.Frames(hashid_mode='PATH', workdir=None, backend=None)
Bases: object
Abstract implementation of Frames.
While this is an abstract class, it contains most of the Frames functionality. The inheriting class needs to overload the constructor and _lookup_gpath, which maps an image-id to its path on disk.
- Parameters:
hashid_mode (str, default='PATH') – The method used to compute a unique identifier for every image. It can be PATH, PIXELS, or GIVEN. TODO: Add DVC as a method (where it uses the name of the symlink)?
workdir (PathLike) – This is the directory where Frames can store cached results. This SHOULD be specified.
backend (str | Dict) – Determines the backend to use for fast subimage region lookups. This can be either the string 'cog' or 'npy', or a config dictionary for fine-grained backend control. In the dictionary form, 'type' specifies cog or npy. Only COG has additional options, which are:
{
    'type': 'cog',
    'config': {
        'compress': <'LZW' | 'JPEG' | 'DEFLATE' | 'ZSTD' | 'auto'>,
    }
}
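The string and dictionary forms of backend described above can be thought of as two spellings of the same configuration. The sketch below is not ndsampler's code: normalize_backend is a hypothetical helper, and the defaults are copied from the DEFAULT_NPY_CONFIG and DEFAULT_COG_CONFIG attributes listed on this page (with the private _hack_use_cli key left out). It only illustrates how a short string could be expanded into a full config dictionary.

```python
import copy

# Defaults mirrored from the class attributes documented on this page
# (the private '_hack_use_cli' key is intentionally omitted here).
DEFAULT_NPY_CONFIG = {'type': 'npy', 'config': {}}
DEFAULT_COG_CONFIG = {'type': 'cog', 'config': {'compress': 'auto'}}


def normalize_backend(backend):
    """Hypothetical helper: expand a backend argument into a full config dict."""
    defaults = {'npy': DEFAULT_NPY_CONFIG, 'cog': DEFAULT_COG_CONFIG}
    if backend is None:
        # No backend requested, so there is nothing to configure.
        return None
    if isinstance(backend, str):
        # The short string form is equivalent to a dict with only a 'type'.
        backend = {'type': backend}
    backend_type = backend.get('type')
    if backend_type not in defaults:
        raise KeyError('unknown backend type: {!r}'.format(backend_type))
    # Start from a copy of the defaults and overlay any user options.
    full = copy.deepcopy(defaults[backend_type])
    full['config'].update(backend.get('config', {}))
    return full


print(normalize_backend('cog'))
print(normalize_backend({'type': 'cog', 'config': {'compress': 'LZW'}}))
```

Under these assumptions, passing 'cog' and passing {'type': 'cog'} give the same result, and only the keys the caller sets differ from the defaults.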
Example
>>> from ndsampler.abstract_frames import *
>>> self = SimpleFrames.demo(backend='npy')
>>> file = self.load_image(1)
>>> print('file = {!r}'.format(file))
>>> assert self.load_image(1).shape == (512, 512, 3)
>>> assert self.load_region(1, (slice(-20), slice(-10))).shape == (492, 502, 3)
>>> # xdoctest: +REQUIRES(module:osgeo)
>>> self = SimpleFrames.demo(backend='cog')
>>> assert self.load_image(1).shape == (512, 512, 3)
>>> assert self.load_region(1, (slice(-20), slice(-10))).shape == (492, 502, 3)
Benchmark
>>> from ndsampler.abstract_frames import *  # NOQA
>>> import ubelt as ub
>>> #
>>> ti = ub.Timerit(100, bestof=3, verbose=2)
>>> #
>>> self = SimpleFrames.demo(backend='cog')
>>> for timer in ti.reset('cog-small-subregion'):
>>>     self.load_image(1)[10:42, 10:42]
>>> #
>>> self = SimpleFrames.demo(backend='npy')
>>> for timer in ti.reset('npy-small-subregion'):
>>>     self.load_image(1)[10:42, 10:42]
>>> print('----')
>>> #
>>> self = SimpleFrames.demo(backend='cog')
>>> for timer in ti.reset('cog-large-subregion'):
>>>     self.load_image(1)[3:-3, 3:-3]
>>> #
>>> self = SimpleFrames.demo(backend='npy')
>>> for timer in ti.reset('npy-large-subregion'):
>>>     self.load_image(1)[3:-3, 3:-3]
>>> print('----')
>>> #
>>> self = SimpleFrames.demo(backend='cog')
>>> for timer in ti.reset('cog-loadimage'):
>>>     self.load_image(1)
>>> #
>>> self = SimpleFrames.demo(backend='npy')
>>> for timer in ti.reset('npy-loadimage'):
>>>     self.load_image(1)
- DEFAULT_NPY_CONFIG = {'config': {}, 'type': 'npy'}
- DEFAULT_COG_CONFIG = {'_hack_use_cli': True, 'config': {'compress': 'auto'}, 'type': 'cog'}
- property cache_dpath
Returns the path where cached frame representations will be stored.
This will be None if there is no backend.
- property image_ids
- load_region(image_id, region=None, channels=NoParam, width=None, height=None)
Amortized O(1) image subregion loading (assuming a constant region size).
If the region size varies, sampling time scales with the number of tiles needed to cover the requested region.
- Parameters:
image_id (int) – image identifier
region (Tuple[slice, …]) – space-time region within an image
channels (str) – NotImplemented
width (int) – if the width of the entire image is known, specify it
height (int) – if the height of the entire image is known, specify it
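The scaling note for load_region can be made concrete with a little arithmetic. The sketch below is an illustration only, not the library's implementation: resolve_region and count_tiles are hypothetical helpers, the 256-pixel square tile size is an assumed value (the real block size depends on how the cached file was written), and slice steps are ignored. It resolves a (row slice, column slice) region against the image size and counts how many tiles the region touches.

```python
def resolve_region(region, height, width):
    """Hypothetical helper: turn (row_slice, col_slice) into concrete bounds."""
    row_slice, col_slice = region
    # slice.indices resolves None and negative values against the axis length.
    r0, r1, _ = row_slice.indices(height)
    c0, c1, _ = col_slice.indices(width)
    return (r0, r1), (c0, c1)


def count_tiles(region, height, width, tile_size=256):
    """Count the tiles a region overlaps, assuming square tiles of tile_size."""
    (r0, r1), (c0, c1) = resolve_region(region, height, width)
    if r1 <= r0 or c1 <= c0:
        return 0  # empty region touches no tiles
    # Index of the last touched tile minus index of the first, plus one.
    n_rows = (r1 - 1) // tile_size - r0 // tile_size + 1
    n_cols = (c1 - 1) // tile_size - c0 // tile_size + 1
    return n_rows * n_cols


# The region used in the examples on this page, on a 512 x 512 image.
print(resolve_region((slice(-20), slice(-10)), 512, 512))
print(count_tiles((slice(-20), slice(-10)), 512, 512))
print(count_tiles((slice(10, 42), slice(10, 42)), 512, 512))
```

Under the assumed tile size, a small 32 x 32 window that stays inside one tile costs a single tile read, while a window of the same size that straddles a tile boundary costs more. This is why constant-size regions give roughly constant access time.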
- load_image(image_id, channels=NoParam, cache=True, noreturn=False)
Load the image data for a particular image id
- Parameters:
image_id (int) – the id of the image to load
cache (bool, default=True) – ensure and return the efficient backend cached representation.
channels – NotImplemented
noreturn (bool, default=False) – if True, nothing is returned. This is useful if you simply want to ensure the cached representation.
CAREFUL: THIS NEEDS TO MAINTAIN A STABLE API. OTHER PROJECTS DEPEND ON IT.
- Returns:
an indexable array-like representation, possibly memmapped.
- Return type:
ArrayLike
- load_frame(image_id)
TODO: FINISHME or rename to lazy frame?
Returns a frame object that lazy loads on slice
- prepare(gids=None, workers=0, use_stamp=True)
Precompute the cached frame conversions
- Parameters:
gids (List[int] | None) – specific image ids to prepare. If None, prepare all images.
workers (int, default=0) – number of parallel threads for this io-bound task
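The serial and threaded behaviour of a prepare-style loop can be sketched with the standard library. This is an illustration under assumptions rather than the code prepare actually runs: prepare_all and ensure_cached are hypothetical names, the in-memory dict stands in for the on-disk cache, and treating workers=0 as "run serially" is an assumption drawn from the default value.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the on-disk cache directory (hypothetical).
cache = {}


def ensure_cached(gid):
    """Pretend to convert one image; report whether the cache already had it."""
    hit = gid in cache
    cache.setdefault(gid, 'converted-{}'.format(gid))
    return 'hit' if hit else 'miss'


def prepare_all(image_ids, func, workers=0):
    """Hypothetical prepare-style loop; results come back in input order."""
    image_ids = list(image_ids)
    if workers <= 0:
        # Assumed meaning of workers=0: no thread pool, plain serial loop.
        return [func(gid) for gid in image_ids]
    # Threads suit this workload because it is dominated by file io.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, image_ids))


print(prepare_all([1, 2, 3], ensure_cached))             # first pass: misses
print(prepare_all([1, 2, 3], ensure_cached, workers=3))  # second pass: hits
```

The second call finds every entry already present, which mirrors the miss-then-hit pattern exercised in the examples for this method.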
Example
>>> from ndsampler.abstract_frames import *
>>> workdir = ub.Path.appdir('ndsampler/tests/test_cog_precomp').ensuredir()
>>> print('workdir = {!r}'.format(workdir))
>>> ub.delete(workdir)
>>> ub.ensuredir(workdir)
>>> self = SimpleFrames.demo(backend='npy', workdir=workdir)
>>> print('self = {!r}'.format(self))
>>> print('self.cache_dpath = {!r}'.format(self.cache_dpath))
>>> #_ = ub.cmd('tree ' + workdir, verbose=3)
>>> self.prepare()
>>> self.prepare()
>>> #_ = ub.cmd('tree ' + workdir, verbose=3)
>>> _ = ub.cmd('ls ' + self.cache_dpath, verbose=3)
Example
>>> from ndsampler.abstract_frames import *
>>> import ndsampler
>>> workdir = ub.Path.appdir('ndsampler/tests/test_cog_precomp2')
>>> workdir.delete()
>>> # TEST NPY
>>> #
>>> sampler = ndsampler.CocoSampler.demo(workdir=workdir, backend='npy')
>>> self = sampler.frames
>>> ub.delete(self.cache_dpath)  # reset
>>> self.prepare()  # serial, miss
>>> self.prepare()  # serial, hit
>>> ub.delete(self.cache_dpath)  # reset
>>> self.prepare(workers=3)  # parallel, miss
>>> self.prepare(workers=3)  # parallel, hit
>>> #
>>> ## TEST COG
>>> # xdoctest: +REQUIRES(module:osgeo)
>>> sampler = ndsampler.CocoSampler.demo(workdir=workdir, backend='cog')
>>> self = sampler.frames
>>> ub.delete(self.cache_dpath)  # reset
>>> self.prepare()  # serial, miss
>>> self.prepare()  # serial, hit
>>> ub.delete(self.cache_dpath)  # reset
>>> self.prepare(workers=3)  # parallel, miss
>>> self.prepare(workers=3)  # parallel, hit
- class ndsampler.abstract_frames.SimpleFrames(id_to_path, workdir=None, backend=None)
Bases: Frames
Basic concrete implementation of frames objects for images where there is a strict one-file-to-one-image mapping (i.e. no auxiliary images).
- Parameters:
id_to_path (Dict) – mapping from image-id to image path
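The subclassing contract described for Frames (overload the constructor and _lookup_gpath) can be pictured with a stand-alone mock. Nothing below imports or reproduces ndsampler: FramesBase and DictFrames are hypothetical stand-ins, and the path is made up. The sketch only shows the shape of a dictionary-backed subclass with a one-file-to-one-image mapping.

```python
class FramesBase:
    """Stand-in for an abstract frames base class; NOT the real ndsampler class."""

    def __init__(self, workdir=None, backend=None):
        self.workdir = workdir
        self.backend = backend

    def _lookup_gpath(self, image_id):
        # Subclasses are responsible for mapping an image id to a path.
        raise NotImplementedError('subclasses map an image id to a path')


class DictFrames(FramesBase):
    """Hypothetical subclass backed by a plain id-to-path dictionary."""

    def __init__(self, id_to_path, workdir=None, backend=None):
        super().__init__(workdir=workdir, backend=backend)
        self.id_to_path = id_to_path

    def _lookup_gpath(self, image_id):
        return self.id_to_path[image_id]

    @property
    def image_ids(self):
        return list(self.id_to_path.keys())


frames = DictFrames({1: '/data/images/0001.png'}, backend='npy')  # made-up path
print(frames.image_ids)
print(frames._lookup_gpath(1))
```

Keeping the id-to-path lookup in a single overridable method means the base class can own the caching and region-loading logic while subclasses only describe where files live.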
Example
>>> from ndsampler.abstract_frames import *
>>> self = SimpleFrames.demo(backend='npy')
>>> pathinfo = self._build_pathinfo(1)
>>> print('pathinfo = {}'.format(ub.urepr(pathinfo, nl=3)))
>>> assert self.load_image(1).shape == (512, 512, 3)
>>> assert self.load_region(1, (slice(-20), slice(-10))).shape == (492, 502, 3)
- property image_ids
- class ndsampler.abstract_frames.AlignableImageData(pathinfo, cache_backend)
Bases: object
Class for sampling channels / frames that are aligned with each other
Todo
- [ ] This is more general than the older way of accessing image data. However, there is a lot more logic that hasn't been profiled, so we may be able to find meaningful optimizations.
- [ ] Make sure adding this didn't significantly hurt performance
- [ ] DEPRECATE THIS IN FAVOR OF NEW KWCOCO DELAYED LOGIC
Example
>>> from ndsampler.abstract_frames import *
>>> frames = SimpleFrames.demo(backend='npy')
>>> pathinfo = frames._build_pathinfo(1)
>>> cache_backend = frames._backend
>>> print('pathinfo = {}'.format(ub.urepr(pathinfo, nl=3)))
>>> self = AlignableImageData(pathinfo, cache_backend)
>>> img_region = None
>>> prefused = self._load_prefused_region(img_region)
>>> print('prefused = {!r}'.format(prefused))
>>> img_region = (slice(0, 10), slice(0, 10))
>>> prefused = self._load_prefused_region(img_region)
>>> print('prefused = {!r}'.format(prefused))