annbatch.ChunkSampler

class annbatch.ChunkSampler(chunk_size, preload_nchunks, batch_size, *, mask=None, shuffle=False, drop_last=False, rng=None)

Chunk-based sampler for batched data access.

Parameters:
batch_size int

Number of observations per batch.

chunk_size int

Size of each chunk, i.e. the number of observations spanned by each yielded chunk.

mask slice | None (default: None)

A slice defining the observation range to sample from (start:stop).

shuffle bool (default: False)

Whether to shuffle both the chunk order and the index order.

preload_nchunks int

Number of chunks to load per iteration.

drop_last bool (default: False)

Whether to drop the last incomplete batch.

rng Generator | None (default: None)

Random number generator for shuffling.
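To make the interplay of these parameters concrete, here is a minimal, self-contained sketch. It is not annbatch's implementation: the function `iter_batches`, the `seed` argument (a stand-in for `rng`), and the choice to carry leftover indices over to the next group of chunks are all assumptions made for illustration.

```python
import random
from typing import Iterator, List, Optional


def iter_batches(
    n_obs: int,
    chunk_size: int,
    preload_nchunks: int,
    batch_size: int,
    *,
    mask: Optional[slice] = None,
    shuffle: bool = False,
    drop_last: bool = False,
    seed: Optional[int] = None,
) -> Iterator[List[int]]:
    """Hypothetical helper: yield batches of observation indices."""
    rng = random.Random(seed)  # assumption: stand-in for the `rng` parameter
    # Resolve the mask to a concrete start:stop range (any step is ignored here).
    start, stop, _ = (mask or slice(None)).indices(n_obs)
    # Split the range into contiguous chunks of at most `chunk_size` observations.
    chunks = [
        range(lo, min(lo + chunk_size, stop))
        for lo in range(start, stop, chunk_size)
    ]
    if shuffle:
        rng.shuffle(chunks)  # shuffle chunk order
    carry: List[int] = []
    for i in range(0, len(chunks), preload_nchunks):
        # Pool the indices of `preload_nchunks` chunks, as if loaded together.
        pool = [idx for chunk in chunks[i:i + preload_nchunks] for idx in chunk]
        if shuffle:
            rng.shuffle(pool)  # shuffle index order within the pool
        pool = carry + pool
        n_full = len(pool) // batch_size * batch_size
        for j in range(0, n_full, batch_size):
            yield pool[j:j + batch_size]
        carry = pool[n_full:]  # assumption: leftovers roll into the next pool
    if carry and not drop_last:
        yield carry  # final incomplete batch


print(list(iter_batches(10, 4, 2, 3)))
print(list(iter_batches(10, 4, 2, 3, drop_last=True)))
```

With 10 observations, `chunk_size=4`, `preload_nchunks=2` and `batch_size=3`, the sketch yields three full batches plus one incomplete batch of a single index; `drop_last=True` omits that final batch.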

Attributes table

batch_size

The batch size for data loading.

shuffle

Whether data is shuffled.

Methods table

sample(n_obs)

Sample load requests given the total number of observations.

validate(n_obs)

Validate the sampler configuration against the loader's n_obs.

Attributes

ChunkSampler.batch_size
ChunkSampler.shuffle

Methods

ChunkSampler.sample(n_obs)

Sample load requests given the total number of observations.

The base implementation simply calls validate() and then yields from _sample().

Parameters:
n_obs int

The total number of observations available.

Yields:

LoadRequest – Load requests for batching data.

Return type:

Iterator[LoadRequest]
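The validate-then-yield pattern described for sample() can be sketched as follows. The classes `ToyLoadRequest`, `ToySampler` and `ToyRangeSampler` are hypothetical stand-ins, not annbatch types; in particular, the fields of the real LoadRequest are not documented here, so `start` and `stop` are assumptions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterator


@dataclass
class ToyLoadRequest:
    """Hypothetical stand-in for LoadRequest (fields are assumed)."""
    start: int
    stop: int


class ToySampler(ABC):
    def sample(self, n_obs: int) -> Iterator[ToyLoadRequest]:
        # Validate first, then delegate to the subclass hook.
        self.validate(n_obs)
        yield from self._sample(n_obs)

    @abstractmethod
    def validate(self, n_obs: int) -> None:
        """Raise ValueError if the configuration does not fit n_obs."""

    @abstractmethod
    def _sample(self, n_obs: int) -> Iterator[ToyLoadRequest]:
        """Yield the actual load requests."""


class ToyRangeSampler(ToySampler):
    def __init__(self, chunk_size: int) -> None:
        self.chunk_size = chunk_size

    def validate(self, n_obs: int) -> None:
        if n_obs <= 0:  # assumed check, for illustration only
            raise ValueError("n_obs must be positive")

    def _sample(self, n_obs: int) -> Iterator[ToyLoadRequest]:
        # One request per contiguous chunk of observations.
        for lo in range(0, n_obs, self.chunk_size):
            yield ToyLoadRequest(lo, min(lo + self.chunk_size, n_obs))


for request in ToyRangeSampler(chunk_size=4).sample(10):
    print(request)
```

Because `sample()` is a generator in this sketch, `validate()` runs when iteration starts rather than when `sample()` is called. Whether annbatch behaves the same way is not stated in this reference.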

ChunkSampler.validate(n_obs)

Validate the sampler configuration against the loader’s n_obs.

Parameters:
n_obs int

The total number of observations in the loader.

Raises:

ValueError – If the sampler configuration is invalid for the given n_obs.

Return type:

None
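This reference does not list which conditions validate() checks. As a purely illustrative sketch, a check of this kind might compare the mask against n_obs. The function `validate_mask` and both of its conditions are assumptions, not annbatch behaviour.

```python
from typing import Optional


def validate_mask(mask: Optional[slice], n_obs: int) -> None:
    """Hypothetical check: raise ValueError if the mask does not fit n_obs."""
    if mask is None:
        return  # no mask means the whole range 0:n_obs is sampled
    start = 0 if mask.start is None else mask.start
    stop = n_obs if mask.stop is None else mask.stop
    if start < 0 or stop > n_obs:
        raise ValueError(f"mask {start}:{stop} falls outside 0:{n_obs}")
    if start >= stop:
        raise ValueError(f"mask {start}:{stop} selects no observations")


validate_mask(slice(2, 8), n_obs=10)  # passes silently and returns None
try:
    validate_mask(slice(0, 11), n_obs=10)
except ValueError as err:
    print(f"invalid configuration: {err}")
```

Returning None on success and raising ValueError on failure mirrors the documented contract, so a caller can wrap the call in try/except before starting iteration.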