module arrow_source

ArrowMarkerSource: drives the analysis commands directly from in-memory Arrow record batches with no TSV round-trip.

Built from Arrow IPC bytes, the source synthesises the same TableHeader / groups / Marker shapes that the path-based reader produces, so the existing command logic does not need to know which kind of source it received.

Enums

enum ArrowSourceError
EmptyInput
HeterogeneousSchemas
ShortSchema(usize)
IpcRead(String)

Traits implemented

impl std::fmt::Display for ArrowSourceError
impl std::error::Error for ArrowSourceError
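As a rough illustration of how the variants and the two trait impls listed above fit together, here is a minimal sketch. The variant names come from the listing; the message wording, the meaning of each payload, and the `describe` helper are assumptions made for the example, not the crate's actual text.

```rust
use std::error::Error;
use std::fmt;

/// Variants as listed above. Payload meanings noted below are assumptions.
#[derive(Debug)]
pub enum ArrowSourceError {
    EmptyInput,
    HeterogeneousSchemas,
    ShortSchema(usize), // assumed: number of columns actually found
    IpcRead(String),    // assumed: message from the underlying IPC reader
}

impl fmt::Display for ArrowSourceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // All message wording here is hypothetical.
        match self {
            Self::EmptyInput => write!(f, "no record batches in input"),
            Self::HeterogeneousSchemas => {
                write!(f, "record batches do not share one schema")
            }
            Self::ShortSchema(n) => write!(f, "schema has only {} column(s)", n),
            Self::IpcRead(msg) => write!(f, "failed to read Arrow IPC stream: {}", msg),
        }
    }
}

// An empty impl is enough: Debug + Display satisfy the trait's requirements.
impl Error for ArrowSourceError {}

/// Hypothetical helper: callers can treat the error as any `dyn Error`.
pub fn describe(err: &dyn Error) -> String {
    err.to_string()
}
```

Implementing `std::error::Error` is what lets the type pass through `Box<dyn Error>` and the `?` operator in callers that mix error types.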

Structs and Unions

struct ArrowMarkerSource

In-memory Arrow source. Holds Arc’d RecordBatches so cloning is cheap (parallel iteration shares the underlying buffers).
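The cheap-clone property can be sketched with standard-library types only. `Batch` and `Source` below are hypothetical stand-ins for RecordBatch and ArrowMarkerSource, not the real types; the point is only that cloning copies reference-counted pointers rather than the column data.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a record batch: one reference-counted column.
#[derive(Clone)]
pub struct Batch {
    pub depths: Arc<Vec<u16>>,
}

/// Hypothetical stand-in for the source: a list of batches.
#[derive(Clone)]
pub struct Source {
    pub batches: Vec<Batch>,
}

/// True when both sources point at the very same column allocations,
/// i.e. the clone did not copy any depth data.
pub fn shares_buffers(a: &Source, b: &Source) -> bool {
    a.batches.len() == b.batches.len()
        && a.batches
            .iter()
            .zip(&b.batches)
            .all(|(x, y)| Arc::ptr_eq(&x.depths, &y.depths))
}
```

Cloning a `Source` here costs one reference-count increment per column, which is why handing a clone to each worker in a parallel iteration stays inexpensive.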

Implementations

impl ArrowMarkerSource

Functions

fn batches(&self) -> &[RecordBatch]

Borrow the underlying batches (e.g. for re-spilling to Parquet).

fn from_batches(batches: Vec<RecordBatch>, popmap: Option<&Popmap>, min_depth: u16) -> Result<Self, ArrowSourceError>

Build a source from already-decoded RecordBatches. All batches must share the same schema (id, sequence, ind1, …, indN).
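The shared-schema requirement suggests a validation step along the following lines. This is a sketch under stated assumptions: schemas are reduced to lists of column names, `SchemaError` and `check_schemas` are hypothetical names, the order of the checks is a guess, and "short" is assumed to mean fewer than three columns (id, sequence, and at least one individual).

```rust
/// Hypothetical error type mirroring three of the documented variants.
#[derive(Debug, PartialEq)]
pub enum SchemaError {
    EmptyInput,
    HeterogeneousSchemas,
    ShortSchema(usize),
}

/// Validate per-batch schemas (reduced here to column-name lists) and
/// return the number of individual columns.
pub fn check_schemas(schemas: &[Vec<&str>]) -> Result<usize, SchemaError> {
    // No batches at all: nothing to build a source from.
    let first = schemas.first().ok_or(SchemaError::EmptyInput)?;

    // Every batch must match the first batch's schema exactly.
    if schemas.iter().any(|s| s != first) {
        return Err(SchemaError::HeterogeneousSchemas);
    }

    // Assumption: id + sequence + at least one individual column.
    if first.len() < 3 {
        return Err(SchemaError::ShortSchema(first.len()));
    }

    // Everything after id and sequence is an individual column.
    Ok(first.len() - 2)
}
```

For example, `check_schemas(&[vec!["id", "sequence", "ind1", "ind2"]])` yields `Ok(2)` in this sketch, while two batches whose column names differ yield `Err(SchemaError::HeterogeneousSchemas)`.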

fn from_ipc_bytes(bytes: &[u8], popmap: Option<&Popmap>, min_depth: u16) -> Result<Self, ArrowSourceError>

Decode Arrow IPC stream bytes into an in-memory source.

fn min_depth(&self) -> u16

Minimum depth threshold for filtering presence bits during iteration.
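One plausible reading of the threshold is sketched below: an individual counts as present for a marker when its depth reaches min_depth. The inclusive comparison (`>=`) and the function names `presence` and `n_present` are assumptions; the documentation above does not state how the boundary is handled.

```rust
/// Derive presence flags from per-individual depths for one marker.
/// Assumption: a depth equal to `min_depth` counts as present.
pub fn presence(depths: &[u16], min_depth: u16) -> Vec<bool> {
    depths.iter().map(|&d| d >= min_depth).collect()
}

/// Number of individuals that pass the threshold for one marker.
pub fn n_present(depths: &[u16], min_depth: u16) -> usize {
    depths.iter().filter(|&&d| d >= min_depth).count()
}
```

Under this reading, a threshold of 0 marks every individual present, so callers that want "any reads at all" would pass 1.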

fn n_individuals(&self) -> u16

Number of individual depth columns.

fn n_markers(&self) -> u64

Number of markers across all batches (cached on the header).