NAME
    Data::RoaringBitmap::Shared - shared-memory Roaring bitmap (compressed
    uint32 set) for Linux

SYNOPSIS
        use Data::RoaringBitmap::Shared;

        # anonymous shared mapping: a 256-slot container pool
        my $a = Data::RoaringBitmap::Shared->new(undef, 256);

        $a->add(5);              # 1 (newly added)
        $a->add(5);              # 0 (already present)
        $a->contains(5);         # true   ('test' is an alias)
        $a->remove(5);           # 1 (removed; 'delete' is an alias)

        $a->add_many([1, 2, 3, 1000, 70000]);   # returns count newly added
        $a->cardinality;         # number of elements  ('count'/'size' are aliases)
        $a->min; $a->max;        # smallest / largest element (undef if empty)
        my $list = $a->to_array; # arrayref of every element, ascending

        # in-place set operations against another bitmap
        my $b = Data::RoaringBitmap::Shared->new(undef, 256);
        $b->add_many([3, 4, 5]);
        $a->union($b);           # a |= b   ('or' is an alias);  returns $a
        $a->intersect($b);       # a &= b   ('and' is an alias);  returns $a

        # share across processes via a backing file
        my $shared = Data::RoaringBitmap::Shared->new("/tmp/ids.bin", 65536);

DESCRIPTION
    A Roaring bitmap in shared memory: a compressed set of 32-bit unsigned
    integers. The 32-bit value space is split into 65536 buckets keyed by
    the high 16 bits of each value; each bucket stores the low 16 bits of
    its members in one of two container kinds, chosen automatically by
    density:

    *   an array container -- a sorted ascending "uint16" array, compact
        when the bucket is sparse (up to 4096 elements);

    *   a bitmap container -- a 65536-bit bitmap, efficient when the bucket
        is dense.

    A bucket starts as an array and is promoted to a bitmap once it would
    exceed 4096 elements. Both kinds occupy one fixed 8192-byte slot from a
    container pool, so the structure stays compact for sparse, clustered,
    and dense sets alike. It supports membership tests, cardinality,
    "min"/"max", "to_array", and in-place "union" and "intersect" with
    another bitmap.

    Because the bucket table and container pool live in a shared mapping,
    several processes share one bitmap: any process that opens the same
    backing file, inherits the anonymous mapping across "fork", or reopens a
    passed memfd, sees the others' elements. A write-preferring futex rwlock
    with dead-process recovery guards every mutation, so concurrent adds,
    removals, and set operations serialize cleanly.

    Values must fit in 32 bits (0 .. 4294967295); a value outside that range
    croaks on "add"/"add_many" and is simply absent for "contains"/"remove".

    v1 scope: array + bitmap containers (no run containers); union and
    intersect only (no "xor" / "andnot" yet); a bitmap container is not
    down-converted back to an array when removals make it sparse again.

    Capacity is fixed at creation. The container pool holds
    "container_capacity" slots; "add", "add_many" and "union" croak (after
    releasing the lock) when the pool is exhausted. A bitmap that touches
    every bucket as a dense bitmap needs all 65536 slots (512 MiB of pool);
    a sparse set needs far fewer. Linux-only. Requires 64-bit Perl.

METHODS
  Constructors
        my $a = Data::RoaringBitmap::Shared->new($path, $container_capacity);
        my $a = Data::RoaringBitmap::Shared->new(undef, $container_capacity); # anonymous
        my $a = Data::RoaringBitmap::Shared->new_memfd($name, $container_capacity);
        my $a = Data::RoaringBitmap::Shared->new_from_fd($fd);

    $path is the backing file ("undef" or omitted for an anonymous mapping).
    $container_capacity (default 256) is the number of 8192-byte container
    slots the pool holds; it must be ">= 1" and "<= 2**20". One slot is
    consumed per non-empty bucket (a bucket is one container regardless of
    whether it is an array or a bitmap), so size the pool for the number of
    distinct high-16 groups your values fall into, with headroom. "new" and
    "new_memfd" croak if the capacity is out of range. A freshly created
    bitmap is empty ("cardinality == 0").

    When reopening an existing file or memfd, the stored geometry wins and
    the existing elements are preserved; the capacity you pass to "new" on a
    reopen is used only when the file is brand new. "new_memfd" creates a
    Linux memfd (transferable via its "memfd" descriptor); "new_from_fd"
    reopens one in another process.

  Adding and removing
        my $new   = $a->add($x);          # 1 if newly added, 0 if already present
        my $new   = $a->set($x);          # alias for add
        my $added = $a->add_many(\@ints); # count of elements newly added
        my $gone  = $a->remove($x);       # 1 if removed, 0 if absent
        my $gone  = $a->delete($x);       # alias for remove

    "add" (aliased "set") inserts $x, returning 1 if it was new or 0 if it
    was already present. $x must be an unsigned integer that fits in 32
    bits; a larger value croaks before any lock is taken. "add" croaks if
    the container pool is exhausted (only adding to a brand-new bucket can
    need a slot); the croak happens after the write lock is released and the
    bitmap is left consistent.

    "add_many" adds every value of the array reference (each range-checked
    up front; an out-of-range value croaks before any lock). It returns the
    number of elements that were newly added. It is not atomic on pool
    exhaustion: values are added in order, and if the pool runs out partway,
    "add_many" croaks with the already-processed elements left as members (a
    set is order-independent, so the added ones are simply present).

    "remove" (aliased "delete") removes $x, returning 1 if it was present or
    0 if it was absent (an out-of-range value is treated as absent). When
    the last element of a bucket is removed, the bucket's container slot is
    returned to the pool. As noted in "DESCRIPTION", a bitmap container is
    not converted back to an array on removal in v1.

  Membership, cardinality, ordering
        my $bool = $a->contains($x);   # true if $x is a member
        my $bool = $a->test($x);       # alias for contains
        my $n    = $a->cardinality;    # number of elements
        my $n    = $a->count;          # alias for cardinality
        my $n    = $a->size;           # alias for cardinality
        my $bool = $a->is_empty;       # true when cardinality == 0
        my $lo   = $a->min;            # smallest element, or undef if empty
        my $hi   = $a->max;            # largest element, or undef if empty
        my $list = $a->to_array;       # arrayref of all elements, ascending

    "contains" (aliased "test") is a read; an out-of-range value is simply
    not a member. "cardinality" (aliased "count"/"size") is the total number
    of elements across all buckets. "min" and "max" scan for the
    smallest/largest member and return "undef" on an empty set. "to_array"
    returns a reference to a new array holding every element in ascending
    order; for a large dense set this can be a long list. (The array is
    pre-sized under a brief read lock and filled under the read lock; it is
    a best-effort snapshot if the set is being mutated concurrently.)

  Set operations (in place)
        $a->union($other);       # a |= b  (set union);        returns $a
        $a->or($other);          # alias for union
        $a->intersect($other);   # a &= b  (set intersection);  returns $a
        $a->and($other);         # alias for intersect

    "union" (aliased "or") adds every element of $other to the receiver;
    "intersect" (aliased "and") keeps only the elements present in both.
    Both modify the receiver in place and return it (so calls chain). $other
    must be a "Data::RoaringBitmap::Shared" object; combining a bitmap with
    itself ("$a->union($a)") is a no-op.

    "union" may need new container slots (one per bucket that $other
    occupies and the receiver does not); it pre-checks capacity and croaks
    before mutating (after releasing the locks) if the pool cannot satisfy
    the operation, leaving the receiver unchanged. "intersect" only ever
    shrinks or frees containers, so it never needs new slots and never
    croaks for capacity.

    Locking note: a set operation takes the receiver's write lock and
    $other's read lock. To stay deadlock-free even if two processes run
    "$a->union($b)" and "$b->union($a)" simultaneously, the two locks are
    acquired in a fixed order keyed on a per-bitmap identity stored in
    shared memory (so both processes compute the same order regardless of
    how the handles happen to be laid out in each process), and the receiver
    always takes the write lock. Calling a set op with two handles to the
    same underlying bitmap (the same object, or a second handle that
    reopened the same backing file or memfd) is detected and is a no-op.

  Lifecycle
        $a->clear;                                  # remove every element
        $a->path; $a->memfd; $a->sync; $a->unlink;  # or Class->unlink($path)

    "clear" empties the bitmap and returns every container slot to the pool.
    "sync" flushes the mapping to its backing store (a no-op for anonymous
    and memfd bitmaps); "unlink" removes the backing file (also callable as
    "Class->unlink($path)"); "path" returns the backing path ("undef" for
    anonymous, memfd, or fd-reopened bitmaps) and "memfd" the backing
    descriptor -- the memfd of a "new_memfd" bitmap or the dup'd fd of a
    "new_from_fd" bitmap, and -1 for file-backed or anonymous bitmaps.

STATS
    stats() returns a hashref describing the bitmap:

    *   "cardinality" -- the total number of elements (same as
        "cardinality").

    *   "containers_used" -- the 1-based high-water of container slots
        allocated since creation or the last "clear" (slot 0 is the reserved
        NULL sentinel, so an empty pool reports 1).

    *   "containers_capacity" -- the fixed container-pool capacity.

    *   "buckets_used" -- the number of non-empty buckets (each is one
        container, array or bitmap).

    *   "ops" -- running count of mutating operations ("add", "add_many",
        "remove", "union", "intersect", "clear").

    *   "mmap_size" -- bytes of the shared mapping.

SHARING ACROSS PROCESSES
    The bitmap lives in a shared mapping, shared the same three ways as the
    rest of the family: a backing file (every process calls "new($path,
    ...)" on the same path), an anonymous mapping inherited across "fork",
    or a memfd whose descriptor is passed to an unrelated process (over a
    UNIX socket via "SCM_RIGHTS", or via "/proc/$pid/fd/$n") and reopened
    with new_from_fd($fd). Because the mapping is shared, every process adds
    to and queries the same bitmap. All mutation is serialized by the write
    lock, so the final contents are independent of how the processes
    interleave.

        # parent and children share one bitmap with no coordination
        my $a = Data::RoaringBitmap::Shared->new(undef, 65536);   # before fork
        unless (fork) { $a->add($_) for 1 .. 100000; exit }
        wait;
        print $a->cardinality, "\n";   # reflects the child's adds

SECURITY
    The mmap region is writable by all processes that open it, and a
    reopened file is trusted: validation checks the header geometry but
    cannot vet every internal container offset against a maliciously crafted
    file. A hostile file could direct a read to an out-of-range container
    slot. Do not share backing files with, or reopen files from, untrusted
    processes.

CRASH SAFETY
    Mutation is guarded by a futex-based write-preferring rwlock with
    PID-encoded ownership; if a holder dies, the next contender detects the
    dead owner and recovers. Because every mutation performs its container
    allocations under the lock after a capacity pre-check, a crash leaves
    the bitmap consistent up to the last completed operation. Limitation:
    PID reuse is not detected (very unlikely in practice).

SEE ALSO
    Data::RadixTree::Shared, Data::DisjointSet::Shared,
    Data::Intern::Shared, Data::SortedSet::Shared,
    Data::SpatialHash::Shared, Data::Histogram::Shared,
    Data::CountMinSketch::Shared, Data::HyperLogLog::Shared,
    Data::BloomFilter::Shared, and the rest of the "Data::*::Shared" family.

AUTHOR
    vividsnow

LICENSE
    This is free software; you can redistribute it and/or modify it under
    the same terms as Perl itself.

