---
title: "Getting Started with shard"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with shard}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  message = FALSE,
  warning = FALSE
)
```

```{r load-shard}
library(shard)
```

R's parallel tools make it easy to fan out work, but they leave you to manage the hard parts yourself: duplicated memory, runaway workers, invisible copy-on-write. shard handles all of that so you can focus on the computation.

The core idea is simple: **share inputs once, write outputs to a buffer, let shard supervise the workers.**

## A first example

Suppose you have a large matrix and want to compute column means in parallel. With shard, you share the matrix, allocate an output buffer, and map over column indices:

```{r first-example}
set.seed(42)
X <- matrix(rnorm(5000), nrow = 100, ncol = 50)

# Share the matrix (zero-copy for workers)
X_shared <- share(X)

# Allocate an output buffer
out <- buffer("double", dim = ncol(X))

# Define column shards and run
blocks <- shards(ncol(X), workers = 2)
run <- shard_map(
  blocks,
  borrow = list(X = X_shared),
  out = list(out = out),
  workers = 2,
  fun = function(shard, X, out) {
    for (j in shard$idx) {
      out[j] <- mean(X[, j])
    }
  }
)

# Read results from the buffer
result <- out[]
head(result)
```

No per-worker serialization of the full matrix. No list of return values to reassemble. The workers wrote directly into `out`.

## The three core objects

shard's workflow revolves around three objects:

| Object | Constructor | Purpose |
|:-------|:------------|:--------|
| Shared input | `share()` | Immutable, zero-copy data visible to all workers |
| Output buffer | `buffer()` | Writable shared memory that workers fill in |
| Shard descriptor | `shards()` | Index ranges that partition the work |

### Sharing inputs

`share()` places an R object into shared memory.
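To see what this saves, consider the copy-based alternative: backends such as `parallel` serialize the object once per worker. A quick base-R sketch of that per-worker cost (this uses no shard functions; the sizes are illustrative):

```{r serialize-cost}
# A matrix of 1e6 doubles occupies roughly 8 MB in memory
X_big <- matrix(rnorm(1e6), nrow = 1000)

# Bytes each worker would receive under copy-based transfer;
# with 8 workers, the duplicated memory is roughly 8x this figure
length(serialize(X_big, NULL))
```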
Workers attach to the same segment instead of receiving a copy:

```{r share-input}
X_shared <- share(X)
is_shared(X_shared)
shared_info(X_shared)
```

Shared objects are **read-only** by default. Any attempt to modify them in a worker raises an error, which prevents silent copy-on-write bugs.

### Output buffers

`buffer()` creates typed shared memory that workers write to using standard R indexing:

```{r buffer-demo}
buf <- buffer("double", dim = c(10, 5))
buf[1:5, 1] <- rnorm(5)
buf[6:10, 1] <- rnorm(5)
buf[, 1]
```

Buffers support `"double"`, `"integer"`, `"logical"`, and `"raw"` types. For matrices and arrays, pass a `dim` vector:

```{r buffer-types}
int_buf <- buffer("integer", dim = 100)
mat_buf <- buffer("double", dim = c(50, 20))
```

### Shard descriptors

`shards()` partitions a range of indices into chunks for parallel execution. It auto-tunes the block size based on the number of workers:

```{r shards-demo}
blocks <- shards(1000, workers = 4)
blocks
```

Each shard carries an `idx` field with its assigned indices:

```{r shard-indices}
blocks[[1]]$idx[1:10]  # first 10 indices of shard 1
```

## Running shard_map()

`shard_map()` is the engine. It dispatches shards to a supervised worker pool, passes shared inputs, and collects diagnostics:

```{r shard-map-example}
set.seed(1)
X <- matrix(rnorm(2000), nrow = 100, ncol = 20)
X_shared <- share(X)
col_sds <- buffer("double", dim = ncol(X))
blocks <- shards(ncol(X), workers = 2)

run <- shard_map(
  blocks,
  borrow = list(X = X_shared),
  out = list(col_sds = col_sds),
  workers = 2,
  fun = function(shard, X, col_sds) {
    for (j in shard$idx) {
      col_sds[j] <- sd(X[, j])
    }
  }
)

# Results are already in the buffer
sd_values <- col_sds[]

# Verify against base R
all.equal(sd_values, apply(X, 2, sd))
```

### What if workers return values?
If your function returns a value (instead of writing to a buffer), shard gathers the results:

```{r return-values}
blocks <- shards(10, workers = 2)
run <- shard_map(
  blocks,
  workers = 2,
  fun = function(shard) {
    sum(shard$idx)
  }
)
results(run)
```

Buffers are preferred for large outputs because they avoid serializing results back to the main process. Use return values for small summaries.

## Convenience wrappers

For common patterns, shard provides wrappers that handle sharing, sharding, and buffering automatically.

### Column-wise apply

`shard_apply_matrix()` applies a scalar function over each column of a matrix:

```{r apply-matrix}
set.seed(1)
X <- matrix(rnorm(2000), nrow = 100, ncol = 20)
y <- rnorm(100)

# Correlate each column of X with y
cors <- shard_apply_matrix(
  X,
  MARGIN = 2,
  FUN = function(v, y) cor(v, y),
  VARS = list(y = y),
  workers = 2
)
head(cors)
```

The matrix is auto-shared, columns are dispatched as shards, and results are collected into a vector.

### List lapply

`shard_lapply_shared()` is a parallel `lapply()` with automatic sharing of large list elements:

```{r lapply-shared}
chunks <- lapply(1:10, function(i) rnorm(100))

means <- shard_lapply_shared(
  chunks,
  FUN = mean,
  workers = 2
)
unlist(means)
```

## Diagnostics

Every `shard_map()` call records timing, memory, and worker statistics. Use `report()` to inspect them:

```{r diagnostics}
report(result = run)
```

For focused views:

- `mem_report(run)` -- peak and baseline RSS per worker
- `copy_report(run)` -- bytes transferred through buffers
- `task_report(run)` -- per-chunk execution times and retry counts

## Worker pool management

By default, `shard_map()` creates a worker pool on first use and reuses it.
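As a minimal sketch of that default behavior (using only functions introduced above; not evaluated here), two consecutive calls run against the same implicitly created pool:

```{r implicit-pool, eval = FALSE}
# The first call spins up the pool on demand...
r1 <- shard_map(shards(100, workers = 2), workers = 2,
                fun = function(s) length(s$idx))

# ...and the second reuses it, paying no worker startup cost
r2 <- shard_map(shards(100, workers = 2), workers = 2,
                fun = function(s) max(s$idx))
```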
You can also manage the pool explicitly:

```{r pool-management, eval = FALSE}
# Create a pool with 4 workers and a 1GB memory cap
pool_create(n = 4, rss_limit = "1GB")

# Check pool health
pool_status()

# Run multiple shard_map() calls (reuses the same pool)
run1 <- shard_map(shards(1000), workers = 4, fun = function(s) sum(s$idx))
run2 <- shard_map(shards(500), workers = 4, fun = function(s) mean(s$idx))

# Shut down workers when done
pool_stop()
```

Workers are supervised: if a worker's memory usage drifts beyond the threshold, shard recycles it automatically.

## Copy-on-write protection

Shared inputs are immutable by default (`cow = "deny"`). This prevents a common class of parallel bugs where a worker accidentally modifies shared data, triggering a silent copy:

```{r cow-deny, eval = FALSE}
shard_map(
  shards(10),
  borrow = list(X = share(matrix(1:100, 10, 10))),
  workers = 2,
  cow = "deny",
  fun = function(shard, X) {
    X[1, 1] <- 999  # Error: mutation denied
  }
)
```

You can relax this with `cow = "audit"` (detect and report mutations) or `cow = "allow"` (permit copy-on-write with tracking). See `?shard_map` for details.

## Clean up

```{r cleanup, include = FALSE}
try(pool_stop(), silent = TRUE)
```

When you are done, stop the pool to release worker processes:

```{r cleanup-show, eval = FALSE}
pool_stop()
```

## Next steps

- `?shard_map` -- full reference for the parallel engine
- `?share` -- sharing options and backing types
- `?buffer` -- buffer types and matrix/array support
- `?report` -- diagnostic reports and recommendations
- `?shard_apply_matrix` -- column-wise parallel apply
- `?pool_create` -- pool configuration and memory limits