Source: docs/benchmarks/ship-ingest-perf.md

# SHIP CSV Ingestion Performance Benchmark

## Purpose

Compare CSV parsing performance between `main` (sync/in-memory parser) and PR #172 (`codex/p2a-c-ingestion-streaming`, streaming parser). The streaming parser (`parseShipmentCSVFromFile`) reads from disk row-by-row instead of loading the entire file into memory, targeting reduced peak memory usage at scale.

## What Changed (PR #172)

- Added `parseShipmentCSVFromFile()` using the `csv-parse` async iterator (stream mode)
- Refactored `processShipUploadJob` in `ingest.ts` to call the file-based parser directly (no `readFile` call)
- Extracted a shared `parseShipmentRow()` helper to keep behavior parity between the sync and stream parsers

## Test Methodology

### Parser-Only Benchmark

The benchmark measures the CSV parser in isolation, without DB writes, the API server, or detection logic. This isolates parsing and memory-allocation behavior.

1. **Data generation:** `scripts/generate-ship-csv.ts` creates realistic CSV files with configurable row counts. Columns match the expected SHIP CSV format. ~2% of rows contain intentional validation errors (missing fields, bad dates) to exercise error paths.
2. **Measurement approach:**
   - **Wall time:** `performance.now()` around the parser call
   - **Heap delta:** `process.memoryUsage().heapUsed` before/after parsing
   - **Peak RSS:** `/usr/bin/time -l` (macOS) or `/usr/bin/time -v` (Linux)
   - **Throughput:** rows parsed / wall time
3. **Branch comparison:** The script checks out each branch, installs dependencies, and runs the parser. On `main`, it uses `parseShipmentCSV` (sync, full-file read). On the streaming branch, it uses `parseShipmentCSVFromFile` (stream from file).
4. **Repetition:** Each configuration runs 3 times (configurable via the `RUNS` env var). Median values are reported.
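The per-run measurement described above can be sketched as follows. This is a minimal illustration, not the benchmark script itself: `parseFile` is a hypothetical stand-in for the real parser call (`parseShipmentCSV` or `parseShipmentCSVFromFile`), and its placeholder return values are for demonstration only.

```typescript
import { performance } from "node:perf_hooks";

// Hypothetical stand-in for the real parser (parseShipmentCSV on main,
// parseShipmentCSVFromFile on the streaming branch).
function parseFile(_path: string): { rows: number; invalid: number } {
  return { rows: 1_000_000, invalid: 20_009 }; // placeholder result for the sketch
}

// Record the per-run metrics the benchmark reports:
// wall time, heap delta, and derived rows/sec.
function measure(path: string) {
  const heapBefore = process.memoryUsage().heapUsed;
  const t0 = performance.now();

  const { rows, invalid } = parseFile(path);

  const wallMs = performance.now() - t0;
  const heapDeltaMb = (process.memoryUsage().heapUsed - heapBefore) / 1024 / 1024;
  const rowsPerSec = wallMs > 0 ? rows / (wallMs / 1000) : Infinity;
  return { wallMs, heapDeltaMb, rows, invalid, rowsPerSec };
}
```

Peak RSS is deliberately not measured in-process; it comes from wrapping the whole Node invocation in `/usr/bin/time`, which captures allocator and runtime overhead that `heapUsed` misses.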
### Running the Benchmark

```bash
# Default: 100k + 1M rows, 3 runs each, main vs streaming branch
./scripts/bench-ship-ingest.sh

# Custom row counts
ROW_COUNTS="100000 500000 1000000" ./scripts/bench-ship-ingest.sh

# Custom branches
BRANCHES="main my-feature-branch" ./scripts/bench-ship-ingest.sh

# More runs for statistical significance
RUNS=5 ./scripts/bench-ship-ingest.sh
```

### Generating Test Data Only

```bash
# 100k rows
npx tsx scripts/generate-ship-csv.ts 100000 /tmp/ship-100k.csv

# 1M rows
npx tsx scripts/generate-ship-csv.ts 1000000 /tmp/ship-1m.csv

# Custom error rate
ERROR_RATE=0.05 npx tsx scripts/generate-ship-csv.ts 100000 /tmp/ship-100k-5pct-errors.csv
```

## Final Validation Results (2026-02-13)

This is the final P2A-C done-metric validation run after PR #185 merged. It validates current `main` at 1M rows.

Command:

```bash
BRANCHES="main" ROW_COUNTS="1000000" RUNS=3 ./scripts/bench-ship-ingest.sh
```

### Environment

| Property | Value |
|----------|-------|
| Machine | arm64 MacBook (Darwin 25.2.0) |
| CPU | Apple M4 |
| RAM | 24 GB |
| Node.js | v24.10.0 |
| OS | macOS (Darwin Kernel 25.2.0) |
| Date | 2026-02-13 20:39 UTC |

### Raw Runs (1M rows, `main`)

| Run | Wall Time (ms) | Heap Delta (MB) | Peak RSS (MB) | Rows Parsed | Invalid Rows | Rows/sec |
|-----|----------------|-----------------|---------------|-------------|--------------|----------|
| 1 | 6,028 | 1,652.3 | 1,526.9 | 979,991 | 20,009 | 162,573 |
| 2 | 5,930 | 1,677.4 | 1,721.4 | 979,991 | 20,009 | 165,259 |
| 3 | 5,898 | 1,666.3 | 1,567.2 | 979,991 | 20,009 | 166,156 |

### Median Summary (3 runs)

| Branch | Row Count | Median Wall (ms) | Median Heap Delta (MB) | Median Peak RSS (MB) | Median Rows/sec |
|--------|-----------|------------------|------------------------|----------------------|-----------------|
| `main` | 1,000,000 | 5,930 | 1,666.3 | 1,567.2 | 165,259 |

### Notes

- This run is parser-only (no DB writes, no API server, no detection pipeline), matching the benchmark method.
- `Invalid Rows` is the parser's skipped/error row count for the run.
- The CSV had ~2% intentional invalid rows; the parsed row count and error count were stable across runs.
- Prior branch-to-branch comparison work was captured during PR #174; this final run is the closeout validation on merged `main`.

## Key Metrics to Watch

| Metric | Why It Matters |
|--------|----------------|
| **Peak RSS at 1M rows** | Main goal of streaming: avoid loading the entire file into memory |
| **Wall time parity** | Streaming should not be significantly slower |
| **Rows/sec consistency** | Throughput should scale linearly with row count |
| **Error count parity** | Both parsers should report identical error counts |

## Caveats

- This benchmarks the parser only. Full ingestion (DB upserts, detection, rollup) adds significant time.
- Peak RSS includes Node.js runtime + tsx transpilation overhead (~50-80 MB baseline).
- The streaming parser still accumulates all `ParsedShipment` objects in memory (it does not stream them to the DB). The memory savings come from not loading the raw CSV string into memory.
- For true streaming-to-DB, a future iteration would need to yield batches from the parser and write them incrementally.

## Files

| File | Purpose |
|------|---------|
| `scripts/generate-ship-csv.ts` | CSV test data generator |
| `scripts/bench-ship-ingest.sh` | Benchmark orchestrator |
| `docs/benchmarks/ship-ingest-perf.md` | This file (results template) |
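As a rough illustration of the batch-yielding direction noted in the caveats, an async generator can yield fixed-size batches so a consumer writes them incrementally instead of accumulating every row. This is an assumption-laden sketch, not the planned implementation: the real parser uses `csv-parse` and `parseShipmentRow()`, while this version uses stdlib `node:readline` with a naive `split(",")` (no quoted-field handling), and `insertShipments` in the commented consumer is hypothetical.

```typescript
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

type ParsedShipment = Record<string, string>;

// Naive row parser stand-in for parseShipmentRow(); returns null for
// malformed rows, which the generator skips (mirroring "Invalid Rows").
function parseRow(header: string[], line: string): ParsedShipment | null {
  const cells = line.split(",");
  if (cells.length !== header.length) return null;
  return Object.fromEntries(header.map((h, i) => [h, cells[i]]));
}

// Yield fixed-size batches instead of accumulating all ParsedShipment
// objects in memory — the change the streaming-to-DB caveat describes.
async function* parseInBatches(
  path: string,
  batchSize = 1000,
): AsyncGenerator<ParsedShipment[]> {
  const rl = createInterface({ input: createReadStream(path), crlfDelay: Infinity });
  let header: string[] | null = null;
  let batch: ParsedShipment[] = [];
  for await (const line of rl) {
    if (header === null) {
      header = line.split(","); // first line is the header row
      continue;
    }
    const row = parseRow(header, line);
    if (row === null) continue; // skip invalid row
    batch.push(row);
    if (batch.length >= batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the final partial batch
}

// Consumer sketch: write each batch as it arrives (insertShipments is
// hypothetical) so peak memory is bounded by batchSize, not file size.
// async function ingest(path: string) {
//   for await (const batch of parseInBatches(path)) await insertShipments(batch);
// }
```

With this shape, peak heap no longer grows with row count, at the cost of interleaving parse and DB latency per batch.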