current CAR processing times (each record processed down to its length as a `usize`; phil's dev machine):
- 450MiB CAR file (huge): `1.3s`
- 128MiB: `350ms`
- 5.0MiB: `6.8ms`
- 279KiB: `170us`
- 3.4KiB: `5.2us`
static GLOBAL: MiMalloc = MiMalloc;
```

- 450MiB CAR file: `1.1s` (-15%)
- 128MiB: `310ms` (-13%)
- 5.0MiB: `6.1ms` (-10%)
- 279KiB: `160us` (-5%)
- 3.4KiB: `5.7us` (-9%)
- empty: `660ns` (-7%)

processing CARs requires buffering blocks, so it can consume a lot of memory. repo-stream's in-memory driver has minimal overhead beyond the blocks themselves, but there are two ways to make it work with less memory (you can do either or both!):

1. spill blocks to disk
2. inline block processing

#### spill blocks to disk

this is a little slower but can greatly reduce the memory used. there's nothing special you need to do for this.

#### inline block processing

if you don't need to store the complete records, you can have repo-stream try to optimistically apply a processing function to the raw blocks as they are streamed in.

#### constrained mem perf comparison

sketchy benchmark but hey. mimalloc is enabled, and the processing spills to disk. inline processing reduces each entire record to 8 bytes (a `usize` holding the raw record block's size):

- 450MiB CAR file: `5.0s` (4.5x slowdown for disk)
- 128MiB: `1.27s` (4.1x slowdown)

fortunately, most CARs in the ATmosphere are very small, so e.g. for backfill purposes, the vast majority of inputs will not face this slowdown.

#### running the huge-car benchmark

- to avoid committing it to the repo, you have to pass it in through the env for now.