ci: reduce evaluation memory

Two changes to bring down peak memory usage during CI evaluation:
- We're no longer evaluating homes, which was basically a double-eval
anyway until we get macOS/etc. up
- We're splitting system evals apart from each other, which takes
longer overall but reduces the peak memory usage from >10GB to ~3GB
- >10GB was unsustainable for midnight: we were constantly OOMing
whenever CI was accidentally triggered twice
- ~3GB is very sustainable for midnight :)
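
For illustration only, here is a minimal Python sketch of the "split the
evals apart" idea. It is not the actual CI configuration: the function
names, the system names, and the stand-in command are all hypothetical.
The point it shows is that each evaluation runs in its own process, one
after another, so memory is released between systems and the peak is
roughly that of the largest single evaluation.

```python
import subprocess
import sys


def evaluate_separately(systems, build_command):
    """Run one evaluation per system, each in its own process, one at a time.

    Each child process exits before the next one starts, so its memory is
    released in between. Returns a mapping of system name to exit code.
    """
    results = {}
    for system in systems:
        # subprocess.run blocks until the child exits, so runs never overlap.
        completed = subprocess.run(build_command(system), check=False)
        results[system] = completed.returncode
    return results


def example_command(system):
    # Hypothetical stand-in: a real CI job would invoke its evaluator here.
    return [sys.executable, "-c", f"print('evaluated {system}')"]


if __name__ == "__main__":
    # "alpha" and "beta" are placeholder system names, not real hosts.
    print(evaluate_separately(["alpha", "beta"], example_command))
```

The trade-off matches the one described above: running the evaluations
sequentially costs wall-clock time in exchange for a lower memory peak.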