checkpoint-lite

- Designed and implemented a concurrent, session-oriented snapshot/restore tool that orchestrates CRIU and OverlayFS to capture both filesystem and in-memory process state.
- Achieved snapshot performance close to that of CRIU alone, with a total snapshot + restore time of ~4 s for a 2 GB workload, over 6× faster than Podman’s checkpoint workflow in the same use case.
- Added parallelized session handling to support multiple independent snapshot/restore operations with minimal overhead and without heavyweight container abstractions.
- Engineered for integration into agent workflows and general-purpose process management, balancing speed with session isolation.
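
The parallelized session handling described above can be illustrated with a minimal sketch. This is not the tool's actual code: `run_sessions`, its parameters, and the idea of passing a per-session callable are all hypothetical names chosen for illustration, and a thread pool is assumed only because the per-session work is mostly waiting on external processes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_sessions(
    session_ids: Iterable[str],
    operation: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run one operation per session concurrently (hypothetical helper).

    Each session is handled independently; results are keyed by session id.
    """
    ids = list(session_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input ids.
        # If any operation raises, the exception surfaces here.
        results = pool.map(operation, ids)
        return dict(zip(ids, results))
```

In practice `operation` would wrap a snapshot or restore call for a single session; here it can be any function that takes a session id and returns a status string.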
Challenges / Creative solutions: Built a hybrid state capture mechanism combining CRIU memory dumps with OverlayFS layer tracking to avoid full filesystem duplication while ensuring consistency.
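
A rough sketch of how such a hybrid capture might be wired together, assuming a per-session directory holding `upper`, `work`, `merged`, and `images` subdirectories. The directory layout, the function names, and the use of `tar` to archive the upper layer are assumptions for illustration, not the tool's actual design. The functions only build command lines (using standard `mount`, `criu`, and `tar` options), so the sketch can be inspected without root privileges.

```python
from pathlib import Path
from typing import List


def overlay_mount_cmd(lower: Path, session_dir: Path) -> List[str]:
    """Mount command for a session overlay: writes land in upper/, lower stays read-only."""
    opts = (
        f"lowerdir={lower},"
        f"upperdir={session_dir / 'upper'},"
        f"workdir={session_dir / 'work'}"
    )
    return ["mount", "-t", "overlay", "overlay", "-o", opts, str(session_dir / "merged")]


def criu_dump_cmd(pid: int, session_dir: Path) -> List[str]:
    """CRIU dump of the process tree rooted at pid into the session's images/ directory."""
    return [
        "criu", "dump",
        "--tree", str(pid),
        "--images-dir", str(session_dir / "images"),
        "--shell-job",
    ]


def criu_restore_cmd(session_dir: Path) -> List[str]:
    """CRIU restore from the session's images/ directory, detaching after restore."""
    return [
        "criu", "restore",
        "--images-dir", str(session_dir / "images"),
        "--shell-job",
        "--restore-detached",
    ]


def snapshot_plan(pid: int, session_dir: Path, archive: Path) -> List[List[str]]:
    """Ordered commands for one snapshot (hypothetical ordering).

    The dump runs first: by default `criu dump` stops the dumped tasks,
    so the upper layer no longer changes while it is archived.
    Only upper/ is archived, which avoids copying the full filesystem.
    """
    return [
        criu_dump_cmd(pid, session_dir),
        ["tar", "-C", str(session_dir), "-cf", str(archive), "upper"],
    ]
```

Each command list could then be passed to `subprocess.run(cmd, check=True)` by a privileged runner. A real implementation would also need to preserve OverlayFS metadata (for example, extended attributes and whiteout entries) when archiving the upper layer.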