Xbestpp ✦ Simple

Apply with:

    [[xbestpp::hot(iterations=1000000)]]
    void compute() ...

Then run:

    xbestpp

    [profiling]
    events = ["cycles", "cache-misses", "instructions"]
    duration = 10  # seconds

    [optimization]
    max_unroll = 8
    allow_fp_contract = true
    gpu_grid_size = [256, 1, 1]

    xbestpp

    Function          Baseline (ms)   Optimized (ms)   Speedup
    matrix_multiply   342.12          189.44           1.81x

5.1 Targeted tuning via annotation

Add to your C++ code:

    xbestpp
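As a rough illustration of where such an annotation might sit, the sketch below attaches the attribute spelling shown in this document to a naive matrix product. The function body, its signature, and the choice of `matrix_multiply` as the annotated function are assumptions made for illustration. What the attribute does is up to xbestpp and is not modeled here. A compiler that does not recognize the `xbestpp` attribute namespace ignores it, usually with a warning, which the pragma below silences on GCC and Clang.

```cpp
#include <cstddef>
#include <vector>

// Unknown scoped attributes are ignored by standard compilers; this only
// suppresses the accompanying warning on GCC-compatible toolchains.
#if defined(__GNUC__)
#pragma GCC diagnostic ignored "-Wattributes"
#endif

// Assumption: attribute spelling copied from this document. The function
// below is a hypothetical example, not code shipped with xbestpp.
[[xbestpp::hot(iterations=1000000)]]
void matrix_multiply(const std::vector<double>& a,
                     const std::vector<double>& b,
                     std::vector<double>& c,
                     std::size_t n) {
    // Naive row-major n x n product: c = a * b.
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

Because the attribute is ignored by a plain compiler, the annotated source still builds and behaves identically without the tool, so the annotation can stay in the code base permanently.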