Volume Shader BM Methodology

Understand how Volume Shader BM measures GPU performance: rendering pipeline, presets, sampling duration, statistics, and data quality controls.

Test Scene & Rendering Pipeline

Volume Shader BM renders a Mandelbulb fractal using WebGL fragment shaders and ray marching. Each pixel traces a ray through 3D space, evaluates the distance estimator, and shades the final hit point. The camera orbits continuously to keep the workload consistent across devices and runs.
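
To make the per-pixel work concrete, here is a minimal CPU-side sketch of ray marching against a Mandelbulb distance estimator. It is not the benchmark's shader: the power-8 formula is the commonly used one, and the function names, iteration count, step limit, and thresholds are all illustrative assumptions.

```python
import math

def mandelbulb_de(p, power=8.0, iterations=8, bailout=2.0):
    """Approximate distance from point p = (x, y, z) to a Mandelbulb.

    Assumed parameters: power, iterations, and bailout are illustrative,
    not the benchmark's actual preset values.
    """
    zx, zy, zz = p
    dr = 1.0  # running derivative used by the distance estimate
    r = 0.0
    for _ in range(iterations):
        r = math.sqrt(zx * zx + zy * zy + zz * zz)
        if r > bailout:
            break
        # Convert to spherical coordinates (guarding against r == 0).
        theta = math.acos(max(-1.0, min(1.0, zz / r))) if r > 0.0 else 0.0
        phi = math.atan2(zy, zx)
        dr = (r ** (power - 1.0)) * power * dr + 1.0
        # Raise the point to the given power, then add the original point.
        zr = r ** power
        theta *= power
        phi *= power
        zx = zr * math.sin(theta) * math.cos(phi) + p[0]
        zy = zr * math.sin(theta) * math.sin(phi) + p[1]
        zz = zr * math.cos(theta) + p[2]
    if r < 1e-12:
        return 0.0  # treat the degenerate case as being inside the set
    return 0.5 * math.log(r) * r / dr

def ray_march(origin, direction, max_steps=256, epsilon=1e-3, max_dist=10.0):
    """Step along the ray by the estimated distance until a hit or a miss.

    Returns the hit distance t, or None if the ray leaves the scene.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = mandelbulb_de(p)
        if d < epsilon:
            return t  # close enough to the surface: shade this point
        t += d
        if t > max_dist:
            break
    return None

hit = ray_march((0.0, 0.0, -3.0), (0.0, 0.0, 1.0))   # aimed at the fractal
miss = ray_march((0.0, 0.0, -3.0), (0.0, 1.0, 0.0))  # aimed away from it
```

In a fragment shader the same loop runs once per pixel, which is why iteration count and step size dominate the cost of each frame.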

The benchmark focuses on GPU shader throughput, math instruction efficiency, and fill-rate limits. CPU work is minimal and primarily coordinates frame submission.

Presets & Resolution Scaling

Presets (Ultra Low → Very High) adjust kernel iterations, step size, and resolution scale to target different GPU tiers. Mobile and desktop devices use preset values calibrated for their expected performance ranges to keep comparisons fair.

Resolution scale multiplies the base render size (1024×1024). Higher values increase detail but raise GPU load non-linearly. Kernel iterations and step size balance fractal precision against frame rate.
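
As a rough illustration of why load grows non-linearly, the sketch below assumes the scale factor is applied to each side of the 1024×1024 base. That interpretation is an assumption, as is the helper name `render_pixels`. Under it, the number of shaded pixels grows with the square of the scale.

```python
BASE_SIZE = 1024  # base render size per side, as stated in the text

def render_pixels(scale: float) -> int:
    """Pixels shaded per frame, assuming scale applies to each axis."""
    side = round(BASE_SIZE * scale)
    return side * side

half = render_pixels(0.5)    # 512 x 512
base = render_pixels(1.0)    # 1024 x 1024
double = render_pixels(2.0)  # 2048 x 2048, four times the base pixel count
```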

Sampling Duration & Statistics

A single run captures frame time samples over a short window to calculate FPS, min/max FPS, and stability. Leaderboard entries aggregate multiple submissions per GPU, reporting median (p50) FPS by default and tracking sample count (N).
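
The following sketch shows one way these statistics could be derived from raw frame times. The function names and dictionary keys are hypothetical, and because the text does not define "stability", the coefficient of variation of frame time is used here purely as a stand-in.

```python
import statistics

def summarize_run(frame_times_ms):
    """Summarize one run from its per-frame durations in milliseconds."""
    if not frame_times_ms or min(frame_times_ms) <= 0:
        raise ValueError("need at least one positive frame time")
    n = len(frame_times_ms)
    total_ms = sum(frame_times_ms)
    mean_ms = total_ms / n
    return {
        "fps": 1000.0 * n / total_ms,              # average over the window
        "min_fps": 1000.0 / max(frame_times_ms),   # slowest frame
        "max_fps": 1000.0 / min(frame_times_ms),   # fastest frame
        # Assumed stability proxy: lower means steadier frame pacing.
        "frame_time_cv": statistics.pstdev(frame_times_ms) / mean_ms,
    }

def leaderboard_entry(fps_values):
    """Aggregate several submissions for one GPU: median FPS and count N."""
    return {"p50_fps": statistics.median(fps_values), "n": len(fps_values)}

run = summarize_run([10.0, 20.0, 10.0])
entry = leaderboard_entry([60.0, 30.0, 90.0, 45.0])
```

Using the median rather than the mean keeps a single unusually fast or slow submission from shifting the leaderboard value much.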

To keep extreme outliers from skewing the aggregate, invalid or unstable submissions are removed and duplicate submissions from the same session are excluded.

WebGL2 vs WebGPU

The benchmark defaults to WebGL2 for broad compatibility. When WebGPU is available, it can be used instead to exercise newer GPU capabilities, with the workload kept comparable across both APIs. Results are labeled by API so you can compare like-for-like.

Factors That Impact Scores

  • Browser and driver versions (shader compiler differences).
  • Thermal throttling and power mode (especially on laptops and phones).
  • Background workloads (other tabs, apps, or system tasks).
  • Resolution scale and preset adjustments.

Data Quality Controls

Submissions are grouped by device type and preset. Duplicate sessions and obviously invalid samples are excluded. Each leaderboard entry surfaces sample count, last update time, and the statistic used for ranking.
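
A minimal sketch of this grouping step might look like the following. The record fields (`session_id`, `device_type`, `preset`, `fps`) are hypothetical, and "obviously invalid" is interpreted here as a missing, non-finite, or non-positive FPS value, which is an assumption rather than the project's actual rule.

```python
import math
from collections import defaultdict

def group_submissions(submissions):
    """Group valid FPS samples by (device_type, preset), one per session."""
    seen_sessions = set()
    groups = defaultdict(list)
    for sub in submissions:
        fps = sub.get("fps")
        # Assumed validity rule: FPS must be a finite, positive number.
        if not isinstance(fps, (int, float)) or not math.isfinite(fps) or fps <= 0:
            continue
        # Keep only the first valid sample from each session.
        if sub["session_id"] in seen_sessions:
            continue
        seen_sessions.add(sub["session_id"])
        groups[(sub["device_type"], sub["preset"])].append(float(fps))
    return dict(groups)

samples = [
    {"session_id": "a", "device_type": "desktop", "preset": "High", "fps": 60.0},
    {"session_id": "a", "device_type": "desktop", "preset": "High", "fps": 61.0},
    {"session_id": "b", "device_type": "desktop", "preset": "High", "fps": -1.0},
    {"session_id": "c", "device_type": "mobile", "preset": "Low", "fps": 30.0},
]
grouped = group_submissions(samples)
```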

Data source labels are intentionally conservative. Live data means the number comes from submitted benchmark samples. Curated baseline means a static reference value is used while live coverage is missing. Estimated range means the p10/p90 values are derived from a baseline and should not be read as a measured distribution.

Confidence depends on the number of valid submissions: high confidence requires at least 10, medium confidence requires 3 to 9, and low confidence means fewer than 3. Low-confidence entries remain useful as early signals but should not be treated as stable rankings.
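
The thresholds above map directly to a small lookup. The function name and the returned label strings are illustrative, not taken from the project's code.

```python
def confidence_level(valid_count: int) -> str:
    """Map a count of valid submissions to a confidence tier."""
    if valid_count >= 10:
        return "high"
    if valid_count >= 3:
        return "medium"
    return "low"

tiers = [confidence_level(n) for n in (0, 2, 3, 9, 10, 250)]
```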

Dataset Downloads

Download the latest leaderboard samples and curated device performance data in CSV or JSON format: