Independently Verified on Kaggle

25.5 DAYS to 2.7 MIN

13,848x faster. Both EXACT.

Run on Kaggle pip install simgen-vla

13,848x

Speedup

104M

Elements Exact

Precision Loss

Free

pip install

Execution Time Comparison

CPU Decimal takes DAYS. VLA takes SECONDS. Both give EXACT results.

Matrix	Elements	CPU Decimal	VLA GPU	Speedup
512x512	262K	4.6 min	0.05s	6,071x
1024x1024	1M	36.7 min	0.2s	12,922x
2048x2048	4M	4.9 hrs	1.3s	13,814x
4096x4096	17M	1.6 DAYS	10.1s	13,934x
6144x6144	38M	5.5 DAYS	34.3s	13,885x
8192x8192	67M	13.1 DAYS	1.4 min	13,843x
10240x10240	105M	25.5 DAYS	2.7 min	13,848x

Benchmarked on Tesla T4 (Kaggle), February 2026

Cross-GPU Reproducibility

Same checksum on completely different architectures. This is unprecedented.

10240x10240 Matrix Multiply Checksum

RTX 4070 (Ada Lovelace, sm_89)

6ece6956f187064f

Tesla T4 (Turing, sm_75)

6ece6956f187064f

BIT-IDENTICAL

Different GPU architectures, different memory layouts, same exact result

100%

Reproducible

GPU architectures verified

Bit differences

VLA Beats Everything

Test: 1e20 + 10,000 ones - 1e20. Expected result: 10,000. Only VLA gets it right.

FP32

8,750

Lost 1,250

FP64

7,500

Lost 2,500

80-bit Extended

9,984

Lost 16

VLA

10,000

EXACT

VLA beats Intel 80-bit extended precision hardware - on a consumer GPU.

FP64 Loses Tens of Thousands of Values

VLA recovers ALL of them.

Test	Expected	FP64 Lost
1e20 + 10K - 1e20	9,998	1,262 (12.6%)
1e20 + 100K - 1e20	99,998	25,022 (25.0%)
1e20 + 500K - 1e20	499,998	24,862 (5.0%)
1e20 + 1M - 1e20	999,998	33,342 (3.3%)

Real-World Impact

Financial Transactions

$881,143,573.77

1 million transactions summed

FP64 error: $0.0000001VLA error: $0.00

Patriot Missile Tracking

100 Hours

0.1s increments accumulated

FP64 miss: 0.0000002mVLA miss: 0.0m

Lorenz Chaos System

50,000 Steps

Chaotic trajectory integration

FP64: Standard trajectoryVLA: 0.0 divergence

Orbit Propagation

10 Orbits

ISS-altitude satellite tracking

FP64 drift: 4.29 kmVLA drift: 4.67 km

Try It Yourself

All benchmarks are reproducible. Run them on Kaggle or install locally.

$ pip install simgen-vla

Run on Kaggle Install VLA

Learn about VLA See use cases View pricing