AWS Award: Finalist, 2021 AWS Startup Architecture of the Year Competition

Updated: Mar 3, 2023

EMR + Windjammer Spark Accelerator

Industry-leading SQL query throughput per vCPU

= Transparent Acceleration and Fewer Worker Nodes

Why EMR Spark Acceleration?

EMR Spark is used at massive scale
Spark’s JVM-based execution is very CPU-intensive, causing server sprawl, performance instability, and management challenges
Spark does not fully exploit the high bandwidth of today’s cloud storage systems, resulting in long query run times
Standard Spark fault tolerance requires persisting data at shuffle boundaries, motivating complex and expensive shuffle services

Windjammer EMR Spark Accelerator

More efficient use of expensive CPU resources: Native execution and an MPP (massively parallel processing) clustered dataflow architecture eliminate JVM bottlenecks
Fully exploits S3 cloud storage bandwidth: Aggressive parallel, asynchronous prefetch of analytics data sets
Eliminates the need for a complex shuffle service: Checkpoint-based fault tolerance uses reliable, high-bandwidth S3 cloud storage, so no special shuffle service is required, while queries remain fully fault tolerant, including across spot instance and cluster interruptions
Transparent, 100% compatible
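
To illustrate the general idea behind parallel, asynchronous prefetch, here is a minimal Python sketch. It is not Windjammer's implementation, which is not described here. The names prefetching_reader, fetch, and depth are hypothetical, and fetch stands in for whatever call reads one object from storage such as S3. The sketch keeps a fixed number of reads in flight so the consumer rarely waits on storage latency.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator

_DONE = object()  # sentinel marking the end of the key stream


def prefetching_reader(
    keys: Iterable[str],
    fetch: Callable[[str], bytes],  # hypothetical: reads one object from storage
    depth: int = 4,                 # hypothetical: number of reads kept in flight
) -> Iterator[bytes]:
    """Yield fetch(key) for each key, in order, with up to `depth` reads in flight."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    key_iter = iter(keys)
    pending = deque()
    with ThreadPoolExecutor(max_workers=depth) as pool:
        # Prime the window with the first `depth` requests.
        for key in key_iter:
            pending.append(pool.submit(fetch, key))
            if len(pending) >= depth:
                break
        while pending:
            # Wait on the oldest request so results keep the input order.
            data = pending.popleft().result()
            # Refill the window before handing data to the consumer,
            # so the next reads overlap with the consumer's processing.
            next_key = next(key_iter, _DONE)
            if next_key is not _DONE:
                pending.append(pool.submit(fetch, next_key))
            yield data
```

A caller would pass the list of object keys for a data set and a function that downloads one key. While the query engine processes one chunk, the next few chunks are already being read in the background.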