How Data 360 Optimized Kubernetes Scheduling Architecture, Delivering 13% Cost Savings
Padma Aradhyula’s team at Salesforce Data 360 tackled inefficient Kubernetes scheduling that caused node fragmentation and drove up costs for the millions of Spark jobs the platform runs daily. They replaced the scheduler’s default LeastAllocated scoring strategy with a proactive MostAllocated approach that packs executor pods densely, reducing wasted idle capacity and minimizing disruptive node evictions. This optimization improved resource utilization by 15%, cut compute costs by 13%, and halved node disruptions, significantly improving reliability. Salesforce teams managing large-scale Spark workloads can adopt similar custom scheduling logic to boost efficiency and stability in their Kubernetes environments.
- Replace the kube-scheduler's default LeastAllocated scoring strategy with MostAllocated for Spark workloads (see the configuration sketch after this list).
- Proactively pack executor pods on fewer nodes to reduce fragmentation and idle capacity.
- Avoid reactive autoscaler-driven consolidation to prevent costly executor evictions.
- Monitor workload stability when increasing node utilization to ensure job SLA compliance.
- Embed workload-aware placement logic to co-locate pods belonging to the same Spark job.
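In upstream Kubernetes, the switch from spreading to bin-packing is expressed through the NodeResourcesFit plugin's scoring strategy. Below is a minimal KubeSchedulerConfiguration sketch of that change; the profile name spark-bin-packing and the resource weights are illustrative assumptions, not Data 360's actual configuration.

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  # Hypothetical profile name; Spark pods opt in via spec.schedulerName.
  - schedulerName: spark-bin-packing
    pluginConfig:
      - name: NodeResourcesFit
        args:
          scoringStrategy:
            # The default is LeastAllocated, which spreads pods across
            # nodes; MostAllocated scores the fullest feasible nodes
            # highest, packing pods densely and reducing fragmentation.
            type: MostAllocated
            resources:
              - name: cpu
                weight: 1
              - name: memory
                weight: 1
```

Spark executor pods would then opt into this behavior by setting spec.schedulerName to the profile's name in their pod spec or pod template.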
By Padma Aradhyula, Dongwei Feng, Siddharth Sharma, and Anuja Gore. In our Engineering Energizers Q&A series, we highlight the engineering minds driving innovation across Salesforce. Today, we spotlight Padma Aradhyula, Senior Director of Software Engineering on the Data 360 Compute Fabric team, who manages a large-scale platform orchestrating four million Spark applications daily, nearly two million of them on Kubernetes. Explore how Padma’s team optimized infrastructure cost at global scale by evolving Kubernetes scheduler behavior to eliminate node fragmentation under bursty Spark workloads. The team redesigned placement logic to proactively consolidate executor pods onto fewer nodes, embedding efficiency directly into the scheduling layer and resolving the reliability tension created by reactive, autoscaler-driven node churn.
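One concrete way to express the workload-aware co-location described above: Spark on Kubernetes labels every pod of an application with spark-app-selector, so a preferred pod affinity in the executor pod template nudges the scheduler to place a job's executors together. The sketch below is illustrative rather than the team's actual placement logic; the application ID shown is an example value, since the real ID is known only at submission time and the template would be rendered per job.

```yaml
# Executor pod template fragment (e.g., supplied via
# spark.kubernetes.executor.podTemplateFile).
apiVersion: v1
kind: Pod
spec:
  affinity:
    podAffinity:
      # Soft preference: pack this executor next to pods of the same
      # Spark application, but fall back to any feasible node if no
      # co-located node has room, so jobs are never blocked.
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                # spark-app-selector is the label Spark sets on every
                # pod of an application; example value shown here.
                spark-app-selector: spark-application-1700000000000
            topologyKey: kubernetes.io/hostname
```

Using a preferred (rather than required) affinity term keeps co-location a best-effort hint, which complements MostAllocated packing without risking unschedulable executors during bursts.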