Power optimization techniques in a VLSI flow often become performance bottlenecks, leading to large turnaround times, for the following reasons:

* Scalability: Designs typically span millions of gates with different operating conditions, leading to a large search space.
* Portability: Constraints vary across technology nodes, hindering the reuse of solutions.

ML models are well suited to these challenges, as they are trained on large datasets and can navigate complex search spaces. The contributions of our work are as follows.

* We propose a novel learning-based (Support Vector Machine) classifier that provides a good initial design configuration and guarantees a leakage-optimal solution; a sketch of the idea follows this list.
* We use a Lazy Timing Analysis procedure that postpones the timing-validation step for as long as possible; see the second sketch below.
* We show the efficiency of our technique on large-scale benchmarks (25K to 1 million gates). Our technique achieves 23% better solution quality and 50% better runtime than the state-of-the-art technique.
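To make the first contribution concrete, the following is a minimal sketch of an SVM classifier predicting an initial per-gate configuration. The features (slack, fanout, output load), the two-class high-Vt/low-Vt labels, and the toy data are illustrative assumptions, not the paper's actual feature set or training procedure.

```python
# Hypothetical sketch: an SVM that predicts an initial threshold-voltage
# (Vt) assignment per gate from simple timing/loading features.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Per-gate features (illustrative): [timing slack (ns), fanout, output load (fF)]
X_train = np.array([
    [0.45, 2, 10.0],   # large slack, light load -> high-Vt (low leakage)
    [0.02, 8, 60.0],   # near-critical gate      -> low-Vt (fast)
    [0.30, 4, 25.0],
    [0.05, 6, 50.0],
])
# Labels (illustrative): 0 = high-Vt (leakage-optimal), 1 = low-Vt (timing-critical)
y_train = np.array([0, 1, 0, 1])

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)

# Predict an initial configuration for unseen gates
X_new = np.array([[0.40, 3, 15.0], [0.01, 9, 70.0]])
print(clf.predict(X_new))  # e.g. [0 1]
```

Starting the optimizer from such a predicted configuration, rather than from an all-low-Vt netlist, is what shrinks the search the downstream flow must perform.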
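The second contribution can be sketched similarly. In the hypothetical loop below, cell swaps are applied optimistically and the expensive static timing analysis (STA) call is deferred until the accumulated *estimated* slack degradation crosses a budget; all names (`Gate`, `run_sta`, `slack_budget`) are assumptions for illustration, not the paper's API.

```python
# Hypothetical sketch of lazy timing validation: batch swaps, defer STA.
class Gate:
    def __init__(self, name):
        self.name = name
        self.high_vt = False          # high-Vt = lower leakage, slower

    def apply_swap(self):
        self.high_vt = True

    def revert_swap(self):
        self.high_vt = False

def lazy_optimize(gates, candidates, run_sta, slack_budget=0.05):
    """candidates: list of (gate, estimated slack degradation in ns)."""
    pending, batch = 0.0, []
    for gate, est_delta in candidates:
        gate.apply_swap()             # optimistic: no timing check yet
        pending += est_delta
        batch.append(gate)
        if pending >= slack_budget:   # only now pay for a timing check
            if run_sta(gates) < 0.0:  # worst slack negative: roll back batch
                for g in batch:
                    g.revert_swap()
            pending, batch = 0.0, []
    if batch and run_sta(gates) < 0.0:  # validate any trailing batch
        for g in batch:
            g.revert_swap()
    return gates

# Usage with a stub timer that always reports positive worst slack:
gates = [Gate(f"g{i}") for i in range(4)]
lazy_optimize(gates, [(g, 0.02) for g in gates], run_sta=lambda gs: 0.1)
```

The point of the deferral is to amortize one STA run over many swaps; a per-swap validation would dominate runtime at the million-gate scale cited above.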