We're approaching the physical limits of transistor and voltage scaling.
In the past, moving to a more advanced node/fabrication process improved performance basically for free: you could run faster and pack in more transistors while increasing the power budget only slightly. Microarchitectures were also advancing fast (remember Maxwell's efficiency jump) - those were low-hanging fruit.
Nowadays, architectures advance much more slowly, new nodes deliver far smaller power-efficiency gains, and if you want a decent generational uplift you have no choice but to raise the power budget significantly.
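A rough sketch of why this is: dynamic switching power scales roughly as P = C·V²·f. Under classic Dennard scaling, each node shrink cut capacitance and voltage together, so you could clock higher and double the transistor count at roughly the same chip power. The numbers below are purely illustrative (not measured from any real process), just to show the arithmetic:

```python
def dynamic_power(c_farads, v_volts, f_hz):
    """Approximate dynamic switching power in watts: P = C * V^2 * f."""
    return c_farads * v_volts**2 * f_hz

# Hypothetical baseline node: 1e6 transistors at 1.0 V, 2 GHz.
old_chip = 1_000_000 * dynamic_power(1e-15, 1.0, 2e9)

# Dennard era: a shrink scales dimensions ~0.7x, so C and V both drop
# ~0.7x, clocks rise ~1.43x, and 2x the transistors fit in the same area.
# Total chip power stays roughly flat despite more, faster transistors.
dennard_chip = 2_000_000 * dynamic_power(0.7e-15, 0.7, 2.86e9)

# Today: C still shrinks, but V is stuck near its floor (barely drops).
# The same transistor doubling and clock bump now costs real power.
modern_chip = 2_000_000 * dynamic_power(0.7e-15, 0.95, 2.86e9)

print(f"baseline: {old_chip:.2f} W")
print(f"Dennard shrink: {dennard_chip:.2f} W")
print(f"shrink without voltage scaling: {modern_chip:.2f} W")
```

With voltage no longer scaling, the same generational jump that used to be power-neutral now lands well above the old budget, which is exactly why flagship parts keep creeping up in TDP.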
This has been discussed in depth in many publications over the past decade. I'm surprised people are asking this question.