Only because Nvidia stopped people from using Nvidia cards as a dedicated PhysX card alongside an AMD main card. Which is the definition of anticompetitive.
I personally don't mind. If one side has the better product, I'll pay for it (within limits), and for now PhysX is well within said limits.
You're comparing architecture advantages with extensions. I don't think you actually understand what you're talking about. Architecture is standardized to a point where it can be called x86 or not. Extensions are exact and strict stuff, you can't bend things around. You can just tailor architecture to be more efficient with given extensions, but you can't make them different and still have them the same.
The ISA is standard; the architecture is not. In terms of just x86, and just Intel, we currently have four completely independent, fully x86 ISA-compliant (extensions included) architectures with staggeringly large differences in performance: Silvermont/Goldmont (Atom chips), Broadwell (Broadwell-EX/-E/-D), Skylake (Skylake-H/-S/-U/-Y) and Knights Landing (Xeon Phi, loosely Silvermont-based). All of them will run code up to at least SSE4, and all of them will run the exact same code with vastly different performance results.
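To make the ISA/architecture split a bit more concrete, here is a minimal sketch of how software typically deals with it: check which extensions the chip reports and pick a code path accordingly. Everything in it is illustrative. `pick_code_path` is a hypothetical helper, the flag names follow the spelling Linux uses in /proc/cpuinfo, and the three feature sets are my assumptions about what chips of each class report, not dumps from real hardware.

```python
# Hypothetical sketch: choose a code path from the ISA extensions a CPU reports.
# Flag names follow the Linux /proc/cpuinfo spelling; the feature sets below
# are illustrative assumptions, not data read from real machines.

def pick_code_path(cpu_flags):
    """Return the widest SIMD path that the given set of ISA flags allows."""
    # Ordered from widest to narrowest vector extension.
    for flag, path in (("avx512f", "avx512"), ("avx2", "avx2"), ("sse4_2", "sse4")):
        if flag in cpu_flags:
            return path
    return "scalar"

# Assumed feature sets for three very different x86 designs.
atom_like = {"sse2", "sse4_1", "sse4_2"}
skylake_like = {"sse2", "sse4_1", "sse4_2", "avx", "avx2"}
knl_like = {"sse2", "sse4_1", "sse4_2", "avx", "avx2", "avx512f"}

for name, flags in (("atom_like", atom_like),
                    ("skylake_like", skylake_like),
                    ("knl_like", knl_like)):
    print(name, "->", pick_code_path(flags))
```

Note that all three sets include SSE4, so an SSE4 binary runs unchanged on every one of them. How fast it runs is down to the core design, which is the whole point.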
Probably the most relevant comparison to illustrate this is Broadwell-E vs KNL. BDW-E is your usual general-purpose chip, and at the full ~650mm² it has only 24 cores, while KNL has 72 cores in the same ~650mm² die size on the same 14nm node. In performance terms, the BDW chip destroys the KNL chip for typical CPU loads (databases, general workstation, web servers, games, etc.), but the moment you point optimized, highly threaded AVX code at them, the KNL chip will simply be far faster (GFLOPS vs TFLOPS) than the poor Broadwell chip. In conclusion, they are very different core designs, targeted at very different applications.
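For anyone who wants to see where the GFLOPS vs TFLOPS gap comes from, theoretical peak throughput is just cores × clock × FLOPs per cycle per core. A back-of-the-envelope sketch follows. The core counts are the ones above; the clock speeds are assumed round figures, and the per-cycle numbers assume two 256-bit FMA units per Broadwell core and two 512-bit FMA units per KNL core (double precision). `peak_gflops` is a hypothetical helper, and real sustained numbers will be lower than these peaks.

```python
# Back-of-the-envelope peak double-precision throughput.
# Clock speeds are assumed round figures, not measured values.

def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS: cores x clock (GHz) x FLOPs per cycle per core."""
    return cores * clock_ghz * flops_per_cycle

# Assumed: 2 FMA units x 4 doubles (256-bit) x 2 ops per FMA = 16 FLOPs/cycle.
bdw = peak_gflops(cores=24, clock_ghz=2.2, flops_per_cycle=16)

# Assumed: 2 FMA units x 8 doubles (512-bit) x 2 ops per FMA = 32 FLOPs/cycle.
knl = peak_gflops(cores=72, clock_ghz=1.5, flops_per_cycle=32)

print(f"Broadwell-like: {bdw:.0f} GFLOPS")
print(f"KNL-like:       {knl:.0f} GFLOPS")
print(f"Ratio:          {knl / bdw:.1f}x")
```

Under these assumptions the Broadwell-style chip lands a bit under 1 TFLOPS and the KNL-style chip at roughly 3.5 TFLOPS, so the gap is about 4x on paper for this kind of code.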
If you want an ARM example, just look at the huge range of ARMv6, v7 and v8 implementations out there. Code targeted at a given version of the ARM ISA will work on any chip that implements that version, so ARMv6 code will run on any ARMv6 chip, and so on up to ARMv8.
What that means is that code targeting the base ARMv8 ISA will run on everything from a relatively massive Qualcomm Kryo or Apple Twister implementation all the way down to the tiny, super-low-power Cortex-A35. The big chips will be 3+ times faster than the little A35.
As for physics, it has already been established that GPUs are superior in that regard. It's why fluid dynamics are a reality on GPUs and just not usable on CPUs.
For games, you have the interesting proposition of also needing to run the rest of rendering on the GPU (polygons, texturing, filtering, occlusion), so it may be more effective to do the physics work on the CPU. Either way, this is not relevant to the point that the AMD/nV perf difference lies pretty much entirely in the different architectures. I only brought up that particular example because I knew that the first objection to using AVX as an example would be that no games use it. Path uses AVX, and uses it to great effect. How wise/optimal that decision is, I don't really give a shit, since my framerates have quadrupled since the last major patch.