Personally, I would rather pay $15 than have ads or my personal information shared. Or use nothing.
Pay 15 bucks and your anonymized data will still be shared
The reason more and more vendors will push for cloud analysis is that statistical (big) data is more effective at combating malware than the old way of brute-force analyzing each sample individually. These days, they are literally just shoving files into their systems (deep learners, if you want) and the systems grind through the files, sorting them based on several criteria. You basically end up with two groups of files, one malicious and one clean. When a new sample is introduced, the system measures its distance to each group based on several factors. The more data you feed the system, the more accurate it becomes.
Imagine red being malware and green being clean files, with each circle designating a group of files with similar characteristics; the size of the circle reflects the size of the group. When you insert a file with unknown reputation, the system tries to match its characteristics against everything it already has in the massive database. As the system crunches through, the file's reputation position moves across the map while the system matches how the file looks and behaves. The distance between the file and each sample group, and the distance between the distinct clean and malicious groups, determines how to treat the file. The operator (the AV company) can simply fiddle with the parameters that define those distances and adjust the system to be even more accurate while being less prone to false positives. If we'd had tech like this years ago, we would have eradicated malware entirely.
The dark orange and light orange areas are the "no man's land" zone, where the system has the greatest difficulty declaring what the file is, and that's where most adjustment is needed. Some samples even get forwarded to a human analyst, either because the system is indecisive or because the file has a certain characteristic that requires human review.
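To make the distance idea a bit more concrete, here is a tiny Python sketch. It is only an illustration under my own assumptions, not how any actual AV vendor does it: the two centroids, the 2-D feature space and the `margin` parameter are all made up, and real systems work with far more dimensions and many groups.

```python
import math

# Hypothetical centers of the "clean" and "malicious" groups in a made-up
# 2-D feature space (assumed values, for illustration only).
CLEAN_CENTROID = (0.2, 0.3)
MALICIOUS_CENTROID = (0.8, 0.7)


def classify(sample, margin=0.15):
    """Label a sample by its distance to the two group centers.

    `margin` is the hypothetical knob an operator could tune: if the two
    distances differ by less than `margin`, the sample falls into the
    "no man's land" zone and is flagged for further (human) analysis.
    """
    d_clean = math.dist(sample, CLEAN_CENTROID)
    d_malicious = math.dist(sample, MALICIOUS_CENTROID)

    if abs(d_clean - d_malicious) < margin:
        return "uncertain"
    return "clean" if d_clean < d_malicious else "malicious"


if __name__ == "__main__":
    for point in [(0.2, 0.3), (0.8, 0.7), (0.5, 0.5)]:
        print(point, classify(point))
```

Widening `margin` sends more files to an analyst and lowers the risk of a wrong automatic verdict; narrowing it does the opposite. That is roughly the kind of trade-off the operator is adjusting.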
That's a very dumbed-down version of how anti-malware deep learning works. Somewhere deep in that system, some of your personal data resides as well. The thing is, this whole thing is just WAY too massive for anyone to fiddle with the data of individuals on a personal level. And the system generally operates in such a way that data gets anonymized client side as much as possible, so companies really only get aggregated statistical data.
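As a rough idea of what "anonymized client side" could look like, here is a minimal Python sketch. Everything in it is an assumption on my part (the field names, the allow-list, the report shape); it is not any vendor's actual telemetry format. The point is only that the client can send a file hash and a few coarse features while leaving paths and usernames on the machine.

```python
import hashlib

# Hypothetical allow-list of coarse, non-identifying features to upload.
ALLOWED_FIELDS = ("size", "is_packed", "import_count")


def build_report(local_record, file_bytes):
    """Build the report sent to the cloud from a local scan record.

    Only allow-listed fields are copied, so things like the file path or
    the logged-in username never leave the client. The file itself is
    represented by its SHA-256 hash.
    """
    report = {k: local_record[k] for k in ALLOWED_FIELDS if k in local_record}
    report["sha256"] = hashlib.sha256(file_bytes).hexdigest()
    return report


if __name__ == "__main__":
    record = {
        "path": "C:/Users/alice/Downloads/setup.exe",  # stays local
        "username": "alice",                           # stays local
        "size": 3,
        "is_packed": True,
        "import_count": 12,
    }
    print(build_report(record, b"abc"))
```

Note that a hash of a well-known file still identifies that file; what it avoids is tying the report to a particular person's folders or account name.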