1. We take a lot of care to make sure the AI recommendations are safe and meet a high quality bar (regular monitoring, code provenance tracking, adversarial testing, and more).
2. We also run regular A/B tests and randomized controlled trials to ensure these features are improving SWE productivity and throughput.
3. We see similar efficiencies across all programming languages and frameworks used internally at Google, and engineers across all tenure and experience cohorts show similar gains in productivity.
You can read more on our approach here:
https://research.google/blog/ai-in-software-engineering-at-g...
Will AI be able to detect bugs and backdoors that require multiple pieces of code working together, rather than sitting in a single piece of code? Humans have a hard time with this.
- Hypothetical example: an authentication bug in sshd that requires a flaw in systemd, which in turn requires a flaw in udev, nss, PAM, or some other underlying library ... yet looking at each individual library or daemon there is no bug that a professional penetration-testing organization such as NCC Group or Google's Project Zero would find (a toy sketch of what such a cross-component bug can look like follows below). In other words, will AI soon be able to find more complex bugs in a year than Tavis has found in his career? Will AIs start to compete with one another, find all the state-sponsored complex bugs, and ultimately build a map that suggests a common set of developers who may need to be notified? Will there be a table that logs where AI found things that professional human penetration testers could not?
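To make the failure mode concrete, here is a minimal toy sketch in C (not real sshd/PAM code; the component names and contracts are made up). Each piece looks fine when reviewed on its own, because each one respects its own buffer; the overflow only exists in their composition:

    /* Toy sketch only: two made-up "components" that are each safe in
       isolation but unsafe when composed. */
    #include <stdio.h>
    #include <string.h>

    /* Component A: a parser whose own buffer is 64 bytes. It never
       overflows its own storage, so a reviewer of this file sees no bug. */
    static char parsed_user[64];

    const char *parse_user(const char *input) {
        strncpy(parsed_user, input, sizeof(parsed_user) - 1);  /* bounded copy */
        parsed_user[sizeof(parsed_user) - 1] = '\0';
        return parsed_user;
    }

    /* Component B: an auth helper written against an older contract that
       usernames never exceed 31 bytes. Reviewed alone, the fixed buffer
       looks adequate. */
    int check_auth(const char *user) {
        char local[32];
        strcpy(local, user);   /* overflows only when fed A's 64-byte output */
        return strcmp(local, "root") != 0;
    }

    int main(void) {
        /* The flaw exists only in the composition of A and B. */
        const char *u = parse_user("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
        printf("auth=%d\n", check_auth(u));
        return 0;
    }

Reviewing or fuzzing parse_user and check_auth separately finds nothing; only an analysis that tracks the "usernames are short" assumption across the component boundary does.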
Adversaries are already detecting these kinds of issues, though, using proven means such as code review and fuzzing.
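For instance, a minimal libFuzzer-style harness over the composed path (assuming the toy parse_user/check_auth functions from the sketch above, built without their main()) is enough for ASan to flag that overflow; the hard part is knowing which composition to target:

    /* Minimal libFuzzer harness sketch for the toy components above.
       Build (assumption): clang -g -fsanitize=fuzzer,address harness.c toy.c */
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    extern const char *parse_user(const char *input);
    extern int check_auth(const char *user);

    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        char buf[256];
        if (size >= sizeof(buf))
            size = sizeof(buf) - 1;
        memcpy(buf, data, size);       /* NUL-terminate the raw fuzz input */
        buf[size] = '\0';
        check_auth(parse_user(buf));   /* exercise the cross-component path */
        return 0;
    }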
Google Project Zero is a team of rock-star hackers. I don't see LLMs replacing even junior devs right now.