LLMs: An (im)perfectly human approach to secure coding?
A version of this article appeared in Dark Reading. It has been updated and syndicated here.
From the first rumbles of hype for the latest, culture-shattering AI tools, developers and the coding-curious alike have been using them to generate code at the touch of a button. Security experts quickly pointed out that, in many cases, the code being produced was poor quality and vulnerable, and in the hands of those with little security awareness, could cause an avalanche of insecure apps and web development to hit unsuspecting consumers.
And then there are those who have enough security knowledge to use it for, well, evil. For every mind-blowing AI feat, it seems there is a counter-punch of the same technology being used for nefarious purposes. Phishing, deepfake scam videos, malware creation, general script kiddie shenanigans… these disruptive activities are achievable much faster, with lower barriers to entry.
There is certainly a lot of clickbait touting this tooling as revolutionary, or at least coming out on top when matched with “average” human skill. While it is looking inevitable that LLM-style AI technology will change the way we approach many aspects of work - not just software development - we must take a step back and consider the risks beyond the headlines.
And as a coding companion, its flaws are perhaps its most “human” attribute.
Poor coding patterns dominate its go-to solutions
With ChatGPT trained on decades of existing code and knowledge bases, it's no surprise that for all its marvel and mystery, it too suffers from the same common pitfalls people face when navigating code. Poor coding patterns are the go-to, and it still takes a security-aware driver to generate secure coding examples by asking the right questions and delivering the right prompt engineering.
Even then, there is no guarantee that the code snippets given are accurate and functional from a security perspective; the technology is prone to hallucination, even making up non-existent libraries when asked to perform some specific JSON operations, as discovered by Mike Shema. This could lead to “hallucination squatting” by threat actors, who would be all too happy to spin up some malware disguised as the fabricated library recommended with full confidence by ChatGPT.
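One way to blunt the risk of installing a fabricated library is to check every suggested dependency against a list your team has already vetted before anything is installed. The following is a minimal sketch of that idea, not a complete defense: the `VETTED_PACKAGES` allowlist, the `unvetted` helper, and the package name `json_magic_tools` are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical allowlist: in practice this would come from an internally
# reviewed source, not be hard-coded next to the check.
VETTED_PACKAGES = frozenset({"requests", "simplejson", "orjson"})


def normalize(name: str) -> str:
    """Normalize a package name the way PEP 503 describes:
    lowercase, with runs of '-', '_' and '.' collapsed to a single '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unvetted(suggested: list[str]) -> list[str]:
    """Return the suggested package names that are NOT on the allowlist."""
    allowed = {normalize(p) for p in VETTED_PACKAGES}
    return [name for name in suggested if normalize(name) not in allowed]


# Names an assistant might propose; "json_magic_tools" is made up here.
flagged = unvetted(["Requests", "json_magic_tools"])
print(flagged)
```

Anything the function flags would still need a human to confirm that the package exists, is maintained, and is the one actually intended.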
Ultimately, we have to face the reality that, in general, we have not expected developers to be sufficiently security-aware, nor have we as an industry adequately prepared them to write secure code as a default state. This will be evident in the enormous amount of training data fed into ChatGPT, and we can expect similar lackluster security results from its output, at least initially. Developers would have to be able to identify the security bugs, and either fix them themselves or design better prompts for a more robust outcome.
The first large-scale user study examining how users interact with an AI coding assistant to solve a variety of security-related functions - conducted by researchers at Stanford University - supports this notion, with one observation concluding:
“We observed that participants who had access to the AI assistant were more likely to introduce security vulnerabilities for the majority of programming tasks, yet also more likely to rate their insecure answers as secure compared to those in our control group.”
This speaks to a level of default trust in the output of AI coding tools as producing code that is always inherently secure, when in fact it is not.
Between this and the inevitable AI-borne threats that will permeate our future, now more than ever, developers must hone their security skills and raise the bar for code quality no matter its origin.
The road to a data breach disaster is paved with good intentions
It should come as no surprise that AI coding companions are popular, especially as developers are faced with increasing responsibility, tighter deadlines, and the ambitions of a company’s innovation resting on their shoulders. However, even with the best intentions, a lack of actionable security awareness when using AI for coding will inevitably lead to glaring security problems. All developers with AI/ML tooling will generate more code, and its level of security risk will depend on their skill level. Organizations need to be acutely aware that untrained people will certainly generate code faster, but they will also accumulate technical security debt faster.
Even our preliminary test (April 2023) with ChatGPT revealed that it will make very basic mistakes that could have devastating consequences. When we asked it to build a login routine in PHP using a MySQL database, it generated functional code quickly. However, it defaulted to storing passwords in plaintext in the database, storing database connection credentials in code, and using a coding pattern that could result in SQL injection (although it did do some level of filtering on the input parameters and spit out database errors). All rookie errors by any measure.
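For contrast, here is a rough sketch of what avoiding those three mistakes might look like. It is not the code ChatGPT produced, nor its corrected version. It is written in Python rather than PHP, and it uses the standard-library `sqlite3` module as a stand-in for MySQL so that it runs without a database server. The `APP_DB_PATH` environment variable, the table layout, the function names, and the PBKDF2 iteration count are all assumptions made for illustration.

```python
import hashlib
import hmac
import os
import sqlite3

# Assumption: iteration count picked for illustration only; tune it to
# current guidance and your hardware.
PBKDF2_ITERATIONS = 200_000


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the plaintext password is never stored."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )


def connect() -> sqlite3.Connection:
    """Open the database using settings from the environment, not the source.

    APP_DB_PATH is a hypothetical variable name; sqlite3 stands in for MySQL.
    """
    conn = sqlite3.connect(os.environ.get("APP_DB_PATH", ":memory:"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "username TEXT PRIMARY KEY, salt BLOB NOT NULL, pw_hash BLOB NOT NULL)"
    )
    return conn


def register(conn: sqlite3.Connection, username: str, password: str) -> None:
    """Store a new user with a random per-user salt and a derived hash."""
    salt = os.urandom(16)
    conn.execute(
        # Placeholders keep user input out of the SQL text entirely.
        "INSERT INTO users (username, salt, pw_hash) VALUES (?, ?, ?)",
        (username, salt, hash_password(password, salt)),
    )
    conn.commit()


def login(conn: sqlite3.Connection, username: str, password: str) -> bool:
    """Return True only if the username exists and the password matches."""
    try:
        row = conn.execute(
            "SELECT salt, pw_hash FROM users WHERE username = ?", (username,)
        ).fetchone()
    except sqlite3.Error:
        # Fail closed and keep database error details away from the caller.
        return False
    if row is None:
        return False
    salt, stored_hash = row
    # Constant-time comparison of the stored and recomputed hashes.
    return hmac.compare_digest(stored_hash, hash_password(password, salt))
```

A short usage example under the same assumptions:

```python
conn = connect()
register(conn, "alice", "correct horse battery staple")
print(login(conn, "alice", "correct horse battery staple"))
print(login(conn, "' OR '1'='1", "anything"))
```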
Further prompting ensured the mistakes were amended, but it takes significant security knowledge to course-correct. Unchecked and widespread use of these tools is no better than unleashing junior developers onto your projects, and if this code is building sensitive infrastructure or processing personal data, then we’re looking at a ticking time bomb.
Of course, just as junior developers undoubtedly increase their skills over time, we expect AI/ML capabilities to improve. A year from now, it may not make such obvious and simple security mistakes. However, that will have the effect of dramatically increasing the security skill required to track down the more serious, hidden, non-trivial security errors it is still in danger of producing.
We remain ill-prepared to find and fix security vulnerabilities, and AI widens the gap
While there has been much talk of “shifting left” for many years at this point, the fact remains that, for most organizations, there is a significant lack of practical security knowledge among the development cohort, and we must work harder to provide the right-fit tools and education to help them on their way.
As it stands, we’re not prepared for the security bugs we are accustomed to encountering, not to mention new AI-borne issues like prompt injection and hallucination squatting, which represent entirely new attack vectors and are set to take off like wildfire. AI coding tools do represent the future of a developer’s coding arsenal, but the education to wield these productivity weapons safely must come now.
Chief Executive Officer, Chairman, and Co-Founder
Pieter Danhieux is a globally recognized security expert, with over 12 years experience as a security consultant and 8 years as a Principal Instructor for SANS teaching offensive techniques on how to target and assess organizations, systems and individuals for security weaknesses. In 2016, he was recognized as one of the Coolest Tech people in Australia (Business Insider), awarded Cyber Security Professional of the Year (AISA - Australian Information Security Association) and holds GSE, CISSP, GCIH, GCFA, GSEC, GPEN, GWAPT, GCIA certifications.