For many employees, ChatGPT and other AI models are just too tempting. Why spend all day writing that report when ChatGPT could do the bulk of it in seconds? Why stare at the wall for hours trying to come up with graphic design ideas when Midjourney can spit out new concepts in a flash?
Some companies might not care too much, as long as the final output is good. They may even encourage the use of AI models to boost employee efficiency. But there are serious issues that go beyond employees slacking off. When it comes to security, one of the main concerns is employees entering sensitive company data into these AI models.
How serious is the threat to company data?
3.1% of employees have pasted confidential company information into ChatGPT, according to Cyberhaven, which runs a data detection and response service. Cyberhaven analyzed 1.6 million workers whose companies use its product, and found that 8.2% of knowledge workers had tried ChatGPT at work at least once.
In one week, per 100,000 employees, Cyberhaven found:
- 199 incidents of sensitive/internal-only data leaked to ChatGPT
- 173 incidents of client data leaked to ChatGPT
- 159 incidents of source code leaked to ChatGPT
- 102 incidents of personally identifiable information (PII) leaked to ChatGPT
- 94 incidents of personal health information (PHI) leaked to ChatGPT
- 57 incidents of project planning files leaked to ChatGPT
Now, we have to take Cyberhaven's figures with a grain of salt, because it's in the business of selling a data detection platform. But even if the numbers are off by an order of magnitude, tools like ChatGPT still pose a huge risk in the workplace.
What happens when Samsung employees use ChatGPT?
Let’s use Samsung’s experience with ChatGPT as a case study. Samsung initially allowed its engineers to use ChatGPT, but they ended up feeding it valuable company information. Within a month, the company recorded three instances of sensitive information being leaked through ChatGPT.
In one instance, an employee used ChatGPT to convert confidential meeting notes into a presentation. In another, an employee submitted source code to ChatGPT.
Company secrets like meeting notes and source code should be closely guarded. If they fall into the hands of competitors, a company can lose its edge. If attackers get their hands on this type of information, they can sell it on the darknet or blackmail the company.
In response, Samsung has since restricted ChatGPT queries to just 1024 bytes, and warned employees of the dangers. The company is also developing its own internal AI to assist employees without the data risks.
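A byte cap like Samsung's is straightforward to enforce in an internal gateway or browser plugin. The following is a minimal sketch, not Samsung's actual implementation; the function name and the reuse of the reported 1024-byte figure are illustrative:

```python
MAX_PROMPT_BYTES = 1024  # illustrative cap, mirroring Samsung's reported limit


def is_prompt_allowed(prompt: str) -> bool:
    """Return True if the prompt fits within the byte budget (UTF-8 encoded)."""
    return len(prompt.encode("utf-8")) <= MAX_PROMPT_BYTES


# Example: a short prompt passes, an oversized one is rejected.
print(is_prompt_allowed("Summarize this meeting agenda."))  # True
print(is_prompt_allowed("x" * 2000))                        # False
```

Note that the check counts bytes, not characters, so multi-byte text (e.g. Korean) hits the limit sooner than ASCII, which matters if the cap is meant to bound how much data can leave in one query.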
How do OpenAI and its competitors process data?
According to OpenAI's FAQ, data submitted through ChatGPT or DALL-E "...may be used to improve our models." This presumably means that the company stores the data and uses it for training. However, OpenAI does provide a form for opting out, and you can also request that your data be deleted.
The FAQ also states that "Data submitted through the OpenAI API is not used to train OpenAI models or improve OpenAI’s service offering." So, if you are using the API, then the data will not be used to train future models. However, the data is retained for up to 30 days.
Whether or not this usage policy is a problem will depend on the type of information sent to OpenAI. Examples of dangerous scenarios include:
- An executive submitting top secret plans to ChatGPT. This could result in sensitive company data leaking.
- A doctor asking ChatGPT to write a referral letter that includes a patient’s name and medical information. This could be a breach of HIPAA regulations.
On the other hand, your company probably won't run into any data security issues if an intern asks ChatGPT "How do I run a Facebook ad campaign?" So while certain types of information carry real risks, you do not need to fear every use case.
Note that other AI models will have their own data usage policies and it's your company's responsibility to understand them and direct employees appropriately.
How can companies protect their data against AI misuse?
Your company has two major choices for navigating the minefield of company data and AI models:
Completely restricting usage
One option is to completely block the use of ChatGPT and its competitors. This is a good option for companies (or just specific employees) who deal with a lot of sensitive data. In these scenarios, the risks are high, and it's easy for someone to inadvertently leak data while using these models. Employees may not even realize that they accidentally submitted sensitive data until it’s too late.
Blocking these tools can seem extreme, because it means your company cannot take advantage of their capabilities. However, there are plenty of scenarios where it may be warranted, such as in finance and healthcare. Your company could restrict models like ChatGPT by blocking their websites, or explicitly prohibit their use through employee contracts.
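At its crudest, website blocking can be done by sinkholing the relevant domains on managed machines. A minimal hosts-file sketch (the domain list is illustrative and incomplete, and this approach does not cover mobile devices, VPNs, or new AI services as they appear; firewall- or DNS-level filtering is more robust):

```
# /etc/hosts entries that sinkhole common AI-tool domains (illustrative list)
0.0.0.0 chat.openai.com
0.0.0.0 chatgpt.com
0.0.0.0 www.midjourney.com
```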
Establishing an AI usage policy
If your company determines that banning these tools is a step too far, then setting up an AI usage policy is a good option. This allows your company to capture the efficiency gains from these tools while mitigating some of the risks.
A usage policy can't completely stop employees from inputting sensitive data, but it can give them guidance, and help to reduce the number of serious incidents. It would be best to accompany a usage policy with employee training, so that employees become confident in what they can and cannot use these tools for.
With a comprehensive AI usage policy, your company can get some of the upsides from this technology, while limiting its risks.