CAPTCHAs are the annoying challenges we sometimes see when we visit websites. These days, some of the most common ones ask you to identify every square that has a traffic light in it, every square with a bicycle, or something along those lines. Earlier CAPTCHAs were often a bunch of squiggled letters and numbers. You almost had to squint to figure them out.
CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. As the name suggests, CAPTCHAs are basically Turing tests that websites use to tell the difference between humans and bots. The theory behind them is that if we challenge a website visitor to do something that only a human can do, then anyone who passes the challenge must be human. Websites can let in those who pass the test, and block those who fail.
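To make the pass-or-block idea concrete, here is a minimal sketch in Python. It is not how any real CAPTCHA service works: the function names `make_challenge` and `admit` are made up for illustration, and the random string is only a stand-in for the distorted image or picture grid a real site would show.

```python
import hmac
import secrets
import string


def make_challenge(length: int = 6) -> str:
    """Return a random string the visitor must read back.

    Hypothetical stand-in for the text hidden in a distorted image.
    """
    alphabet = string.ascii_uppercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def admit(expected: str, response: str) -> bool:
    """Let the visitor in only if their response matches the challenge."""
    # Ignore case and surrounding whitespace, then compare in constant time.
    cleaned = response.strip().upper().encode()
    return hmac.compare_digest(expected.upper().encode(), cleaned)


challenge = make_challenge()
print(admit(challenge, challenge.lower()))  # a correct answer is let in
print(admit(challenge, ""))                 # an empty answer is blocked
```

The whole scheme rests on the assumption that only a human can produce the right response. The check itself is trivial; the hard part is finding a challenge that bots can't solve.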
Why do we want to keep bots out anyway?
Bots are just pieces of software that do automated tasks. A lot of bots are incredibly useful, such as web crawlers, or a company's customer service chatbot. When we are talking about the kinds of bots that cause problems, we generally mean malicious bots. These are scripts programmed by bad actors to cause a range of harms, which can include:
- Denial of service attacks
- Using up service providers' resources (this can increase the costs of the service for end users)
- Amplifying posts
- Creating fake views
- Spreading disinformation
- Spam advertisements
- Scams
- Directing people to malicious websites
These types of activities tend to make the internet worse, and it’s in both users' and websites’ interest to limit them. That's why it's so important for us to have techniques like CAPTCHAs that can help to limit bots.
Are we seeing the last days of CAPTCHAs?
As Turing tests, CAPTCHAs have one job: telling humans and bots apart. For a CAPTCHA to work, it needs to ask a website visitor to do something that humans can do, but bots can't. Historically, this has been fairly easy, because there was a huge difference between the abilities of humans and the most cutting-edge machines.
Over time, computers have become a lot more capable, which has meant that they are able to complete a lot of activities that were formerly reserved for humans. To combat this, we have had to make CAPTCHAs more complicated over the years, which is why you don't see a lot of the old CAPTCHAs with weird letters anymore. Instead, you come across the annoying requests to check boxes with traffic lights in them, because these are more challenging for computers to successfully complete.
With the recent rise of ChatGPT and other AI technologies, we are seeing huge leaps forward in what computers can do. It seems like the logical reaction should be to make our CAPTCHAs more complicated, right?
Unfortunately, this isn't so easy. The harder you make CAPTCHAs, the more they shut out certain users, particularly those with disabilities. As the power of these AI models grows, we would have to make our CAPTCHAs so incredibly difficult that a lot of users would struggle to access online services, or even end up cut off from them completely.
If the capabilities of AI continue to grow, we will get to a point where CAPTCHAs are no longer a suitable means for blocking robots, because they will block too many humans as well. Depending on how things play out, that point could be fast approaching.
So what options do we have?
If we continue to use existing CAPTCHAs in a world of advanced AI, we will end up with an internet where bots are rampant, causing all of the harms mentioned above. That's probably not a desirable outcome, so we need to come up with some solutions.
Unfortunately, many of the proposed solutions aren't great either:
Registering with government IDs
One option would be for people to upload their government IDs whenever they want to sign up to a website. This could kill the anonymous internet, or at least the usable anonymous internet. There could still conceivably be a Wild West anonymous internet, but it might be so overrun with bots that few would dare to venture there.
Registering with phone numbers
Another option would be for websites to require your phone number upon signup. Many countries have somewhat strict regulations on registering SIM cards, so the difficulty of setting up SIM cards could also help to deter bots. This tactic could also compromise the anonymous internet.
Requiring payment
A third possibility is for more websites to require payment. Payment on its own doesn't necessarily eliminate bots, but it can limit their scale, because it simply costs bad actors too much to launch an army of bots. If a website required unique payment information for each user, this could restrict bots even further. Anonymous cryptocurrencies could offer a privacy-preserving way to pay.
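The cost argument is simple arithmetic, and a quick sketch shows how fast it adds up. The fee and budget figures below are invented purely for illustration, and the function names are our own. Amounts are in cents to avoid rounding issues.

```python
from typing import Optional


def bot_army_cost_cents(num_accounts: int, fee_cents: int) -> int:
    """Total signup cost for a given number of bot accounts."""
    return num_accounts * fee_cents


def max_affordable_accounts(budget_cents: int, fee_cents: int) -> Optional[int]:
    """How many accounts a budget can buy; None means no limit (free signup)."""
    if fee_cents <= 0:
        return None
    return budget_cents // fee_cents


# Hypothetical numbers: a $1.00 signup fee and 100,000 bot accounts.
print(bot_army_cost_cents(100_000, 100) / 100)   # 100000.0 dollars
# Hypothetical numbers: a $500.00 budget at the same $1.00 fee.
print(max_affordable_accounts(50_000, 100))      # 500
# With free signups, the budget places no limit on the number of accounts.
print(max_affordable_accounts(50_000, 0))        # None
```

Even a small fee turns an unlimited bot army into one that is capped by the attacker's budget. That explains why payment limits the scale of bots rather than eliminating them.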
Something new???
There's a fourth type of approach that seems interesting, exemplified by World ID from Worldcoin, a project co-founded by OpenAI's Sam Altman, among others.
The short version of the proposed solution is that people would use a hardware scanner that checks their biometrics to verify their unique humanness. The system would then use zero-knowledge proofs so that users can share proof of this uniqueness without revealing their identifying information. This would allow websites to check whether we are human, without them having to know our personal details. Under a scheme like this, websites could let humans in without getting overrun by bots, since bots don't have biometrics.
We by no means endorse this particular solution. We have no affiliation with Worldcoin, we have not tried it, we haven't thoroughly examined its privacy and security practices, nor do we know if the organization is capable of helping us solve such an important issue.
But World ID's approach is an interesting way of thinking about the problem. If something along these lines were to be successful, it may be a decent compromise that would allow us to preserve the anonymous internet, without it becoming overrun by bots.
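To give a feel for the "unique but anonymous" property, here is a toy sketch. It is not World ID's actual protocol, and it is not a zero-knowledge proof. It only shows how a per-site identifier derived from a secret can let a website spot duplicate signups without learning who the person is. Every name here (`site_pseudonym`, `Site`, the example domains) is hypothetical.

```python
import hashlib
import hmac
import secrets


def site_pseudonym(user_secret: bytes, site_id: str) -> str:
    """Derive a per-site identifier from a secret that never leaves the user.

    Toy stand-in only: a real scheme would also have to prove that the
    secret belongs to a verified human, which this sketch does not do.
    """
    return hmac.new(user_secret, site_id.encode(), hashlib.sha256).hexdigest()


class Site:
    """A website that allows one account per pseudonym."""

    def __init__(self, site_id: str) -> None:
        self.site_id = site_id
        self.seen: set[str] = set()

    def register(self, pseudonym: str) -> bool:
        """Accept a pseudonym the first time, reject any repeat."""
        if pseudonym in self.seen:
            return False
        self.seen.add(pseudonym)
        return True


alice_secret = secrets.token_bytes(32)  # assumed to be issued once per human
forum = Site("forum.example")
alice_on_forum = site_pseudonym(alice_secret, forum.site_id)

print(forum.register(alice_on_forum))  # True: first signup is accepted
print(forum.register(alice_on_forum))  # False: second signup is rejected
# The same person gets an unrelated identifier on a different site.
print(site_pseudonym(alice_secret, "shop.example") == alice_on_forum)  # False
```

The gap in this toy version is that nothing stops a bot from making up its own secrets. Closing that gap, by proving that a secret was issued to a real, unique person without revealing which one, is the job the zero-knowledge proofs and the biometric check are meant to do.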
Yes, people may not trust the cryptography and they may not want to input their biometrics into the device. These are serious issues. But if we get to the stage of sufficiently advanced AI, we may have only about four options:
- Letting the internet get overrun by bots.
- Handing over our personal information in a non-privacy-preserving manner.
- Paying for our services, which could also involve handing over our personal information.
- Handing over our biometrics in a way that does aim to preserve our privacy.
None of these are great options, but the fourth option is at least worth investigating to see if we can come up with a solution that helps to limit bad outcomes.
Does anyone have any better ideas?