Recently the Gamebytes team (@kanethomasDX, @alex_williamsa, @MoreJelly, and @JaredDowning_) launched a fun experiment called The CAPTCHAs. The CAPTCHAs are a showcase for the concept of on-chain verified CAPTCHAs, meant to help prevent bots from participating in NFT drops. Go to the site and you're given a Captcha. Get the Captcha right and you mint the Captcha as an NFT. Get the Captcha wrong and the transaction reverts. To make the drop more interesting, we tried to make the Captchas look artistic.
Here's an example of one:
Anyone who has participated in an NFT drop recently knows that bots are out of control. The typical flow of a hyped NFT drop these days includes bots buying large percentages of the drop and then immediately dumping them on OpenSea for quick profits. Economically this makes a lot of sense for the bots, but it sucks for the real people trying to participate.
The bots are exploiting the fact that the NFTs are underpriced. However, raising prices would price out passionate community members, and there are certainly downsides to your whole community being only the highest bidders. Adidas knows that they can sell Yeezys for more than $220 each, but there are plenty of good reasons they don't. Recently Vitalik wrote an article titled "Alternatives to selling at below-market-clearing prices for achieving fairness (or community sentiment, or fun)".
The article focussed on novel ways in which NFT drops could be modified to allow more participants and ease gas wars. The core propositions shift the financial engineering of the drop so that buyers get in closer to the true market value of the item, bringing demand closer to supply without pricing everyone out. All of his proposed methods are very interesting and should be explored, but they all start off with the same premise: "Each participant (verified by proof-of-personhood) ...". The site he links to in reference to proof-of-personhood essentially allows users to tie their identity to an address by submitting a video of themselves with their address. Every method to thwart bots is going to be game-able, but this method is game-able in a particularly bad way: it only requires a one-time, fixed cost in human effort. This is a task I could pay many people on Fiverr a couple hundred dollars to do, and they will never care that someone is using their identity for this purpose. Sure, there is a lot of effort involved, but the payout of having hundreds of addresses treated as a white-listed real person on every NFT drop from now on is worth many times that effort.
There is a lot to be learned from the sneaker community, which has dealt with this issue for quite some time. The first lesson is that you must accept you're not going to stop the bots. The only thing you can do is make it harder for them over time. Right now what we have in the NFT community is the equivalent of Adidas letting anyone make an API request to their backend to buy a shoe on the next Yeezy drop. This would result in not a single real customer getting a shoe. The only reason real people can still get NFTs during hyped drops is that there aren't enough bots buying, but as the market matures and more people realize the money to be made, this will change. In order to get the ball rolling on making it harder for these bots, we decided to start with the most basic form of person verification, the Completely Automated Public Turing Test to tell Computers and Humans Apart (CAPTCHA). While Web2 has advanced far past the humble Captcha, it's an easy win for Web3 to implement basic on-chain verified Captchas. No, these Captchas will not stop the shadowy super coder from pulling down the images and having them quickly solved by a group of people, but at least that's 100x more work than they have to do right now, and it looks a lot closer to what I have to do when I want to buy a bunch of Yeezys.
The goal of this project was to put this idea out into the community to see if we can convince other projects to adopt similar methods. We've already had people in our DMs who run bots saying that this would make their lives harder, and, even more encouraging, we have also received lots of great feedback on how to expand on this implementation. For the sake of more fully putting the idea out there I'd like to go over specifically how we made The CAPTCHAs. On a technical level, it's really nothing ground-breaking.
We wrote a JavaScript file that uses Canvas to draw shapes and letters and returns an image to be uploaded to an off-chain host (we used AWS S3 in our case for the sake of quick prototyping, but it's the same process using something like IPFS). Another script then generated 10k random strings and fed them into the image creator. The Captcha's string and its token ID were used as seeds in the random generation of the properties of the image, such as the color palette, word obstructions, etc. After all the images had uploaded, all the solution strings were hashed using keccak256.
Trying to store all of these hashes in the smart contract was where we hit our first barrier, realizing how expensive it is to store even one kilobyte on the blockchain. Without getting too deep into the details, the cheapest way we figured was to store all of the hashes as a single Solidity bytes type and parse out the solution from the bytes based on the token ID's index when the verification happens. Still, with 10k different hashes at 32 bytes each, this was going to cost way too much in storage, so we reduced each hash to only the first 2 bytes (4 hex characters). With only 65,536 possible values per truncated hash, this technically makes it possible to scrape the hashes and run a script in search of any string that happens to match. We didn't hype our launch up so there weren't people ready to break the doors down at a specific time, so the trade-off of fewer bytes per hash was fine. For a more hotly anticipated drop, the extra cost of a couple ETH to add more bytes per hash should be trivial.
Nothing too crazy is going on in the contract either. When the user submits a transaction to the mint function, they send the token ID and the CAPTCHA solution. The only real difference from a standard NFT smart contract is that the CAPTCHA verification has to succeed before the mint goes through.
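To make the packing and lookup concrete, here is a rough off-chain sketch in Python of the idea described above; it is not our contract code. The function names, the zero-based token IDs, and the sample solutions are all made up for illustration. Note that Python's built-in sha3_256 is used only as a stand-in: it is not the same function as Ethereum's keccak256 (the padding differs), so real tooling would need a proper keccak256 implementation.

```python
import hashlib
from typing import Sequence

HASH_BYTES = 2  # bytes kept per solution hash, as in the post


def solution_hash(solution: str) -> bytes:
    # Stand-in hash: sha3_256 is NOT Ethereum's keccak256 (padding differs).
    return hashlib.sha3_256(solution.encode("utf-8")).digest()


def pack_hashes(solutions: Sequence[str], n: int = HASH_BYTES) -> bytes:
    # Concatenate the first n bytes of every hash into one blob,
    # ordered by token ID (assumed here to start at 0).
    return b"".join(solution_hash(s)[:n] for s in solutions)


def verify(packed: bytes, token_id: int, attempt: str, n: int = HASH_BYTES) -> bool:
    # Slice out this token's truncated hash by index and compare it
    # with the truncated hash of the submitted solution.
    start = token_id * n
    stored = packed[start:start + n]
    return len(stored) == n and solution_hash(attempt)[:n] == stored


def storage_slots(num_tokens: int, n: int) -> int:
    # Number of 32-byte storage slots the packed data occupies
    # (ignores the slot that holds the length of the bytes value).
    return (num_tokens * n + 31) // 32


if __name__ == "__main__":
    packed = pack_hashes(["aB3xK", "Qz9Lm", "hello"])  # hypothetical solutions
    print(verify(packed, 1, "Qz9Lm"))
    print(storage_slots(10_000, 32), "slots at 32 bytes per hash")
    print(storage_slots(10_000, 2), "slots at 2 bytes per hash")
```

The slot count is the part that drives the storage cost: going from 32 bytes to 2 bytes per hash shrinks 10k hashes from 10,000 slots to 625.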
Because this space is very early, even the simplest of prevention methods will be enough to have most bots not waste their time, and just point their contracts at a drop that doesn't do this. That being said, there are some requirements / limitations involved with this specific launch, but none that can't be worked through in the future. Here are some of the most obvious ones:
Token ID must be tied to CAPTCHA:
The way a lot of these random drops work is that which NFT maps to which token ID is not known until after the drop is done; "wen reveal" is a common term you'll see in NFT Discords. Most drops that work like that don't have users minting specific NFT tokens, so users submit their transaction and just get whatever token ID is next once their transaction processes. For this implementation of the Captchas each Captcha has to be tied to a specific token ID, so the current UX of just submitting the transaction doesn't work. If two people submit the same Captcha solution, the one whose transaction is processed second will fail, even if there are still tokens left to be minted, and that user will have to be served another Captcha to try again. While it does require a change in the UX, this implementation could be applied to these types of drops: the user comes to a webpage and is served a random token ID, and you do your best to distribute token IDs evenly by keeping track in a Web2 database of which Captchas are being served and to how many people. That could look something like this: Mirror all transactions to a Web2 database / server. This will allow you to know on the front-end whether there is already an in-flight transaction with the correct solution for a Captcha, so you can refrain from serving that one, or, if a user has already been served that one, give them a new one. Your Web2 database could be gamed by someone who submits lots of transactions with correct Captchas at very low gas, but you can also mirror estimated confirmation times and disregard transactions that are going to be past some time threshold. This is a bit more complex, but it's certainly an option! We're also still thinking of more straightforward ways to do this, preferably all on-chain.
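As a very rough sketch of the Web2 bookkeeping described above, something like the following could sit behind the front-end. Everything here is hypothetical: the class name, the method names, and the 60-second threshold are made up for illustration, and a real version would be backed by a database and fed by a mempool watcher rather than held in memory.

```python
class CaptchaAllocator:
    """In-memory sketch of tracking which Captchas to serve (hypothetical)."""

    def __init__(self, num_tokens, max_wait_s=60.0):
        self.served = [0] * num_tokens  # how many users were shown each token ID
        self.pending = set()            # token IDs with a correct in-flight transaction
        self.minted = set()             # token IDs already minted on-chain
        self.max_wait_s = max_wait_s    # ignore transactions estimated to take longer

    def next_token(self):
        # Serve the least-shown token that is neither minted nor pending.
        open_ids = [
            t for t in range(len(self.served))
            if t not in self.minted and t not in self.pending
        ]
        if not open_ids:
            return None
        token_id = min(open_ids, key=lambda t: (self.served[t], t))
        self.served[token_id] += 1
        return token_id

    def note_pending(self, token_id, estimated_wait_s):
        # Only trust a mirrored transaction if it is likely to confirm soon;
        # this is the guard against low-gas transactions gaming the database.
        if estimated_wait_s <= self.max_wait_s:
            self.pending.add(token_id)
            return True
        return False

    def note_minted(self, token_id):
        self.pending.discard(token_id)
        self.minted.add(token_id)


if __name__ == "__main__":
    allocator = CaptchaAllocator(num_tokens=3)
    print(allocator.next_token())            # first user
    print(allocator.next_token())            # second user gets a different Captcha
    print(allocator.note_pending(2, 10.0))   # fast transaction: stop serving token 2
    print(allocator.note_pending(1, 3600.0)) # very slow transaction: disregarded
```

This sketch never expires a pending entry, so a transaction that is dropped or replaced would need separate handling.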
Check if solution is correct without wasting gas:
Technically you can know prior to submitting the transaction if it's going to revert due to the Captcha being incorrect. This is actually not a bad thing, because you don't want your users wasting gas on incorrect attempts. As long as the number of bytes per hash is sufficiently large, a bot won't have enough time to find a string that matches the hash anyway.
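To give a feel for how the number of bytes per hash changes the work a bot has to do, here is a small Python sketch of a brute-force search against a truncated hash. The function names and the "guess-N" candidate strings are made up for illustration, and sha3_256 is again only a stand-in for keccak256 (they are different functions). With n bytes kept there are 256^n possible values, so a blind search needs on the order of 256^n guesses on average.

```python
import hashlib


def truncated_hash(text, n):
    # Stand-in hash: sha3_256 is NOT Ethereum's keccak256.
    return hashlib.sha3_256(text.encode("utf-8")).digest()[:n]


def search_space(n):
    # Number of distinct values an n-byte truncated hash can take.
    return 256 ** n


def find_match(target, max_tries=2_000_000):
    # Try arbitrary strings until one has the same truncated hash.
    # Returns (candidate, number_of_tries) or None if the cap is reached.
    n = len(target)
    for i in range(max_tries):
        candidate = f"guess-{i}"
        if truncated_hash(candidate, n) == target:
            return candidate, i + 1
    return None


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, "bytes per hash:", search_space(n), "possible values")
    target = truncated_hash("aB3xK", 2)  # hypothetical Captcha solution
    print(find_match(target))
```

At 2 bytes the search finishes almost instantly on a laptop; each extra byte multiplies the expected work by 256.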
Front-running:
If transactions are publicly broadcast, bots can simply front-run them by copying a correct solution out of a pending transaction. In a naive implementation this is an issue, but simply using a private RPC such as Flashbots would solve this.
We really loved seeing everyone enjoying the minting process so much. Everyone seemed to deeply understand what we were trying to do and why it was cool, which was awesome! Here is a particularly good thread from @jonwu_.
And here is a much appreciated shoutout from @kevinrose
Ultimately we would just love to see more drops try to make bots' lives harder, so if you're someone involved in NFT drops please get in touch! We'd like to see approaches like this more widely adopted and would love to talk.