A natural outgrowth of what we do at Quantopian is that our users tend to be technically sophisticated and love data. Therefore, when we have a significant event, such as the recent security breach, we are as open and transparent as possible about what transpired. In that spirit, we present this detailed analysis of the security breach and what we’ve since done to strengthen our site security.
In presenting such an analysis, there is a fine line between being sufficiently open and transparent with our users and with the maintainers of other web sites, who might benefit from our lessons learned, and providing information that could aid future attacks. While we don’t believe in relying on security by obscurity, we also don’t think it’s a good idea to tell the bad guys exactly where to focus their efforts. Therefore, some details have been omitted.
Essential to our mission as an advanced algorithmic trading platform is the fact that our servers execute arbitrary Python code written by our users. Therefore, while we share with every other web application the requirement to code our application securely, we have an additional, challenging requirement most sites don’t: preventing our users’ code from compromising our security. This makes security more challenging and also makes our site a more attractive target.
We run users’ algorithms in a sandbox which is isolated from the rest of our application in numerous ways, including:
- separate process;
- separate Python namespace;
- limits on available Python modules.
The algorithm sandbox obviously cannot be entirely isolated from the rest of our application, because we need to send pricing and universe data to the algorithm and receive in return the logging data and results that it generates.
Malicious users attempt on a regular basis to break out of our algorithm sandbox. We actively monitor these attempts, evaluate them on a case-by-case basis, and take additional steps when necessary.
On the afternoon of Thursday, November 14, our monitoring systems alerted us to the fact that an attacker was attempting to escape from the sandbox using a technique we knew about and had already blocked. Many other users had tried the same technique unsuccessfully, so we weren’t particularly worried.
However, two things about this particular attacker surprised us. First of all, he was more persistent than most; he tried pretty much every conceivable variety of this particular attack, when most attackers give up after it fails a few times. Second, because of his persistence, he actually found a minor chink in our armor: he was able to get a peek at some internals of our sandbox due to a typographical error in one of our source files.
The information he was able to retrieve was relatively inconsequential, but we were obviously alarmed by the unintended exposure of that data. We immediately blocked the attacker’s access to the site, tracked down the root cause of the vulnerability, fixed it, and released the fix.
Shortly after we blocked the attacker, he circumvented the block, returned to the site, and kept working on the attack, so we blocked him again. After that, he tried unsuccessfully for a while to get back into the site, and then went away. We thought that was the end of it, but unfortunately we were wrong.
On Friday morning, the attacker returned. He tried several techniques unsuccessfully, but then he found one that worked: a method for partially circumventing the limitations on Python modules accessible from user algorithm code. We had known about this particular vulnerability before his attack, and it was on our roadmap to fix soon, but we hadn’t yet done so because we thought that it was limited in scope and that it was unlikely anyone would find it. With 20/20 hindsight, we were clearly wrong.
After detecting the attacker’s return, we began to analyze what he had been able to accomplish, and it quickly became clear that the breach was significant. We shut down the application, notified our users on our blog, and set to completing our analysis to determine the precise magnitude of the compromise and what needed to be done to eliminate the vulnerabilities that had enabled it.
Within an hour of detecting the compromise, we had identified the root cause and drawn up a remediation plan, which we immediately began implementing. Within a few hours of detecting the compromise, we were able to confirm that the attacker had not accessed any user data.
We left the application shut down all day Friday and Saturday and most of the day Sunday while we implemented and tested the enhancements needed to prevent this kind of attack in the future. After developing and testing the fixes, we brought the site back up at around 8:00pm on Sunday night.
The attacker made several more unsuccessful attempts to compromise the site on Sunday night.
On Monday morning, we had a smooth market open for our live traders and paper traders.
We continue to aggressively monitor the application.
Before this breach, access to Python modules and methods within user algorithm code was limited by three different mechanisms:
- Any module not already imported into the Python namespace before executing the user’s algorithm could not be used within the algorithm, since after the chroot() the files necessary to import said module are no longer available.
- Import statements were filtered and only certain, whitelisted modules could be imported.
- Keywords for potentially dangerous methods and module attributes were banned.
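The second and third mechanisms can be pictured as a static check over the algorithm's syntax tree. What follows is a minimal sketch in Python 3 under stated assumptions, not the production filter: the function `find_violations` and the contents of `ALLOWED_MODULES` and `BANNED_NAMES` are hypothetical and chosen only for illustration.

```python
import ast

# Hypothetical policy values, chosen only for illustration.
ALLOWED_MODULES = {"math", "datetime", "pytz"}
BANNED_NAMES = {"eval", "exec", "__import__", "__subclasses__"}


def find_violations(source):
    """Return a list of human-readable policy violations found in source."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b" is judged by its top-level package "a".
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_MODULES:
                    violations.append("import of %s" % alias.name)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) are rejected outright.
            top = (node.module or "").split(".")[0]
            if node.level > 0 or top not in ALLOWED_MODULES:
                violations.append("import from %s" % (node.module or "."))
        elif isinstance(node, ast.Name) and node.id in BANNED_NAMES:
            violations.append("use of name %s" % node.id)
        elif isinstance(node, ast.Attribute) and node.attr in BANNED_NAMES:
            violations.append("use of attribute %s" % node.attr)
    return violations
```

A check of this kind only sees the names written in the algorithm's own source; it knows nothing about what the allowed modules themselves contain.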
The flaw in this logic, which the attacker discovered, is that if a whitelisted module imports a module that is not whitelisted, then that module is accessible as an attribute of the whitelisted module’s object. For example:
    >>> import pytz
    >>> import sys
    >>> print sys
    <module 'sys' (built-in)>
    >>> print pytz.sys
    <module 'sys' (built-in)>
    >>>
We couldn’t add every single module name to the keyword blacklist, since that would have prevented users from using too many variable names. Even if we had been able to do that, a module could also import a blacklisted module under a different name, e.g., “import sys as sysmodule”. We would have had to fully audit all whitelisted modules for such references and re-audit them each time we upgraded our module versions, which wasn’t practical.
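To make the aliasing problem concrete, the sketch below finds modules reachable as attributes of another module by checking each attribute's type instead of its name, so a module imported under an alias is found as well. It is an illustration in Python 3 under stated assumptions, not the audit we actually ran; `reachable_modules` is a hypothetical helper.

```python
import types


def reachable_modules(module, prefix=None, seen=None):
    """Map dotted attribute paths to the names of modules reachable from module."""
    prefix = prefix or module.__name__
    # Track visited modules by identity so import cycles terminate.
    seen = seen if seen is not None else {id(module)}
    found = {}
    for attr, value in list(vars(module).items()):
        if isinstance(value, types.ModuleType) and id(value) not in seen:
            seen.add(id(value))
            path = "%s.%s" % (prefix, attr)
            found[path] = value.__name__
            # Follow the chain: a leaked module may leak further modules.
            found.update(reachable_modules(value, path, seen))
    return found
```

Run against a real third-party module, a walk like this typically returns a long list, which gives a sense of why auditing every whitelisted module by hand after each upgrade was not practical.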
The attacker accessed the sys module through another, whitelisted module and used it to gain access to other Python modules containing sensitive data. Fortunately, however, he did not access any user data. He also didn’t get our data encryption keys, which we guard extremely carefully for obvious reasons, so even if he had accessed user data he would not have been able to decrypt it.
How we fixed it
We’ve made a number of changes to our application to eliminate this particular vulnerability and mitigate the risk of other vulnerabilities as well. These include:
- Our module whitelist / blacklist functionality is much more powerful now and includes both compile-time and run-time enforcement. With this change there is no longer any way -- that we know of -- for algorithm code to break out of the algorithm sandbox.
- Nevertheless, since good security is all about layers, we’ve removed much of the sensitive data that could be accessed by code that does somehow manage to escape the sandbox, and we are planning additional changes to remove even more.
- We now have automated processes in place to detect when new attributes have been added to a module we allow, so that we can audit them and determine whether they should be whitelisted or blacklisted.
- We’ve enhanced our monitoring to alert us more quickly and more aggressively to suspicious user algorithm code, and we are continuing to expand and improve our monitoring and alerting capabilities.
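The automated attribute check mentioned in the list above could work along these lines. This is a minimal sketch in Python 3 under stated assumptions, not a description of our tooling: `unreviewed_attributes` is a hypothetical helper, and the reviewed set is assumed to be a stored list of attribute names that have already been audited for a given module version.

```python
import types


def unreviewed_attributes(module, reviewed):
    """Return the attributes present on module but absent from the reviewed set."""
    return sorted(set(vars(module)) - set(reviewed))


# Illustration with a stand-in module, not a real dependency.
demo = types.ModuleType("demo")
demo.existing = 1
baseline = set(vars(demo))   # names audited at the last review
demo.new_helper = 2          # simulates an attribute added by an upgrade
print(unreviewed_attributes(demo, baseline))
```

Anything the comparison reports would then be audited by hand before being whitelisted or blacklisted.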
In closing, we first and foremost want to reiterate that no user data were compromised during this incident.
We take very seriously our responsibility to safeguard our members’ intellectual property. Security is an ongoing process, not a one-time thing, and we will continue to evolve our practices to stay current with the state of the art. We believe the best way to earn your trust is to be open and transparent, and we will continue to do that even when we are sharing unpleasant news.
If you have any questions or concerns, please let us know. We always reply to email received at [email protected]. We monitor [email protected] for emails concerning our security. You are always welcome to reach me personally at [email protected].
Vice President of Operations