Today Quantopian is publicly releasing the results of our internal performance test framework. You can see the latest results at http://www.quantopian.com/performance, updated with each commit to our code repo.
I'm excited that Quantopian is rallying around system performance as a team and that we are releasing results that show how much work we need to do. If you’re passionate about system performance, we’re hiring. We’d also welcome PRs to Zipline, our open-source backtesting engine.
I enjoy working on improving system performance because it defines system quality. Stability, user experience, scalability, and developer productivity all stem from performance, and performance improvements bubble up through every layer of the system. At Quantopian, faster backtests directly mean that our users can run more tests to ensure their algorithms are functioning properly.
While building our live trading capabilities, I worked on a prototype that switched our equity minute data source from using documents stored in Mongo to bcolz, a file-based data source. I tested and measured the prototype using the existing tools we had developed to get profiling results of our backtesting engine (like Scott Sanderson's great https://github.com/ssanderson/pstats-view). Jean Bredeche, our CTO, loved the prototype results, so we set up a project to convert Quantopian’s production and development infrastructure to use bcolz.
Jean asked me to present this project to the team. Preparing that presentation forced us to think of a clear way to measure and present Quantopian’s backtesting performance. Solving that communication problem proved as challenging as coding and releasing the bcolz improvement. After all, Zipline and Quantopian are designed to run user code. The incredible flexibility we provide for our users makes evaluating system performance difficult, especially so because we never look at user algorithm source code without explicit permission.
Together with Jess Stauth, our lead quant, I designed a set of test algorithms and ran simulations with our different data stores, which let me plot comparisons. These plots really captured people’s attention, and our developers started to ask me to run the simulations on their code branches to check for performance regressions, or to prove ideas for speed improvements.
We started to talk about our development culture and the disconnect between the value we place on performance and the investment we make in improving it. Performance is just as important as correctness. For correctness, we run continuous integration, and maintain a huge suite of tests. Any platform code changes are automatically run in parallel to the prior release, and we check that simulation results match.
We make a huge and ongoing investment in the correctness of our system. We reasoned that performance needed the same continuous measurement and the same feedback loop for our developers. We developed a suite of synthetic algorithms that let us stress test different parts of our backtester, such as universe size, buying frequency, and history window length. We will be continuously adding to our suite of test algorithms, and welcome suggestions for new ones.
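The feedback loop described above can be sketched in a few lines. This is a hypothetical illustration, not Quantopian's actual framework: `run_backtest` is a stand-in stub, and the parameter names are invented. The idea is simply to time each combination of stress parameters so a regression on any axis shows up as an outlier.

```python
import itertools
import time

def run_backtest(universe_size, history_window):
    # Hypothetical stand-in for launching a Zipline simulation;
    # here we just do work proportional to the parameters.
    return sum(range(universe_size * history_window))

def profile_suite(universe_sizes, history_windows):
    """Time every (universe_size, history_window) combination."""
    results = {}
    for size, window in itertools.product(universe_sizes, history_windows):
        start = time.perf_counter()
        run_backtest(size, window)
        results[(size, window)] = time.perf_counter() - start
    return results

# e.g. small vs. large universes, daily vs. minute-length history windows
timings = profile_suite([10, 100], [30, 390])
```

Run on every commit, a grid like this turns "the backtester feels slower" into a concrete, per-parameter measurement.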
I'm proud to say that the performance test framework is now part of our development process, and as a result, performance improvement is part of every code push.
About 4 months ago, we announced that we were hard at work building a hosted research platform for analyzing our curated datasets, your Quantopian algorithms, and your backtest results. We've been making great progress: we currently have over 40 alpha users on the new environment, and they are helping us improve it every day. We aim to make the platform available to everyone within the next few months. In the meantime, we wanted to give a sneak peek at its capabilities to show you how it can help you create and explore your ideas. We have shared an example notebook in the community.
In this example, we walk through the process of exploring and understanding an external dataset. We then use that understanding to optimize a trading strategy. The external dataset is provided by EventVestor, a financial data and intelligence platform for investors. EventVestor aggregates event-driven data and provides a multitude of analytics services on it. In this notebook, we analyze whether share buybacks are an indicator of price drift and optimize a strategy for investing based on share buybacks.
In the community, you'll be able to view the notebook in its entirety and see the related backtests. Once the research platform is ready for prime time, we'll add the ability to clone the notebook (like you can do today with an algo) so you can experiment on your own.
If you haven't signed up to be a Research Beta user yet, now is a great time. We expect to be adding more users in the next few weeks.
We ship a lot of code here at Quantopian, with a lot of new features. Some of those features get top billing and lots of space on the website: the launch of the Quantopian Open, the addition of fundamental data from Morningstar, the set-up of the IDE quick chat, the development of the Managers Program, and others. But other improvements we make don't get the same level of attention. Some of these improvements are quite nifty, making Quantopian easier to use, faster, and more reliable. We plan on sharing them periodically on the blog. Here is a list of our latest news, features, and tools:
Tools and Features:
Moving forward, we promise to post a monthly summary of all our updates here for easy consumption. If you would like up-to-the-minute updates, subscribe to our RSS feed and follow us on Facebook, Twitter, and LinkedIn.
Last month, Quantopian introduced a powerful new feature: programmatic access to fundamental data from Morningstar in the backtester. It is yet another piece of the Quantopian platform that is leveling the algorithmic investing playing field.
Since the announcement, the response from the Quantopian community has been phenomenal, with thousands of backtests already run using the data. Whole new classes of investment strategy, like quantitative value investing, are now more easily executed on Quantopian.
In tandem with the announcement, we made a special offer: the community members who posted the best fundamentals-based algorithms to our forums by January 1 would win an additional 12 months of free access to the fundamental data.
With the deadline past, seven community members have earned the 12-month prize. Here are the winning posts, algorithms, and authors:
If you'd like to learn more about fundamentals, check out the winning algorithms or simply read the original forum post announcing the availability of the data and clone the algorithm there. Or, sign up to attend our upcoming webinar which will teach the basics of using fundamentals inside Quantopian.
Congratulations to all our winners!
Starting today, Quantopian community members can programmatically access Morningstar's corporate fundamental data.
Quantopian's comprehensive historical fundamental data API is unprecedented in the industry. For the first time ever, individual investors can build fundamentally driven investment algorithms.
With access to fundamental data within your algorithm, you can define your investable stock universe. Want your algorithm to focus only on stocks with a market cap over $1B? You can do it now. Filter stocks by PE ratio? By dividends? By EPS? All possible with Quantopian. The Quantopian IDE makes it easy to search for the right fundamental metrics for your algorithm.
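The kind of screen described above can be illustrated with plain pandas. Note the data and column names below are invented for illustration only; on Quantopian the equivalent screen is expressed through the fundamentals API in your algorithm.

```python
import pandas as pd

# Hypothetical snapshot of fundamental metrics, one row per stock.
fundamentals = pd.DataFrame({
    'symbol':     ['AAPL', 'XYZ', 'MSFT', 'TINY'],
    'market_cap': [700e9, 0.5e9, 400e9, 0.1e9],
    'pe_ratio':   [15.0, 42.0, 18.0, 8.0],
})

# Universe: market cap over $1B and a PE ratio below 20.
universe = fundamentals[
    (fundamentals['market_cap'] > 1e9) & (fundamentals['pe_ratio'] < 20)
]
print(universe['symbol'].tolist())  # ['AAPL', 'MSFT']
```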
We’ve taken care of the heavy lifting for you: Morningstar's company identifiers are mapped to Quantopian's security identifiers and the API includes 'knowledge date' indexing to avoid look-ahead bias. I can’t imagine an easier way to programmatically work with fundamentals.
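"Knowledge date" indexing means a simulation only sees a data point once it was actually published. A toy illustration in pandas — the dates and EPS values here are invented, and `eps_as_of` is a hypothetical helper, not a Quantopian API:

```python
import pandas as pd

# Each quarterly EPS figure carries the date it became public knowledge.
eps = pd.DataFrame({
    'knowledge_date': pd.to_datetime(['2014-02-10', '2014-05-12', '2014-08-11']),
    'eps':            [1.10, 1.25, 1.18],
})

def eps_as_of(sim_date):
    """Return the latest EPS the simulation is allowed to see on sim_date."""
    visible = eps[eps['knowledge_date'] <= pd.Timestamp(sim_date)]
    return None if visible.empty else visible.iloc[-1]['eps']

# On 2014-06-01, only the first two filings have been published.
assert eps_as_of('2014-06-01') == 1.25
# Before any filing is public, nothing is visible.
assert eps_as_of('2014-01-01') is None
```

Indexing by the fiscal quarter's end date instead of the knowledge date would let a backtest trade on numbers before they were announced — the look-ahead bias this design avoids.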
Check out our simple example and documentation of the new methods for accessing and incorporating fundamentals into your code.
In tandem with this news, I'm delighted to announce two special offers:
1. As of January 1st, every registered user of Quantopian will be guaranteed 6 months of complimentary access to the fundamental data in the Quantopian backtester. So if you've been lying in wait, now is the time to register for free. And if you have friends who are fundamental investors, please pass along the news!
2. Share your coolest fundamentals-based algorithm to the community forum before January 1st for a chance to win an additional 12 months of free access. Justin Lent, Quantopian's new director of fund development, will review submissions and select the best sample algos to be highlighted on our blog in January. Submit your entry by January 1st and tag it with #Fundamentals in the post title to be considered.
We can't wait to see the burst of creativity in the community that this is surely going to unleash.
Get started on Quantopian today.
By: Thomas Wiecki
Assessing performance as well as risk is a key issue in quantitative finance. While a large number of metrics, like the Sharpe ratio, exist to quantify the trade-off between reward and risk in a portfolio or an asset, there is an orthogonal source of risk: uncertainty in the estimation of these metrics themselves. Because every measure is only an estimate based on limited and noisy data, quantifying the level of certainty we have in metrics like the Sharpe ratio becomes important.
Quantopian allows users to research, develop, and launch trading algorithms that invest in the stock market. Recently, we also announced that we are developing a hedge fund sourced from the top-performing quants in our community. In order to identify stellar quants and connect them with investors, estimating performance from few data points has become critical. Bayesian modeling is a flexible statistical framework well suited to this problem, as uncertainty can be quantified directly in terms of the posterior distribution.
In my latest meetup series, I gave an overview of Bayesian statistics and how Probabilistic Programming frameworks like PyMC can be used to build and estimate complex statistical models. I then showed how several common financial risk metrics like the Sharpe ratio can be expressed as a probabilistic program to yield a Bayesian Sharpe ratio that includes a distribution of possible Sharpe values, each associated with a probability. I applied this type of Bayesian data analysis to evaluate the performance of anonymized real-money trading algorithms running on Quantopian.
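The talk builds a full probabilistic model in PyMC; as a rough, dependency-light stand-in (not the model from the talk), a simple bootstrap over simulated daily returns conveys the same intuition — a point-estimate Sharpe ratio hides substantial uncertainty. All numbers below are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated daily returns: small positive drift, realistic volatility.
returns = rng.normal(loc=0.0005, scale=0.01, size=252)

def sharpe(r):
    # Annualized Sharpe ratio (risk-free rate assumed zero).
    return np.sqrt(252) * r.mean() / r.std()

# Bootstrap: resample the return series many times to see how much
# the Sharpe estimate moves under different realizations of the data.
samples = np.array([
    sharpe(rng.choice(returns, size=returns.size, replace=True))
    for _ in range(2000)
])

low, high = np.percentile(samples, [5, 95])
print(f"point estimate {sharpe(returns):.2f}, "
      f"90% interval [{low:.2f}, {high:.2f}]")
```

With only a year of daily data, the interval around the Sharpe estimate is typically wide — which is exactly why a distribution over the metric is more informative than the point estimate alone.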
You can also view the content via this slide share.
by Seong Lee
Earnings estimates (earnings per share, or EPS) and revenue estimates are heavily used in both quant and fundamental stock analysis as forward-looking indicators of stock performance and sources of alpha. Traditionally, estimates are produced by sell-side analysts on Wall Street and are then aggregated and averaged into what is commonly referred to as "the Wall Street Consensus". In 2011, the financial landscape changed when fintech startup Estimize launched a platform allowing anyone on the web to share their own earnings and revenue estimates. Those estimates are then compared against what sell-side Wall Street analysts predict and against the actual reported results. Website visitors and contributors can browse the estimates submitted by other users.
We recently hosted a NYC Algorithmic Trading meetup where we discussed the potential of crowd-sourced earnings estimates as the basis for new and interesting trading strategies. I presented some validation work on claims made in a recent Estimize whitepaper with the goal of replicating the results in the new Quantopian Research Platform and providing a basis for future work. My work confirmed their previous finding - there is potential for crowdsourced earnings data to be an interesting new source of alpha, especially given that current earnings surprise strategies are almost exclusively based off the Wall Street Consensus.
The meetup clips below will give you an overview of Quantopian's latest news and show examples of how to incorporate multiple sources of earnings predictions into your algorithms.
Quantopian's Latest News
Karen Rubin, director of product, gives an overview of our latest initiatives: quant-sourced hedge fund, the addition of fundamental data into our platform and our new research environment.
In this clip, I dive into Estimize, explain how their crowd-sourced estimates work, and then show my validation work against the claims in their latest white paper.
Sample Post Earnings Announcement Drift Trading Strategy
I walk through a basic PEAD (Post Earnings Announcement Drift) strategy - a strategy that goes long (short) companies whose actual earnings announcements beat (miss) expectations, also known as a positive (negative) earnings "surprise".
I did some pre-processing to the raw data files provided by Estimize on the Quantopian Research Platform (mainly computed the Estimize consensus by averaging all the individual estimates for each reporting date) before creating the strategy described above.
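The consensus computation mentioned above — averaging all the individual estimates for each reporting date — boils down to a groupby in pandas. The column names and values below are invented for illustration; they are not the layout of the raw Estimize files.

```python
import pandas as pd

# Hypothetical individual estimates: several users per reporting date.
estimates = pd.DataFrame({
    'reporting_date': ['2014-10-20', '2014-10-20', '2014-10-20',
                       '2015-01-27', '2015-01-27'],
    'eps_estimate':   [1.30, 1.28, 1.34, 2.10, 2.02],
})

# Crowd consensus: the mean estimate for each reporting date.
consensus = estimates.groupby('reporting_date')['eps_estimate'].mean()
print(consensus)
```

The resulting series can then be compared against actual reported EPS to compute the earnings "surprise" that drives the PEAD strategy.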
Just getting started with Python? With its extremely rich ecosystem of data science tools, Python can be overwhelming to newcomers. In this post, I explore how to navigate and leverage the PyData jungle by focusing on the 10% of tools that allow you to do 90% of the work. The tools I introduce here will allow you to accomplish most of what a data scientist does day-to-day (i.e. data I/O, data munging, and data analysis).
Maximize your efficiency - click here to read the full blog post and learn more about the following tools: installation, IPython notebooks, pandas, and Seaborn.
We are hard at work building a new tool: a hosted research environment for analyzing our curated datasets, your Quantopian algorithms, and your backtest results.
For those of you who have followed Quantopian's progress over the last few years, you have seen our offering evolve from a backtesting platform to a backtesting and live trading environment.
And now we're ready for the next step. During the course of building both of these features we have gotten lots and LOTS of requests for more flexible data access, for the ability to do custom plotting, and post-hoc analyses on backtest results. So we're creating a whole new environment to support iterative research and data exploration.
We chose the IPython notebook as the backbone for this platform and we have just gotten started building basic APIs to access our 12+ year minute (or daily) bar pricing data set.
We are very excited about the feedback we've received and have decided to open up beta registrations. If you are interested in being a beta user for the research environment, sign up here or register to attend a sneak peek webinar on Tuesday September 30th to learn more.
A few weeks ago Quantopian welcomed Matt Trudeau, Head of Product at IEX, to speak at the NYC Algorithmic Trading Meetup. IEX is an Alternative Trading System (ATS) launched in October 2013 and featured in Michael Lewis's latest blockbuster, Flash Boys. Matt is part of the brain trust that built this new trading venue and designed its innovative market structure.
Soon after the Flash Boys release there was a flurry of press and debate about the state of the markets and IEX. My favorite pieces included a live debate on CNBC and the 60 Minutes profile on IEX founder Brad Katsuyama. The debate was heated and contentious, and it ranged from the merits of 'high frequency trading' to allegations of market rigging and order front-running.
Not surprisingly, our Meetup comment thread quickly became a hotbed of discussion of the intricacies (or existence) of latency and cross-market arbitrage, with members sharing a diversity of strong opinions on our decision to give Matt and IEX the floor for a meetup. As is so often the case, cooler heads prevail at in-person events far more so than in online forums - the Meetup was lively but polite and focused. We were happy that Matt met a respectful and engaged audience of quants, traders, computer scientists and investors who were as eager to listen and learn as they were to ask tough questions. Our meetup sponsors from the CQF program were kind enough to record the full presentations by both Matt and a pre-session by Quantopian's founding CEO John Fawcett; you can watch the full meetup in several segments below.
One of the questions I heard leading up to the Meetup was, "How did you get Matt as a speaker?" The answer is pretty simple: he responded to our Tweet about Quantopian's ability to route orders through Quantopian to IEX. We wrote about the feature on our blog and posted the news on Twitter, which led to an invitation to visit the IEX team at their offices in NYC – and eventually to the idea of hosting a meetup so that the NYC Algorithmic Trading group could enjoy the same chance we had to hear about the technology and philosophy behind the group's mission of 'institutionalizing fairness.'
Introduction: Quantopian founding CEO John Fawcett gives an overview of Quantopian's backtesting and live trading platform, company business model and near term roadmap.
Part 1: IEX Head of Product Matt Trudeau. Background on how and why IEX was started, including a primer on the ecosystem of computerized trading algorithms, from passive market making through structural arbitrage.
Part 2: IEX Head of Product Matt Trudeau. Discussion of IEX's price matching engine and continuation of discussion on how IEX's approach interacts with various computerized trading algorithms.
Part 3: IEX Head of Product Matt Trudeau answers the question "How is IEX doing as a company?" Spoiler alert: IEX hit an intraday high-water mark of 1% of market share for the first time on the day of our meetup (8/26).
If you are interested in routing your Quantopian real-money trades to IEX, some sample code is below, and you can read more in the API documentation. Or, work with a full algorithm that uses IEX by cloning the sample algorithm in our forums.
# Import IEX exchange routing
from brokers.ib import IBExchange

def initialize(context):
    context.stock = symbol('AAPL')

def handle_data(context, data):
    # Buy 1000 shares of Apple via IEX
    order_target(context.stock, 1000, style=MarketOrder(exchange=IBExchange.IEX))