[PDF version of this blog post]
"You will have to close source and commercialize Sage. It's inevitable." -- Michael Monagan, cofounder of Maple, told me this in 2006.SageMathCloud (SMC)
is a website that I first launched in April 2013, through which you can use Sage and all other open source math software online, edit Latex documents, IPython notebooks, Sage worksheets, track your todo items, and many other types of documents. You can write, compile, and run code in most programming languages, and use a color command line terminal. There is realtime collaboration on everything through shared projects, terminals, etc. Each project comes with a default quota of 5GB disk space and 8GB of RAM.SMC
is fun to use, pretty to look at, frequently backs up your work in many ways, is fault tolerant, encourages collaboration, and provides a web-based way to use standard command-line tools.
The goal of the SageMath software project, which I founded in 2005, is to create a viable free open source alternative to Magma, Mathematica, Maple, and Matlab. SMC
mathematics software -- instead, SMC
is best viewed by analogy as a browser-based version of a Linux desktop environment like KDE or Gnome. The vast majority of the code we write for SMC
involves text editor issues (problems similar to those confronted by Emacs or Vim), personal information management, support for editing LaTeX documents, terminals, file management, etc. There is almost no mathematics involved at all.
That said, the main software I
use is Sage, so of course support for Sage is a primary focus. SMC
is a software environment that is being optimized for its users, who are mostly college students and teachers who use Sage (or Python) in conjunction with their courses. A big motivation for the existence of SMC
is to make Sage much more accessible, since growth of Sage has stagnated since 2011, with the number one show-stopper obstruction being the difficulty of students installing Sage.
Sage is Failing
Measured by the mission statement, Sage has overall failed. The core goal is to provide similar functionality to Magma (and the other Ma's) across the board, and the Sage development model and community has failed to do this across the board, since after 9 years, based on our current progress, we will never get there. There are numerous core areas of research mathematics that I'm personally familiar with (in arithmetic geometry), where Sage has barely moved in years and Sage does only a few percent of what Magma does. Unless there is a viable plan for the areas to all
be systematically addressed in a reasonable timeframe, not just with arithmetic geometry in Magma, but with everything in Mathematica, Maple., etc, we are definitely failing at the main goal I have for the Sage math software project.
I have absolutely no doubt that money combined with good planning and management would make it possible to achieve our mission statement. I've seen this hundreds of times over at a small scale at Sage Days workshops during the last decade. And let's not forget that with very substantial funding, Linux now provides a viable free open source alternative to Microsoft Windows. Just providing Sage developers with travel expenses (and 0 salary) is enough to get a huge amount done, when possible. But all my attempts with foundations and other clients to get any significant
funding, at even the level of 1% of the funding that Mathematica gets each year, has failed. For the life of the Sage project, we've never got more than maybe 0.1% of what Mathematica gets in revenue. It's just a fact that the mathematics community provides Mathematica $50+ million a year, enough to fund over 600 fulltime positions, and they won't provide enough to fund one single Sage developer fulltime.
But the Sage mission statement remains, and even if everybody else in the world gives up on it, I HAVE NOT
is my last ditch strategy to provide resources and visibility so we can succeed at this goal and give the world a viable free open source alternative to the Ma's. I wish I were writing interesting mathematical software, but I'm not, because I'm sucking it up and playing the long game.
The Users of SMC
During the last academic year (e.g., April 2014) there were about 20K "monthly active users" (as defined by Google Analytics), 6K weekly active users, and usually around 300 simultaneous connected users. The summer months have been slower, due to less teaching.
Numerically most users are undergraduate students in courses, who are asked to use SMC
in conjunction with a course. There's also quite a bit of usage of SMC
by people doing research in mathematics, statistics, economics, etc. -- pretty much all computational sciences. Very roughly, people create Sage worksheets, IPython notebooks, and Latex documents in somewhat equal proportions.
What SMC runs on
Technically, SMC is a multi-datacenter web application without specific dependencies on particular cloud provider functionality. In particular, we use the Cassandra database, and custom backend services written in Node.js (about 15,000 lines of backend code). We also use Amazon's Route 53 service for geographically aware DNS. There are two racks containing dedicated computers on opposites sides of campus at University of Washington with 19 total machines, each with about 1TB SSD, 4TB+ HDD, and 96GB RAM. We also have dozens of VM's running at 2 Google data centers to the east.
A substantial fraction of the work in implementing SMC
has been in designing and implementing (and reimplementing many times, in response to real usage) a robust replicated backend infrastructure for projects, with regular snapshots and automatic failover across data centers. As I write this, users have created 66677 projects; each project is a self-contained Linux account whose files are replicated across several data centers.
The Source Code of SMC
Some of the functionality of SMC, for example Sage worksheets, communicate with a separate process via a TCP connection. That separate process is in some cases a GPL'd program such as Sage, R, or Octave, so the viral nature of the GPL does not apply to SMC. Also, of course the virtual machines are running the Linux operating system, which is mostly GPL licensed. (There is absolutely no AGPL-licensed code anywhere in the picture.)
Note that since none of the SMC server and client code links (even at an interpreter level) with any GPL'd software, that code can be legally distributed under any license (e.g., from BSD to commercial).
Also we plan to create a fully open source version of the Sage worksheet server part of SMC for inclusion with Sage. This is not our top priority, since there are several absolutely critical tasks that still must be finished first on SMC, e.g., basic course management.
The SMC Business Model
The University of Washington Center for Commercialization (C4C) has been very involved and supportive since the start of the projects. There are no financial investors or separate company; instead, funding comes from UW, some unspent grant funds that were about to expire, and a substantial Google "Academic Education Grant" ($60K). Our first customer is the "US Army Engineer Research and Development Center", which just started a support/license agreement to run their own SMC internally. We don't currently offer a SaaS product for sale yet -- the options for what can be sold by UW are constrained, since UW is a not-for-profit state university. Currently users receive enhancements to their projects (e.g., increased RAM or disk space) in exchange for explaining to me the interesting research or teaching they are doing with SMC.
The longterm plan is to start a separate for-profit company if we build a sufficient customer base. If this company is successful, it would also support fulltime development of Sage (e.g., via teaching buyouts for faculty, support of students, etc.), similar to how Magma (and Mathematica, etc.) development is funded.
In conclusion, in response to Michael Monagan, you are wrong. And you are right.