Open Source vs. Proprietary
What is open source?
The Open Source Initiative (OSI) defines 10 criteria that a piece of software must meet to call itself truly open source. They boil down to allowing you to do anything you like with the executable program, and pretty much anything you like with the code, with the exception that you may not rip off the author by passing their code off as your own. It's important, however, to distinguish between open source systems and 'free' systems. Quite a lot of software and services today provide some sort of 'free' tier, but in exchange you are not permitted to do anything with the underlying code (or data), and in many cases you are effectively giving up your personal information for that company to use for profit. Google Maps, for example, is widely perceived to be 'free' in that way: it doesn't take money from you directly, but your interactions with the service are captured and re-used to generate profits for Google. It's also definitely not open source. You have no ability to access the source code, or to modify or re-use their maps beyond what they explicitly permit.
The full set of criteria that define open source according to the OSI may sound complex, but have you ever read the ArcGIS License? It is pretty typical of commercial licenses. It runs to 7 pages (plus 9 pages of footnotes!) and seems to boil down to: you can use one copy of the software for a period of time, and if it destroys your computer (and/or business) you may be able to recoup the cost of your license. (Of course, we're not lawyers, so you should check with one before agreeing to any license.) If you are curious whether open source has any impact on the job market, it does. Directions Magazine summarized the ways in which open source is affecting the geospatial job market and what employers look for. If there are any current openings, you can also see an example of an Open Source Geospatial Analyst job ad right here at Penn State in our Donald W. Hamer Center for Maps and Geospatial Information.
Licenses
The OSI lists 9 "popular" licenses as well as dozens of other less well known licenses for open source software. It's a good idea to be aware of some of the most common options so that you know what terms you are accepting when you use them, and to think through potential decisions your organization might make when deciding to release geospatial system code as an open source project.
- GNU General Public License (GPL) and GNU Lesser (formerly "Library") General Public License (LGPL) - the GPL is a widely used license that is often described as "viral": if you distribute a product that incorporates GPL-licensed code, you are required to release that product and its source code under the same conditions. This makes most commercial software companies quite leery of using any GPL software in their projects. Interestingly, it also lends itself well to a dual-licensing model, whereby the copyright holder offers the code for free under the relatively restrictive GPL, or for a fee under a license without the GPL restrictions. The LGPL avoids these restrictions when you are simply linking the LGPL code to your program.
- MIT License - this is one of the most relaxed licenses in general use. You are permitted to do almost anything you like with the code, including sublicensing and selling it, as long as the original copyright and permission notice stay with it.
- Apache License - this is a more lawyerly license (so, it may appeal to your legal department). It is a permissive license, so it does not require derivative works to be released under the same license, but you must clearly mark the files that you changed, include a copy of the license, and keep all attributions and copyright notices that came in the original work. It also includes an explicit patent grant from contributors.
- Eclipse Public License 2.0 - the EPL 2.0 is designed to be business-friendly and to allow significant additions to the original work (as opposed to basic derivatives of it) to be licensed independently, including under a proprietary license.
One other important thing to note is that open source licenses generally do not place restrictions on *use* of the software, focusing instead on what users might do with the source code. This is a potential advantage over commercial options, where you don't normally have access to the source code and there are normally significant restrictions on program use (install only once per user, for example).
Source Code & Accountability
You may be wondering what all the fuss is about in the above section if you have never needed to write your own programs and/or don't see a need to develop a new program based on someone else's code.
Source code is the set of human-readable instructions that ultimately allows your computer (or server) to actually do something. Most programs are written in a high-level language (we'll dive into languages in a future lesson), which is then compiled into machine code (the actual 1s and 0s) that tells the computer what to do. With proprietary software, you normally receive only the compiled code. So, if you want to make changes to the program or just inspect the algorithms used to see if they are correct, examining the machine code is like trying to work out how many eggs went into an omelet. With open source software, you can usually access the source code directly (or at least have a pointer to where to find it). This allows you to go back to the original human-readable instructions and see exactly how an algorithm is implemented. If you or your organization don't have the skills to do this, you can hire an expert to do it for you. If you find a problem, you can potentially fix the code, recompile it, and then contribute your changes back to the open source project for others to benefit from.
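As a rough illustration of the gap between readable source and compiled output, here is a minimal Python sketch. Two caveats: Python compiles to bytecode rather than true machine code, so this is only an analogy, and the `flood_depth` function is a hypothetical example that is not taken from any real tool.

```python
# Hypothetical source code: easy for a person to read and review.
SOURCE = '''
def flood_depth(surge_height_m, ground_elevation_m):
    """Water depth at a location: surge height minus ground elevation, never negative."""
    return max(0.0, surge_height_m - ground_elevation_m)
'''

# Compile the human-readable text into a code object (Python bytecode),
# then run it so the function becomes available.
module_code = compile(SOURCE, "<example>", "exec")
namespace = {}
exec(module_code, namespace)
flood_depth = namespace["flood_depth"]

# The compiled form of the function is a run of raw bytes. This is the
# kind of thing you are left with when you do not have the source.
compiled_bytes = flood_depth.__code__.co_code

print("Source:")
print(SOURCE)
print("First compiled bytes:", compiled_bytes[:16])
print("Depth for a 3.2 m surge over 1.0 m ground:", flood_depth(3.2, 1.0))
```

Reading the source, you can see at a glance that negative depths are clamped to zero. Recovering that same rule from the bytes alone would take specialist tools and a lot more effort.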
Imagine a scenario in which you are modeling floodwater extents for hypothetical future hurricane storm surge events. Let's say you find a program that helps you do it, but its source code is not available. If a decision maker questions the output, it would then be hard to verify the final results. Are you sure it's working correctly? A lot of proprietary tools get around this issue by providing documentation that explains the underlying math (for example, Esri documents how Getis-Ord Gi* works), but you'd still need to trust that they implemented things correctly, or test with a known sample dataset, because you cannot go poke around their source code yourself.
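To make the "test with a known sample dataset" idea concrete, here is a minimal sketch of the Gi* statistic in plain Python, following the commonly published formula. It is not Esri's implementation (which we cannot see, and that is the point), and the five-feature dataset and the `gi_star` function name are hypothetical. The sample is small enough to check by hand: for the last feature the numerator is 9.6 and the denominator is 4.8, so its Gi* score should come out to 2.0.

```python
import math


def gi_star(values, weights):
    """Return the Getis-Ord Gi* z-score for every feature.

    values  -- list of n attribute values
    weights -- n x n list of lists; weights[i][j] is the spatial weight
               between features i and j (for Gi*, a feature counts as
               its own neighbor, so weights[i][i] is non-zero)
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum(v * v for v in values) / n - mean ** 2)

    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wij * wij for wij in w)
        weighted_sum = sum(wij * xj for wij, xj in zip(w, values))
        numerator = weighted_sum - mean * sum_w
        denominator = std * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical sample: five features in a row, low values then high values.
values = [1, 1, 1, 9, 9]

# Binary weights: each feature neighbors itself and the features next to it.
weights = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]

scores = gi_star(values, weights)
for value, score in zip(values, scores):
    print(value, round(score, 3))
```

If a tool's output for this same dataset disagreed with the hand calculation, you would know something was off, but without the source code you would still not know why.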
Time & Tradeoffs
Time is another crucial element. Some of you may have had the joy of using scripting languages like AML, Avenue, or ActionScript. All of them are now defunct and ran only inside proprietary frameworks, so for the most part the tools built in those environments can't be used anymore. One potential advantage of open source development is the likelihood that others can pick up, extend, and maintain a codebase over a longer period of time. It's even possible for folks to port code from an older framework to a newer one (though it can sometimes be easier to rewrite from scratch).
That's not to say that there aren't significant tradeoffs associated with open source software. You can still be left holding the bag if an open source project is no longer supported, and you may end up with the painful choice of migrating away from that project or supporting it yourself. Security can be a major issue with unsupported software: your organization may need to build a security layer on top of an open source project to make it usable within your network. We'll talk more this week in our Technology Trend discussion about open source business models and their associated pluses and minuses. In my view, there are great reasons to leverage the best of both worlds, and a lot depends on the size and complexity of the organization you're in, as well as the types of geospatial tasks you have to manage.