In the previous lesson, you learned the basics about servers and clouds, and you got some experience setting up an EC2 instance. In this lesson, you’ll learn a little bit more about how a server can augment your GIS. You will also set up a new instance running Esri ArcGIS Server, which you will use in Lessons 2 through 4.
ArcGIS Server is just one part of Esri's ArcGIS Enterprise product suite that they market for sharing GIS across the web and internal organizational environments. We'll discuss ArcGIS Enterprise and Portal from time to time in later lessons; however, we are going to focus on ArcGIS Server here. Fortunately, the ArcGIS Server piece is relatively easy to get running in the cloud, and we'll concentrate on ArcGIS Server to understand how maps and GIS datasets you make on your desktop are exposed across the web.
Although you could potentially install ArcGIS Server on your own home or work computer, in this course, you will run ArcGIS Server on the Amazon Elastic Compute Cloud (EC2). Basically, you pay Amazon an hourly fee to run ArcGIS Server on their machines. This is an easy way to practice with a real server without compromising or adjusting your own machine. Running ArcGIS Server on Amazon EC2 also helps you learn about the cloud by using it.
At the successful completion of this lesson, you should be able to:
ArcGIS Server is Esri software that allows you to expose your GIS as a set of web services. It is just one component in a larger software suite called ArcGIS Enterprise that enables organizations to deploy their GIS onto the web. In Lesson 5, we'll talk more about the different parts of ArcGIS Enterprise.
Web services are software code or components that run on a specialized machine called a server. Web services receive requests from other apps and machines, called clients. The request might be to send some information or process some data. GIS web services do things related to sending and processing geographic information. Here are some examples of types of web services offered by ArcGIS Server.
ArcGIS Server works through the concept of distributed computing, in which you can increase the power of your server by adding more physical machines. For this reason, ArcGIS Server is made up of several different components that you can either install all on one machine or spread out among many machines. We won’t examine these components in much detail in this course, because you will have ArcGIS Server installed for you, and you will only be using one machine. However, below is a brief introduction of the most common components.
When you run a program on the Windows operating system, it runs as a specific user account and can only do things that the account can do. This is why you sometimes see Windows popping up messages that the program needs Administrator permissions to continue. That type of message means that the account running the program is not an administrator, so you need to manually confirm that it should temporarily be allowed to do something that only an administrator would ordinarily be allowed to do.
ArcGIS Server uses an account to run the GIS server, called the ArcGIS Server account [3]. This account is specified during the ArcGIS Server installation. You won't do much with the ArcGIS Server account in this course because it comes preconfigured when you run ArcGIS Server on Amazon EC2.
If you run ArcGIS Server in your own organization, you need to remember to give the ArcGIS Server account permission to read any GIS data used by the server. The account also needs permission to write to any datasets you will edit.
In this class, you’ll work with your own ArcGIS Server that runs on Amazon EC2. It has a GIS server and the ArcGIS Server account already configured. After logging in to your server, you’ll publish some map services and use them in a web app that you create. You’ll also learn techniques for speeding up your map services, using a tile cache, and how to use a map service for web editing.
This lesson gets you to the point of setting up a server, publishing a service, and making a simple web map on ArcGIS Online.
In the previous lesson, you used the AWS Management Console to set up an EC2 instance. When you build an ArcGIS Server site on Amazon EC2, you typically use a different approach, in our case, a resource called Cloud Formation [4]. This consists of a text file (a template) that pre-defines all of the parameters of the site you intend to build on the AWS platform and that can be deployed to install everything in an unattended manner. Existing Cloud Formation templates can be customized to deploy the precise system you need. Esri has developed Cloud Formation templates that are already set up to do the heavy lifting of installing ArcGIS Enterprise in AWS, leaving us to provide only a few parameters.
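To make the idea of a template as "a text file that pre-defines parameters" more concrete, here is a minimal sketch in Python that builds a tiny template and prints it as JSON. It is not the Esri template, which is far larger; the resource name GisServer, the default instance type, and the description are assumptions made up for illustration.

```python
import json


def build_template(instance_type: str = "m5.xlarge") -> dict:
    """Return a minimal CloudFormation-style template as a Python dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative single-instance stack (not the Esri template)",
        # Parameters are the values you supply when you launch the stack.
        "Parameters": {
            "InstanceType": {"Type": "String", "Default": instance_type},
            "ImageId": {"Type": "AWS::EC2::Image::Id"},
        },
        # Resources are the AWS objects that get created for you.
        "Resources": {
            "GisServer": {  # hypothetical logical name
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "InstanceType": {"Ref": "InstanceType"},
                    "ImageId": {"Ref": "ImageId"},
                },
            }
        },
    }


template_text = json.dumps(build_template(), indent=2)
print(template_text)
```

The point of the sketch is only the shape: a Parameters section for the few values you provide, and a Resources section describing what should be built from them.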
It's possible to build simple one-machine ArcGIS Server sites manually with the AWS Management Console. You can even put several of these "siloed" sites under a load balancer to get more computing power. However, to get the full benefit of the ArcGIS Server architecture, in which multiple GIS servers process and balance loads in a peer-to-peer fashion, Cloud Formation Templates are the way to go.
Getting access to the ArcGIS Enterprise AMIs
Cloud Formation uses some Esri-created Amazon Machine Images (AMIs) behind the scenes to create your ArcGIS site. These AMIs have ArcGIS Server, ArcGIS Pro and in some cases, a database installed on them.
The AMIs require that you "bring your own license" and apply it to any Esri software that you run on the EC2 instances. In other words, Esri pricing is not built into the hourly fees for the instance, like it is with Windows. The Esri AMIs are accessible by anyone in the AWS Marketplace, but you must log in with your Amazon account and accept the terms and conditions for using them.
This doesn't actually launch anything right now; it simply establishes that you agree to the terms of using the particular AMI. If you don't perform this step and accept the terms, however, Cloud Formation will fail when you try to create a site. In fact, if you ever experience Cloud Formation failures in the future, you should check to make sure you have accepted the software terms for the exact AMIs that you are trying to use. There's nothing else you need to do on the AMI Marketplace page.
Security Requirements for ArcGIS Enterprise
Recent versions of ArcGIS Enterprise and Server require that all communications be performed over a secure channel. This means that anyone making a request for a map service or web app from your ArcGIS Enterprise/Server machine must do so using the https protocol rather than traditional http. You may have noticed that many websites you visit now appear with an https URL. Https uses Secure Sockets Layer (SSL), or more precisely its modern successor, Transport Layer Security (TLS), to encrypt all traffic that is sent between clients and the web server. In this way, any text that's sent, including passwords, usernames, and other content, is protected from hackers who might try to intercept or monitor it. Implementing SSL on a web server is good practice, which is why many websites and web services are utilizing it.
Enabling SSL on a web server isn't a trivial process, however, and it requires that an SSL Certificate be obtained and installed. SSL Certificates are issued by authoritative providers that verify the identity of your web server and provide an assurance that the communication channel clients establish with the server is properly encrypted. It makes sense that only authorized providers issue SSL Certificates; otherwise, anyone could generate them and deploy them improperly. Further complicating this process is that SSL Certificates are attached to the fully-qualified domain name rather than the IP address of a web server.
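As a small illustration of the client side of this arrangement, the sketch below inspects the default settings that Python's standard ssl module applies to https connections. It makes no network request; it only shows that a typical client insists on a trusted certificate and on a certificate name that matches the domain name it asked for.

```python
import ssl

# The default context holds the settings a typical Python https client uses.
context = ssl.create_default_context()

# The server must present a certificate signed by a trusted authority.
requires_certificate = context.verify_mode == ssl.CERT_REQUIRED

# The name on the certificate must match the host name the client requested.
checks_host_name = context.check_hostname

print("Certificate required:", requires_certificate)
print("Host name checked:", checks_host_name)
```

This is why a certificate issued for one domain name cannot simply be reused on a server that answers to a different name.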
Every web server has an IP address, which has the form xxx.xxx.xxx.xxx, that uniquely identifies it on the Internet, but clients typically don't use that number to communicate with it. Instead, clients (like you in your web browser) use a fully-qualified domain name to call a server. A fully-qualified domain name is the host name portion of the URL you would enter to visit a website, for example, www.pasda.psu.edu [6] or www.arcgis.com [7]. Domain names are linked to IP addresses using a registry called DNS (Domain Name System). Anyone wanting to attach a domain name to their server's IP address must make a request to a DNS server. This request is performed by authorized Internet service providers.
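The lookup from domain name to IP address can be sketched with Python's standard library. The two helper names below (resolve_ipv4 and is_ipv4) are made up for this illustration; the calls inside them are standard.

```python
import ipaddress
import socket


def resolve_ipv4(host_name: str) -> str:
    """Ask the system resolver (and so DNS) for the IPv4 address of a host name."""
    return socket.gethostbyname(host_name)


def is_ipv4(text: str) -> bool:
    """Return True when text is a dotted IPv4 address such as 192.0.2.10."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False
```

With an Internet connection, calling resolve_ipv4("www.arcgis.com") should return whatever address DNS currently holds for that name, and is_ipv4 distinguishes such an address from the name itself.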
So, to enable SSL on our ArcGIS Enterprise/Server machines, we need to do two things: (1) assign a unique, fully-qualified domain name to our Elastic IP in DNS, and (2) generate and install an SSL Certificate that refers to our domain name. To facilitate the setup of our ArcGIS machines in AWS, I have performed these steps for you. I assigned you a domain name in the form namegeog865####.e-education.psu.edu and registered it in DNS by linking it to the Elastic IP you created in Lesson 1. I also generated SSL Certificates for you using the same domain name I assigned you. With that done, installing and configuring these on your ArcGIS machines is trivial using the Cloud Formation Template; all you need to do is reference your domain name and SSL Certificate in the template, and Cloud Formation does the rest.
In this part of the lesson, you'll use Cloud Formation to create an ArcGIS Enterprise site on Amazon EC2.
Before we proceed to create a new EC2 machine instance for Enterprise, I recommend that we terminate the instance and storage you created in Lesson 1. We won't use that machine or its storage subsequently, so we may as well remove it and not incur any more potential costs.
To simplify the Cloud Formation installation, we will upload a few config files to an S3 Bucket, from which the template can access them. You will refer to them later as you customize the template parameters.
Your new machine instance is now set up and ready for you to log into and start working with ArcGIS Server.
Debugging Resources:
If you receive an error in the CloudFormation Event page, you may see information about which step in the process caused the issue; the error may appear in red text on the stack page. The Event logs in CloudFormation sometimes aren't too helpful, however. This is because the error often occurs after CloudFormation has successfully created your EC2 Instance, while the ArcGIS software is being configured on the machine instance itself. The CloudFormation Event logs don't report specifics about errors encountered on the EC2 Instance; rather, those errors are logged in files saved on your EC2 instance. To view those logs, check to see if your EC2 Instance was created and still appears in your AWS Management Console. (If it is not there, repeat the CloudFormation process, being sure that the "preserve successfully provisioned resources" option is set to True.) If it is there, proceed to create your Windows username and password and use Remote Desktop to log into it. On your EC2 virtual machine, open a File Explorer and use the View - Options - Change Folder and Search Options settings to be sure you can see protected operating system files, file extensions, hidden folders, etc. The log folder that ArcGIS generates is hidden by default.
Browse to C:\cinc and open arcgis-enterprise-primary.log in a text editor. You'll see entries with their respective timestamps as they occurred during the install. Scroll through the entries in chronological order until you encounter one with a Warning or Error indicator. That should indicate what the issue was. It is very common to enter the name of a license file, domain name, or anything else incorrectly in the CloudFormation template. The log file in C:\cinc usually provides information we can use to deduce where the error/typo occurred. If you are unable to interpret the error logs and find the culprit, feel free to send the log file to me and we will get to the bottom of it.
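If the log is long, scrolling for the first problem entry can be tedious. The following is a minimal sketch of the same search in Python. The exact wording of the Warning and Error indicators in the log is an assumption here, so adjust KEYWORDS to match what you actually see in the file; the function names are made up for illustration.

```python
from typing import Iterable, Optional, Tuple

# Assumed markers; change these to match the indicators in your log file.
KEYWORDS = ("warn", "error")


def first_problem(lines: Iterable[str]) -> Optional[Tuple[int, str]]:
    """Return (line number, text) of the first line mentioning a warning or error."""
    for number, line in enumerate(lines, start=1):
        if any(word in line.lower() for word in KEYWORDS):
            return number, line.rstrip()
    return None


def first_problem_in_file(path: str) -> Optional[Tuple[int, str]]:
    """Scan a log file on disk, reading it in chronological (top to bottom) order."""
    # errors="replace" keeps the scan going if the log contains odd characters.
    with open(path, encoding="utf-8", errors="replace") as log:
        return first_problem(log)
```

On the EC2 instance you could call first_problem_in_file(r"C:\cinc\arcgis-enterprise-primary.log") and then open the log at the reported line number to read the surrounding entries.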
Now that you have an ArcGIS Server site running, let's take a quick tour to give you a feel for what's there.
You should now have a good feel for what's running on your ArcGIS Server site and the settings available there. The next item of business is to log into the EC2 instance itself and move some data there. This will allow you to publish your own web services on the ArcGIS Server site.
Now that your site has been created and started, you can get ready to log in to the instance and start working with your software. Some of these steps will be similar to what you did in Lesson 1, but please follow them closely.
The password rules are fairly stringent; please see them in the image in Figure 2.1, below.
The following paragraph talks about disabling IE enhanced security on your EC2 machine. An alternative to doing that is to simply install the Google Chrome browser on your EC2 machine and use it instead of Internet Explorer. You may use Internet Explorer to browse to the Google site to download and install Chrome.
As a security precaution, it's usually not a good idea to go around browsing the web from your production server machine. To do so is to invite malware intrusions onto one of your most sensitive computers. The operating system on your instance, Windows Server 2012, enforces this by blocking Internet Explorer from accessing most sites. This is called IE Enhanced Security Configuration (ESC).
IE ESC gets burdensome when you're using the server solely for development or testing purposes like we are. To smooth out the workflows in this course, you'll disable IE ESC right now and leave it off for the duration of the course.
Remember that if you are going away for more than an hour, you should stop your instance using the AWS Management Console. (Only stop your machine Instance. Leave your storage volume(s) and Elastic IPs as they are. Deleting them may require that you completely rebuild your virtual machine.)
ArcGIS Server on Amazon EC2 comes preconfigured with some running services and data. These can help you understand how the server works and they're also a good way to verify that your server is running correctly. Let's take a few minutes to look at these items.
Now that you've seen what's preconfigured on your server, you'll learn a little more about how you can copy your own data onto the instance and start your own mapping web service.
One of the most challenging aspects of moving to a cloud deployment is transferring data from your local (on-premises) environment onto the cloud. In this section of the lesson, we'll look at special problems that arise in data transfer scenarios. We'll also discuss ways data can be moved to Amazon EC2, and you'll copy some GIS data to your own instance in preparation for publishing a web service.
For your data to go from your machine to commercial cloud services such as Amazon EC2 or Amazon S3, it must go "across the wire", meaning it is transferred through the Internet onto the cloud-based server. This can pose the following issues:
Let's examine these problems one at a time.
GIS data collections can be very large: up to terabytes in size. This is often the case when imagery is involved, but even vector datasets with a broad amount of coverage or detail can prove unwieldy for an Internet transfer.
When moving large datasets to the cloud, you have to plan for enough time to move the dataset and, if possible, increase your bandwidth. After doing a test transfer of a few hours or days, you should be able to get an idea of the rate of data transfer, and you can thereby extrapolate how long it would take to transfer the entire dataset.
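The extrapolation is simple arithmetic: divide the amount moved in the test by the time it took to get a rate, then divide the full dataset size by that rate. Here is a short sketch; the dataset size and test figures are made-up numbers for illustration only.

```python
def estimate_transfer_hours(total_gb: float, sample_gb: float, sample_hours: float) -> float:
    """Extrapolate total transfer time from a timed test transfer."""
    if sample_gb <= 0 or sample_hours <= 0:
        raise ValueError("The test transfer must have a positive size and duration.")
    rate_gb_per_hour = sample_gb / sample_hours
    return total_gb / rate_gb_per_hour


# Hypothetical case: a 2,000 GB dataset, and a test that moved 10 GB in 2 hours.
hours = estimate_transfer_hours(total_gb=2000, sample_gb=10, sample_hours=2)
print(f"{hours:.0f} hours, or about {hours / 24:.1f} days")
# prints "400 hours, or about 16.7 days"
```

The estimate assumes the rate stays constant, which it rarely does exactly, so treat the result as a rough planning figure.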
If this amount of time is unreasonable (say, months) you may consider shipping the data directly to the cloud provider on a piece of hard media. The cloud provider can then load the data directly onto the cloud much faster than you could send it over the Internet. Amazon provides such a service called AWS Snowball [8]. You load up your data on a ruggedized secure device called a "Snowball" and ship it to Amazon. In the old days of computing this technique was called "sneakernet", since you could sometimes put your data on a floppy disk and walk it across the office to another computer faster than you could send it electronically.
Cloud-based data centers like Amazon's are built to handle high levels of data traffic coming in and out. However, your connection going out to the cloud may be limited by a slow connection or lack of available bandwidth. Some IT departments and internet service providers (ISPs) throttle or cap the amount of data that can be transferred from any one machine or node in the network. These types of policies are sometimes put in place to prevent the use of file-sharing services such as BitTorrent that violate company policy or simply monopolize the organization's available bandwidth. However, sometimes these policies can negatively affect legitimate business needs such as transferring data to the cloud. If you find yourself in a situation with low bandwidth, it might be helpful to visit with your IT department to understand if your machines are being throttled and could be granted an exception. If an exception is not possible due to other bandwidth needs within the company, you might explore whether your data transfer could occur during off-hours such as nights or weekends.
Confidential or proprietary datasets, such as health records, may require extra security measures for transfer to the cloud. When dealing with sensitive data, the first question to answer is whether it is legal or feasible for the data to be hosted in the cloud in the first place. For example, some government organizations responsible for national security may possess classified or secret data that could never be uploaded to Amazon's data centers no matter the measures taken to ensure secure data transfer. Also, some organizations may not have the desire or permission to host datasets on servers that are physically located in a different country.
Other types of datasets may be okay to host on the cloud but must be encrypted during transfer, to prevent a malicious party from using any data that may be stolen en route to the cloud server. Secure socket layer (SSL) connections (HTTPS) and secure FTP are two techniques for encrypting data for Internet transfer.
Sometimes the ability for one computer to directly "see" or communicate with another computer is hindered by firewalls or network architectures. For example, your computer at work is probably allowed to only access the file systems of other computers on your internal network. You could potentially open up a folder on your Amazon EC2 instance for access by anyone but this opens a security risk that malicious parties could find the folder and copy items into it.
There are a number of strategies that people use to get around these limitations when transferring data into Amazon EC2 and other cloud environments. These include:
The ArcGIS Server on Amazon EC2 help has an overview of data transfer techniques. Please take some time right now to read Strategies for data transfer to Amazon Web Services [9].
In this part of the lesson, you'll copy some data to your EC2 instance in preparation for publishing a web service. Before you attempt these steps, you should be logged in to your EC2 instance through Windows Remote Desktop Connection. If you followed the steps earlier in the lesson for connecting via Remote Desktop then your local disk drives should be available to the instance.
For simplicity in this course, you'll follow the workflow of transferring all data to your EC2 instance, working with ArcGIS Desktop on your EC2 instance, and publishing to ArcGIS Server on your EC2 instance. Theoretically, you could do most of the desktop work on your own computer and then publish up to the server when you were ready. However, any time you introduce separate computers into the architecture, especially on different networks (in the case of your home computer and your EC2 instance), things can get more complicated. Because you have a limited time available to learn about ArcGIS Server, I want you to spend the time experimenting with the capabilities of the server, not worrying about network issues or which machine contains the data.
However, in large organizations, these challenges of distributed architectures are inevitable. Some GIS shops might have a GIS server administrator who controls access to ArcGIS Server, and a number of cartographers and desktop GIS users who just prepare the maps for publishing. This latter group of "publishers" work on machines that are separate from the server and may even reside on a different subnet than the server. In some cases, the publisher machines and the server machines use different copies of the data that are kept in sync by an automated process, and the paths to the data used by the publishers may be different than the paths used by the server.
To help manage these scenarios, ArcGIS has the ability to "register" a data location, meaning that you provide ArcGIS Server with a list of data locations you typically use. If the publishers use a different path to the data than the server uses, you can provide both the paths. Then, when you publish a service, the map is copied to the server and all the paths in the map are switched to use the server's path instead of the publisher's path.
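The path switching can be pictured as a lookup table from publisher folders to server folders. The sketch below is only an illustration of that idea, not how ArcGIS Server implements it; the function name, the D:\gisdata publisher folder, and the file name trail.shp are all hypothetical.

```python
from pathlib import PureWindowsPath
from typing import Dict, Optional


def to_server_path(publisher_path: str, registered: Dict[str, str]) -> Optional[str]:
    """Rewrite a publisher path to the server's path if its folder is registered."""
    path = PureWindowsPath(publisher_path)
    for publisher_root, server_root in registered.items():
        try:
            relative = path.relative_to(PureWindowsPath(publisher_root))
        except ValueError:
            continue  # this registered folder does not contain the path
        return str(PureWindowsPath(server_root) / relative)
    return None  # not under any registered folder


# Hypothetical registration: publishers see D:\gisdata, the server sees C:\data.
registered = {r"D:\gisdata": r"C:\data"}
server_path = to_server_path(r"D:\gisdata\AppalachianTrail\trail.shp", registered)
print(server_path)
```

In this sketch, a result of None stands for a path that falls outside every registered location.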
This can be a difficult concept to grasp with just a verbal explanation, so please take a few minutes to read the help topic registering data on ArcGIS Server [11]. This has some diagrams of different situations where data registration can be particularly useful. It is one of the most important help topics for ArcGIS Server.
Please note that if you try to publish a service and ArcGIS Server does not find any of the data paths in your map in its list of registered folders and databases, the data will be packaged up and copied to the server [2] at the time you publish. The copying ensures that no data paths will be broken in the published service. This automatic data copying is a useful feature in some scenarios where the publishers do not have the rights to log in to the server machine, but it is not an appropriate workflow for managing large amounts of data. The best approach is to make sure you set up workable data locations on the publisher's machine and the server machines, and then carefully register those locations with ArcGIS Server. In some cases, like ours, the publisher's machine and the server machine will be viewing the same path to the data.
Follow the steps below to register your C:\data folder with ArcGIS Server:
Now you're ready to publish a map web service using your Appalachian Trail dataset that you placed in C:\data. You'll do this in the next section of the lesson.
In the previous part of this lesson, you copied a map document to your EC2 instance. However, that map is still only available inside ArcMap on your instance. Now you'll take the step of publishing the map as a web service so that it can be used by anyone.
Whenever you publish a service, you begin the process in ArcMap, having opened the map document that you would like to publish. You run an analysis process on the map to find anything that might prevent it from being drawn by ArcGIS Server's drawing engine. You then set service properties and publish the service.
When you publish a service, you are giving the server a set of things that it can do with a particular map. In order for this to be useful to anyone, the client application and the server need to be able to communicate with each other in a way that both understand. There are several ways that an ArcGIS Server map service can allow itself to communicate with client applications.
Representational State Transfer (REST) allows a client to discover information about a service or invoke operations on a service using a known structure of URLs. REST is not really a communication protocol, but rather an architecture; a way of building a web service so that it has a hierarchy of resources and operations that can be accessed by formulating the correct URL.
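To show what "formulating the correct URL" can look like, here is a minimal sketch that assembles a REST URL from a hierarchy of resources plus a query parameter. The helper name rest_url is made up for this illustration; the resulting address is the same Pennsylvania municipalities layer listed in the links for this lesson [13].

```python
from urllib.parse import urlencode


def rest_url(server_root: str, *resources: str, **query: str) -> str:
    """Join a REST hierarchy (folder, service, type, layer) into one URL."""
    path = "/".join(part.strip("/") for part in (server_root, *resources))
    return f"{path}?{urlencode(query)}" if query else path


url = rest_url(
    "https://mapservices.pasda.psu.edu/server/rest/services",
    "pasda", "PennDOT", "MapServer", "10",  # folder, service, service type, layer
    f="pjson",  # ask for the response as easy-to-read JSON
)
print(url)
```

Each step down the hierarchy is just another segment of the URL, which is what lets a client explore a server by following links.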
The actual bits of information sent "across the wire" can vary in format, but JavaScript Object Notation (JSON) is often used. JSON is desirable because of its well-known structured format and the fact that it can compact information into a minimal number of characters.
Here's an example of some JSON that describes a Pennsylvania municipalities map service [13]. Take a few moments to examine all the properties exposed in this JSON. This is actually an easy-to-read format of JSON with extra line breaks and spaces called "pretty JSON." Removing the spaces to get pure JSON makes it more difficult [14] for you to read, but reduces the information that the computer has to read and can, therefore, make your web service more efficient.
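The difference between the two forms can be reproduced with Python's standard json module. The small record below is made up for illustration; a real service description contains many more properties.

```python
import json

# A small made-up record standing in for a service description.
layer = {"id": 10, "name": "Example layer", "type": "Feature Layer"}

pretty = json.dumps(layer, indent=2)                 # extra line breaks and spaces
compact = json.dumps(layer, separators=(",", ":"))   # no optional whitespace

print(pretty)
print(compact)
print(len(pretty), "characters versus", len(compact))
```

Both strings decode to exactly the same data; only the whitespace, and therefore the number of characters sent, differs.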
REST is stateless, meaning that any one request cannot depend on information sent in a previous or future request. All requests are independent of each other. This requirement can make for some interesting architectural considerations. For example, to support an interactive web editing session with REST, you must send an entire digitized feature to the database at once; you cannot send the feature vertex by vertex as it is digitized.
Because of REST's simplicity and efficiency, the Esri web mapping APIs for JavaScript, Flex, and Silverlight communicate with ArcGIS Server web services using REST.
Each GIS web service has its own specific purpose. It may support analysis performed inside an organization, or it may be intended to be used by anyone on the web. In this lesson, we'll assume that the Appalachian Trail service you just published is intended to be used by anyone on the web to explore and use in their own maps.
So, how could someone use your trails service in their own web map? A programmer could put the URL of your service directly into web app code and then write appropriate code to display the map. That's a topic for a different course, and ultimately writing code is something that many people cannot or will not do. In this part of the lesson, you'll use the ArcGIS Online map viewer, an interactive web map designing tool, to see how you can put together several services into a web map.
You might say that the ArcGIS Online map viewer is "running on the cloud". It is software as a service (SaaS), meaning you don't have to install any software in order to use it. When you save maps on ArcGIS Online, they are not saved to your computer, rather they are saved on an Esri server. You can come back and work with your maps from any computer as long as you tell the application who you are by logging in.
To perform this exercise, your Amazon EC2 instance must be running, but you can do the steps on your local computer.
https://namegeog865.e-education.psu.edu/server/rest/services/<your service name>/MapServer
https://namegeog865.e-education.psu.edu/server/rest/services. You should see a hyperlink on the rest services page for your map service. Click it, and note the URL in the browser. This is what you can copy and paste into the ArcGIS Online map above.
So, what good is this map that you've made? As mentioned above, if you have a permanently running server with a permanent address, you might choose to save your map and share it with the public. People could then search for and view the map in ArcGIS.com. Another way the map can be used is by web app developers. Each map saved on ArcGIS.com is assigned an ID. Esri has designed their web programming frameworks (APIs) for JavaScript, Flex, and Silverlight such that a developer can just reference a map ID in the code, rather than building the map "from scratch".
For this week's assignment, create a new document and insert the following:
When you are finished working on this lesson, remember to stop your Instance in the AWS Management Console.
How cloud computing services are defined and used is a key part of understanding cloud computing and foundational knowledge in this course. However, cloud computing definitions vary from source to source. For this week's discussion assignment, I'd like you to look back at the NIST definition of cloud computing [16] we are using in this course, and also read Chapter 2 of The Cloud at Your Service (available here as a preview from the publisher [17]). Then, compare Rosenberg and Mateos' definition of cloud computing with the NIST-based definition.
Links
[1] http://server.arcgis.com/en/server/latest/publish-services/windows/overview-register-data-with-arcgis-server.htm
[2] http://server.arcgis.com/en/server/latest/publish-services/windows/copying-data-to-the-server-automatically-when-publishing.htm
[3] http://server.arcgis.com/en/server/latest/administer/windows/the-arcgis-server-account.htm
[4] https://aws.amazon.com/cloudformation/
[5] https://aws.amazon.com/marketplace/pp/prodview-rh32a6tw3ju4a?ref_=srh_res_product_title
[6] http://www.pasda.psu.edu
[7] http://www.arcgis.com
[8] http://aws.amazon.com/importexport/
[9] http://server.arcgis.com/en/server/latest/cloud/amazon/strategies-for-data-transfer-to-aws.htm
[10] https://www.e-education.psu.edu/geog865/sites/www.e-education.psu.edu.geog865/files/data/AppalachianTrail.zip
[11] https://enterprise.arcgis.com/en/server/latest/manage-data/windows/overview-register-data-with-arcgis-server.htm
[12] https://baxtergeog865su22.e-education.psu.edu/server
[13] https://mapservices.pasda.psu.edu/server/rest/services/pasda/PennDOT/MapServer/10?f=pjson
[14] https://mapservices.pasda.psu.edu/server/rest/services/pasda/PennDOT/MapServer/10?f=json
[15] http://www.arcgis.com/features/index.html
[16] https://www.e-education.psu.edu/geog865/cloud_introduction
[17] https://www.manning.com/books/the-cloud-at-your-service