GEOG 583
Geospatial System Analysis and Design

Technology Trends: Cloud, Edge, and Fog Computing

Cloud computing is, in some ways, a new term for an old thing: keeping data and computational processes in some central location (historically a mainframe) and accessing them with a less-powerful client, or, in other words, a client-server architecture. However, there is something new going on here in that we can catch some of the glory that is the Internet. Using the Internet as a transport mechanism means that client-server computing can be used in more situations, and more robustly than has previously been possible. If this topic interests you, you might consider looking into our Cloud and Server GIS course.

Here, we will take a look at a few examples of the available technologies for geospatial cloud computing that enable people and organizations without internal computational resources to take advantage of massive data processing and storage systems on an as-needed basis to access and analyze big data quickly. We'll also take a look at the emerging areas known as Edge and Fog computing, which extend/evolve the concepts of Cloud computing. Here are a few questions to think about to jump-start discussion on this topic:

  • How has the Cloud changed the way we conceptualize geospatial system development and applications?
  • What are the implications for traditional geospatial companies and geospatial careers of players such as Google and Amazon moving into geospatial analytics in the Cloud?
  • What impacts will geospatial cloud computing have on society?

Video 1: Cloud OnAir (33:50)

The first video is a somewhat technical presentation by a pair of individuals at Google presenting Cloud OnAir: Add rich geospatial analysis to your toolbox with BigQuery GIS. The presenters, Chad Jennings (product manager for Google Cloud BigQuery) and Soleil Kelley (Product Marketing Manager, Data Analytics Team, Google Cloud), provide the following description for the video content: “New geography data types and GIS functions are now first-class citizens in BigQuery, Google Cloud's serverless, MPP data warehouse. BigQuery GIS provides a convenient and powerful way to incorporate spatial information into your analytics workflows, even as your datasets grow into the petabytes. Whether you're in IoT, telematics, consumer apps, transportation, retail, or manufacturing, tune in to learn the basics and see how easy it is to get started.” Not surprisingly, given that one presenter is from the marketing arm of the company, the video is kind of an infomercial for the product – but it does provide a lot of detail. If you want more technical detail or want to access the tools, check out Google Cloud's Introduction to BigQuery Geographic Information Systems (GIS).

Transcript of the Cloud OnAir: Add rich geospatial analysis to your toolbox with BigQuery GIS video

CHAD JENNINGS: Hello, everyone. Welcome to Cloud OnAir. These are live webinars put on every Tuesday by Google Cloud Platform. My name is Chad Jennings, and I'm a product manager for BigQuery.

SOLEIL KELLEY: Nice, Chad. My name is Soleil Kelley. I'm a product marketer on the data analytics team here at Google Cloud. And today, we're going to talk about geospatial analysis and BigQuery, in particular, doing that with BigQuery GIS. Do note you can ask questions at any time on the platform. We have Googlers on standby to answer your questions.

So our goal today is really a couple fold. We want to, A, introduce you to the BigQuery GIS service if you're not yet familiar with that, and, B, get you up and running so you can add GIS into your tool box as an analyst. And let's jump right in.

So spatial context really matters from our perspective. Maps provide this really unique type of context. And when we think of context, think of the who, what, when, where, whys, and hows. In particular, that where-- when you add your data and you put it on a map, it just instantly becomes real. It becomes relatable as a human. And that's really powerful to help you make better business decisions.

It, second of all, gives you this rich extra set of data at your disposal, and there's a nice benefit there because when you put your data on that map and you join it with other things, you instantly have context in physical space, right? Those maps connect that physical space to that very, sometimes, intangible space of data analysis on a computer connected to the cloud, right?

And so yes, maps really matter from the human perspective and also from the business perspective. So take, for example, this map in Denver. I might be a retailer in this particular location in Denver, and I would want to analyze where my customers are, or the different demographics from different neighborhoods in the Denver area. And maybe I'm considering opening up a new store in a different neighborhood. I would also want to factor in other competitors in the area and think about different physical geographies that are present in this space as well.

And when I actually overlay my demographic analysis on top of the map as opposed to just looking at tables of data, I instantly just have more context. I can see where the freeways are that might be gridlocked during the time I want my customers to come to my store. I can see physical geographic boundaries, like bodies of water or mountain ranges, and things of this nature. So really useful to be able to contextualize your data in maps, for sure.

Now, if you have just a small set of data, you can imagine almost doing this analysis by hand. If you had just a few hundred customers and they're all coming from a certain neighborhood, easy enough. But in our age of information and the accessibility we have to just beaucoup data from all the open data sets, from weather to census data, and other sources, you can really go pretty wild from an analytics perspective. Think about all the fun things you can do with space.

And for that, you might be thinking about if you had millions of customers, let's just hypothetically say you wanted to join that with some of those bigger, bigger data sets, you might need a little bit more horsepower to actually conduct that GIS analysis. And for this, we're incredibly excited to have launched BigQuery GIS into beta a few weeks back here in September.

And really, what this means for analysts, especially those that are already using BigQuery, is that hey, we're bringing the GIS functionality right directly into the data warehouse. You no longer would need to export your data elsewhere and bring it into a purpose built GIS application, right? All that core functionality is right there on top of your data and accessible right there.

And what's really unique and different about this service is combining this cloud GIS functionality with the raw power of BigQuery and its massively parallel processing, or MPP, architecture to do big data, to kind of merge big data and GIS in the cloud. And that's super exciting. We're really excited to bring that to market. And Chad being product manager for BigQuery, he's going to dive into a little bit more details on this, so.

CHAD JENNINGS: Yeah, and I just wanted to say, though, that that intersection of big data and GIS is the thing that we're really excited to address in the marketplace. So with the launch of this, BigQuery is the only cloud MPP enterprise data warehouse to support GIS data types and functions as first class citizens. And so we're really proud of that.

And the engineering team has worked on this actually in conjunction with the Earth Engine team, so we use the same computational libraries under the covers that power things like Google Maps, Earth Engine, Google Earth. So it's really bringing a lot of really awesome Google assets to our customers, which is fun.

SOLEIL KELLEY: [INAUDIBLE]

CHAD JENNINGS: All right, but first off, you may have a question-- what is this BigQuery thing anyway? Let's start there.

SOLEIL KELLEY: Sure.

CHAD JENNINGS: So BigQuery is Google Cloud's enterprise data warehouse. So you interact with it in SQL, and it is serverless and fully managed. And what that means is you don't have to mess with spinning up nodes or spinning up clusters. We handle all of that for you. And so what that means is you bring your data, you bring your workloads, you load them both, you press the button that says Run Query, and we spin up all the compute resources and storage resources that you need. That's what fully managed means to us.

This product scans-- goes super big and super fast. Our largest customers have hundreds of petabytes of storage with us. And our largest queries regularly exceed 20 or almost 30 billion rows. So big data, GIS together, right?

SOLEIL KELLEY: There's that intangible data analytics thing and the computer box thing again, right?

CHAD JENNINGS: Yeah, really, what does 30 trillion actually mean?

SOLEIL KELLEY: I have no idea. Just a lot.

CHAD JENNINGS: So here's what BigQuery GIS actually means, right? We're supporting geographic data types, geographic functions, and we're going to go through these in some detail in just a second here. And then kind of where the payoff happens is we're also launching something called-- or we have launched something called-- BigQuery Geo Viz, which is a lightweight visualization tool to put all of those cool data that you just figured out onto a map.

SOLEIL KELLEY: Yay.

CHAD JENNINGS: All right, so a little context setting-- why bother? Euclid would look at this map and go, that's perfectly fine, right? That is a straight line between two points, shortest distance, that's how [INAUDIBLE] should go, except, as we all know now, right? Euclid didn't know this, but we do. Curved Earth-- the shortest distance between these particular two points, Seattle and Stuttgart-- anyone curious-- is a great circle route.

And to be honest, even though I come from a navigation background, I never really quite got what a great circle route truly meant until I looked at it visualized on Google Earth from space. So Euclid was right. A straight line between two points, that is the right way to go if you want to get there fast. And that is what a straight line looks like on the curved space.

So with geographic data types and functions, we want to honor the curvature of the Earth, right? Seems like a big thing to honor-- and actually do these calculations exactly right. So these are the data types that we're suppointing-- uh, supporting-- suppointing-- supporting-- point, linestring, polygon, all the way down to collections. So it's quite a rich data set. All right, and now Soleil will take us away with the functions.
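As a rough illustration of the curved-Earth point made above, the following Python sketch computes a great-circle distance with the haversine formula. It is not how BigQuery GIS performs the calculation; the function name, the spherical Earth radius, and the approximate city-center coordinates are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; assumes a perfect sphere


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two lat/long points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city-center coordinates, assumed for illustration
seattle = (47.6062, -122.3321)
stuttgart = (48.7758, 9.1829)
print(round(great_circle_km(*seattle, *stuttgart)), "km")
```

Treating latitude and longitude as flat x/y coordinates would give a very different answer over a distance like this one, which is why geography-aware functions matter.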

SOLEIL KELLEY: Yeah, for sure. So we love our SQL verbs, and we've just brought in about 40 new verbs into the BigQuery as first class citizens again and that conform to the PostGIS project spatial type function convention, the ST_. And we have a number of different functions here you can see in the table on the right.

If you are familiar with PostGIS, this will be a walk in the park, and you can maybe grab a super quick glass of water or something. But if you're just getting started and want to just know at a high level what these functions are all about and kind of the things that you can do on your geospatial data, we're just going to dive into that super quick.

Constructors, as the name suggests, these are really about building new geographies from existing coordinates, so say, a lat-long pair, or existing geographies like a couple of polygons or lines and making a collection. So the diagram here demonstrates a set of five different lat-long pairs and making a line out of those, right?

Parsers and formatters-- I mean, obviously, want some interoperability between different formats, and so these are all about creating or exporting geographies into different formats, so from binary to a polygon from GeoJSON to text and things like this. These are the functions that you would use to do that, so that you have a little bit more interop with the other programs.
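To make the constructor and formatter ideas concrete, here is a minimal Python sketch that builds a line from lat-long pairs and writes it out as well-known text (WKT) and as GeoJSON. The helper names are hypothetical and this is plain string handling, not the BigQuery functions themselves. Note that both formats list longitude before latitude.

```python
import json


def to_wkt_linestring(latlng_pairs):
    """Build a WKT LINESTRING from (lat, lng) pairs; WKT puts longitude first."""
    coords = ", ".join(f"{lng} {lat}" for lat, lng in latlng_pairs)
    return f"LINESTRING({coords})"


def to_geojson_linestring(latlng_pairs):
    """Build the equivalent GeoJSON geometry and return it as a JSON string."""
    geometry = {
        "type": "LineString",
        "coordinates": [[lng, lat] for lat, lng in latlng_pairs],
    }
    return json.dumps(geometry)


# Made-up points, for illustration only
path = [(47.60, -122.33), (47.62, -122.32), (47.65, -122.30)]
print(to_wkt_linestring(path))
print(to_geojson_linestring(path))
```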

Transformations, again-- so these are creating new geographies similar to constructors. But they're having the similar properties as their input geographies. So here in the diagram, we've highlighted the centroid function. So if you wanted to find the center of some sort of zip code polygon, you'd use that function to create a point, a lat long set out of that existing geography and many other types of transformations naturally as well.

Predicates-- so, great for filtering. Are the data within this region existing within another region in this particular zip code or something? Yes or no, or true false questions, rather, so great for filtering your geographic data and whatnot.
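A predicate of this kind can be sketched in a few lines of Python. The example below uses the classic ray-casting test on flat (x, y) coordinates, which is an assumption made for simplicity; a geography-aware system works on the curved Earth instead. The function name is hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting containment test on planar (x, y) coordinates.

    polygon is a list of (x, y) vertices; the ring is closed implicitly.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon((2, 2), square), point_in_polygon((5, 2), square))
```

The function answers a yes-or-no question, so it can be used directly as a filter over a list of points.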

Accessors-- so sometimes you just need to know a little bit of the metadata about your geography data. And so for this, we have a number of functions here, like, how many vertices or how many points are there as part of this polygon? So you can ask those types of questions. Is it a point? Is it a line? Is it a polygon? If you just get a whole bunch of geographic data, you could ask those types of questions there.

Measures, as the name suggests, pretty intuitive. But what's the perimeter? What's the area? Distances between points, et cetera. These are real core functionality--

CHAD JENNINGS: Right, not flashy, but important.

SOLEIL KELLEY: No--

CHAD JENNINGS: And here's the flash.

SOLEIL KELLEY: Super, super important. I mean, these are what you immediately think about GIS data, or at least for me. You're asking those questions, like wait, what's the difference between x and y? And these are the questions you often have, but yeah, joins are--

CHAD JENNINGS: So this is where the real magic starts. And doing joins on geographic data sets-- in the demo we'll talk about in a second, we actually join on zip code. But we're actually joining on the zip code integer, right? With these functions, you can actually-- sorry, the integer that measure or that identifies the zip code. With these, you can actually do joins on the geography. Like, find all the points that are in these two data sets, join them together with any of these predicates. So this is where the magic really happens here.
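The idea of joining on geography rather than on a shared key can be sketched as a nested-loop join that accepts any predicate. This is a toy Python illustration with hypothetical names and made-up data; a data warehouse would use far more efficient strategies than comparing every pair of rows.

```python
def spatial_join(left_rows, right_rows, predicate):
    """Nested-loop join: pair up rows whose geometries satisfy the predicate."""
    return [
        (left, right)
        for left in left_rows
        for right in right_rows
        if predicate(left["geom"], right["geom"])
    ]


def point_in_box(point, box):
    """Toy predicate: is the (x, y) point inside the (xmin, ymin, xmax, ymax) box?"""
    x, y = point
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


# Made-up rows, for illustration only
stores = [{"name": "A", "geom": (1, 1)}, {"name": "B", "geom": (9, 9)}]
zones = [
    {"zone": "west", "geom": (0, 0, 5, 5)},
    {"zone": "east", "geom": (6, 6, 10, 10)},
]
pairs = spatial_join(stores, zones, point_in_box)
print([(s["name"], z["zone"]) for s, z in pairs])
```

Swapping in a different predicate (within a distance, intersects, and so on) changes the join condition without changing the join itself.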

SOLEIL KELLEY: Cool.

CHAD JENNINGS: OK. But in terms of eye candy, this is the magic. So this is BigQuery Geo Viz. And you can see from the GIF here that you can compose a query, run a query, and then style the results in a map all interactively. It's a lightweight tool. So this isn't going to handle millions upon millions. It's limited to about 2,000 points. But it solves the use case of I'm an analyst, I wrote a query. Please let me see that on a map, just to make sure that I'm sane or that I got the results that I expected.

SOLEIL KELLEY: Yep. Great for ad hoc exploration. Yep.

CHAD JENNINGS: Exactly. And if you've got more serious mapping needs, you can export a table from BigQuery into GCS and then import that into Earth Engine. And here you use JavaScript, and you can create maps of arbitrary complexity and arbitrary beautifulness. Is that a word?

SOLEIL KELLEY: That should be.

CHAD JENNINGS: There it is. Okey doke. So we'll dive into a couple demos here. And so referencing the example that Soleil talked about earlier, we're going to pretend that we are retail site selectors. And so we have a store. And Soleil, the target demographic of your store is?

SOLEIL KELLEY: 25 to 34.

CHAD JENNINGS: How about 25 to 44 since that's--

SOLEIL KELLEY: 25 to 44. Yeah.

CHAD JENNINGS: Since that's what I prepped.

SOLEIL KELLEY: Yeah.

CHAD JENNINGS: Yeah, let's do that one. OK. All right. So here, let's cut over. Let's cut over to the demo. And so what we--

SOLEIL KELLEY: Oh yeah, what are we looking at here?

CHAD JENNINGS: Yeah, yeah, yeah. So this is the BigQuery Web UI. And what we see here is on the left panel here, there are a bunch of, like, basically asset navigators, right? You can look at your queries. You can look at saved queries, the job history that you've run in this project. You can even look at data sets down here. I'll double click into the BigQuery public data. And there's a whole bunch of stuff. The baseball data set's pretty fun. We're not going to do that one today. Sorry. But--

SOLEIL KELLEY: Unless my customers are baseball folks.

CHAD JENNINGS: Right. Right now, it's a retail shop.

SOLEIL KELLEY: OK.

CHAD JENNINGS: Any case, the left panel here to navigate assets. This window right here is the query composer window. So this is where you write your SQL, and then you get your results back down here in the lower pane. And so what I'm going to do is I'm going to walk you through this query real quick.

So one thing that I like to do a lot in SQL is use these WITH statements to pull parameters up to the top of a query. It just means that if you want to share that query with somebody or if you want to adjust a parameter, you don't have to go searching through lines and lines and lines of SQL to get to it.

So we're going to set parameters. We're going to pick latitude and longitude. All right, that's the center of Seattle. So we'll pretend we're going to put a store there. You could get this very simply by, like, just googling center of Seattle or googling an address, and it'll return you the lat-long. And then we're going to stipulate-- for this one, we're going to stipulate the radius as 1 mile.

And then this set of code pulls all the zip codes within that area, so it uses this ST_DWithin, so it creates a point from the latitude and longitude, and then makes short, and then it looks at the zip area latitude and longitude and finds all of the zip codes that are 1,609 meters or within 1,609 meters.

SOLEIL KELLEY: Happens to be one mile.
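The query shown on screen is not reproduced in the transcript, so the following Python sketch only approximates its shape: parameters are pulled up into a WITH clause, the radius is converted from miles to meters, and ST_DWITHIN filters zip codes by distance from a point built with ST_GEOGPOINT. The table name and the column names (zipcode, latitude, longitude) are assumptions and should be checked against the actual public data set before use.

```python
METERS_PER_MILE = 1609.344


def build_zip_radius_query(lat, lng, radius_miles):
    """Return SQL text selecting zip codes within radius_miles of a point.

    The table and column names are assumptions for illustration only.
    """
    radius_m = radius_miles * METERS_PER_MILE
    return f"""
WITH params AS (
  SELECT ST_GEOGPOINT({lng}, {lat}) AS center, {radius_m:.0f} AS radius_m
)
SELECT z.zipcode
FROM `bigquery-public-data.utility_us.zipcode_area` AS z, params
WHERE ST_DWITHIN(ST_GEOGPOINT(z.longitude, z.latitude), params.center, params.radius_m)
""".strip()


# Approximate coordinates for central Seattle, assumed for illustration
print(build_zip_radius_query(47.6062, -122.3321, 1))
```

Keeping the location and radius in one place means that changing the study area, or widening the radius to 15 miles, touches only the function arguments.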

CHAD JENNINGS: One mile, then. That's right. Thank you. The next set of this code is where we're going to pull the stats. And so this table that we're looking at is actually available in public data set inside of BigQuery. It's called, no surprise, population by zip 2010. And so what this code is doing is simply adding up the population totals from these different demographic buckets. This data set only has age and gender.

If you look at the US census page or the American Community Survey, the Fact Finder page is really useful for this. They have many, many, many more demographic buckets, but we'll focus on these. And then at the end, we're just going to pick all of those zip codes, the zip code stats, and the zip code geometries, and we're going to pull them out into a single table. And so run the query.

And it was cached, right? The 0.017 is-- it used the cache. I prepped this ahead of time because I didn't want to burden you all with watching the query run. But what you can see here is this table. So here's 98154. This is actually a tiny little zip code that's just for the purposes of the US Post Office, so no people live in it. But you can see these are the populations, and then here is the polygon. And that polygon string is totally parsable by human readers. And you look at that, and you're like, oh, -122.333564, yeah, that's downtown Seattle, right?

SOLEIL KELLEY: I can see it.

CHAD JENNINGS: Right, no. Nobody does that. So what we really want to do is we want to see that on a map. So let's walk through how BigQuery Geo Viz works. So I've actually prepopulated this one as well. And you can see that I've increased the radius to 15 miles. It's exactly the same query. So I've copied from the composer window and pasted into the BigQuery Geo Viz window, and run the results to get this map over here.

And what's cool about this tool is that you can style interactively. So the fill color for this choropleth, I have chosen to be population of 65 plus. But if I wanted to change that, I could in real time. And as a matter of fact, we're going to dive in to the north end of Seattle here. You know what? That's a little bit opaque for my taste. I'm going to lighten the fill opacity. I'm going to make it 0.5. There, that lightens it up a little bit. It's a little easier to see.

Go back to fill color, and we're going to look at a couple of different demographics. So demographic number one--

SOLEIL KELLEY: There's my 25 to 44, thank you.

CHAD JENNINGS: Right on. So let's see where your target demographic is. We'll change the range a little bit since they're--

SOLEIL KELLEY: Yeah, the max is 21.

CHAD JENNINGS: Yeah.

SOLEIL KELLEY: 21,000 there, got it.

CHAD JENNINGS: Yep.

SOLEIL KELLEY: Mhm.

CHAD JENNINGS: And so what you're going to see is if we zero in on this zip code, so 98103, there's a concentration here. And there's a dearth of your target demographic in 98199. So don't put your store out there.

SOLEIL KELLEY: Nope.

CHAD JENNINGS: But what I wanted to point out here was let's go ahead and have a look at-- well, let's expand it. Let's say you were looking for college age students. And then this zip code here lights up. What's there?

SOLEIL KELLEY: There happens to be a university there.

CHAD JENNINGS: Right.

SOLEIL KELLEY: And we haven't even changed the range, but you can see it's actually a similar range set there, but--

CHAD JENNINGS: Oh, right. Yeah, I can do that.

SOLEIL KELLEY: Very high concentration there of college age students.

CHAD JENNINGS: Right. Yep. So no surprise, the University of Washington lights up. And then if we look at the 65 plus, and I adjust the range here to 7,000, you can see the populations are starting to move not just north, but out, right? So this is a flight from the urban center, I suppose. And folks in this demographic are moving away from the city center. OK.

So what we've shown here is the ability to do geospatial analysis, and then style and map and visualize and map in real time. If you wanted to share this with folks, right, you can just take a screenshot or share the query.

SOLEIL KELLEY: Yeah, you can even make that nice and big.

CHAD JENNINGS: Oh, yeah. There we go. So whoops, sorry, we can zoom out and see the entire extent.

SOLEIL KELLEY: Yep.

CHAD JENNINGS: Okey doke. So geospatial information is useful partly because it's not focused in any one particular area. So I ran this query again for New York. And so again, this is the 0 to 24 demographic. And if we just click over to this other tab here, you can see that-- and sorry.

This one-- let's see. The styling-- this one is the 65 plus. I'll switch back and forth between these tabs. You can see that these neighborhoods out here in the south of New York get quite a bit darker. So what that means is folks are kind of moving out to the beach as they get tired of the city life.

OK. So that's interesting. So now we've given these retail site selectors the tools to go ahead and look at different areas and see what the demographics are. To be honest, zip codes are pretty coarse grained geometry for this analysis. You'd rather use census tracts or census blocks. Again, you can get those from the American Community Survey page. Go to the Download tab. That's where you can download census tracts or zip codes for the entire United States and bring that into BigQuery.

We're going to look at this last query here because this is kind of the summary table. So, same kind of construction. So with stats by zip code, this is actually the same code as before. And what we're going to do here is we're going to run those stats for a collection of radii. So we're going to create a summary table. Show me the list of people that live 1, 10, 20, 50 miles from my chosen location. And again, we're using the BigQuery GIS functions to construct that filter. All right, and then here's the resulting table.
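The summary table described here can be sketched in Python as a sum of zip code populations within each of several radii. The function name, the input layout, and the numbers are all made up for illustration; the demo itself does this work in SQL with the GIS functions.

```python
def population_within_radii(zip_rows, radii_miles):
    """Sum the population of zip codes lying within each radius.

    zip_rows: list of (distance_miles, population) tuples.
    Returns a dict mapping each radius to its total population.
    """
    return {
        r: sum(pop for dist, pop in zip_rows if dist <= r)
        for r in radii_miles
    }


# Made-up distances and populations, for illustration only
rows = [(0.5, 1200), (4.0, 30000), (12.0, 45000), (35.0, 80000), (70.0, 10000)]
print(population_within_radii(rows, [1, 10, 20, 50]))
```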

And now this is not GIS, but it is super convenient. You look at this button here called Explore in Data Studio. You can actually click on that, and that will materialize the results in a Data Studio session. Let's go ahead. So let's see. The dimension we're going to use is R. So that's the radius. We'll get rid of that one. And then, let's see. We'll do population.

SOLEIL KELLEY: While Chad is pulling in these different demographic groups as well, [INAUDIBLE] something we just launched into GA last week, which is super exciting. And this particular functionality that integration between BigQuery and Data Studio, that one-click UI experience is something that we launched earlier this summer at our Cloud Next event. And it's been a very popular feature. Our customers have been asking about it. And it was really nice to be able to deliver that, and people have been responding real well to that. It's been fun.

CHAD JENNINGS: Excellent.

SOLEIL KELLEY: And it's just super quick for-- just like we had the BigQuery Geo Viz application to be able to quickly explore your geospatial data, you can do the same thing here with summary tables, with other data to quickly visualize it in Data Studio.

CHAD JENNINGS: Yeah. And so with just a few clicks and a little bit of clumsy dragging of these little tickets, you can create a chart. You can then save it, copy it to report, and share it with folks, and they can interact with your query as well. So anyway, we wanted to get that one out. We've got one more demo to talk about, which is actually a totally different persona.

So now we're done with being real estate moguls or retail moguls. Now we're city planners from Chicago. And so our customer, Geotab, actually built this application. And so what you're seeing here is a map of hazardous driving behavior. And you might naturally ask yourself the question, well, how does Geotab know anything about hazardous driving application? Great question. Thanks for asking.

Geotab is an asset tracking company and a telematics company. And so for example, FedEx-- not FedEx-- UPS--

SOLEIL KELLEY: UPS.

CHAD JENNINGS: All the UPS trucks have a box about yay big in their truck. And that box measures location, velocity, acceleration, plus a host of other variables.

SOLEIL KELLEY: Temperature.

CHAD JENNINGS: Exactly. So Geotab actually has an incredibly rich data set collected by 1.2 million vehicles running around the country. I think their data set, their daily intake is about 3 billion points. All of that gets stored inside of BigQuery, which you'll find out soon is a very convenient place to do it.

So what the map shows here is areas in Chicago that register hazardous driving behavior. And that's characterized by extreme amounts of acceleration, either forward and back or lateral. And what this left panel is going to do is we're actually going to combine a few different technologies here in just a few clicks. So we have BigQuery GIS, which is going to call out the points from Chicago. We have obviously Google Maps.

And we have BigQuery ML, which is going to actually-- they've actually trained a model to predict hazardous driving behavior, i.e., those accelerations, based on weather data. Now, the next question is, like, where did you get the weather data? Another excellent question. Like, he's awesome. NOAA actually hosts weather data in BigQuery. And so joining your data with weather data is literally only a join away because it's all hosted in the same backend storage.

All right. So here, that's enough context setting. Let's get to it. So we're going to dial up some weather conditions. So I'm reducing the temperature. So we're going to make it winter. Reduce the visibility. I'm going to order a snow storm, and then we're going to pretend it's the holiday season, and we're going to bump up the traffic volume. And then I just click Run Predictive Analysis, and we get a map that's a lot hotter than the original one. OK, that's interesting.

SOLEIL KELLEY: Makes sense, although colder because it's winter. It's hotter in terms of hazardous driving behavior.

CHAD JENNINGS: I totally get you. Any case, we're going to look in at one of these because as the traffic planner for the city of Chicago, like we want to investigate these points and see what we can do. And in particular, there's one that we're going to dive into right here. Oh, here it is. Because it is just down the street from a school. So being elected officials, right, we're going to prioritize safety of constituents. We're going to prioritize safety of vulnerable constituents, focus on hazardous driving around schools.

SOLEIL KELLEY: Yep, new efficiency through the network of streets for--

CHAD JENNINGS: Yeah, we just--

SOLEIL KELLEY: --everybody, you know?

CHAD JENNINGS: We want to keep our kids safe. All right, so what's going on here? So it's interesting that there's hazardous driving behavior in inclement weather. But what's going on? So this is where we get a little bit of a benefit from being in the Google ecosystem. And we're going to drop the Street View avatar into the scene.

And so we're going to turn around. So here's the school. And then I'm going to move just a little bit east, and look at what's here. So I'll make this screen a bit bigger for you, and we'll zoom in just a touch. It's a bike rack.

SOLEIL KELLEY: There you go.

CHAD JENNINGS: Right next to an alleyway. So agreed, I dialed up a winter day, but maybe some kids are riding their bike down this alleyway and causing some kind of traffic congestion or traffic issue here. Let's spin around and see what's going on. I don't see any traffic signage. Right? So just down here at the end of this picture is where that hazardous driving behavior was occurring. But there's no traffic signage here. Maybe the right remediation is to preposition some sand. Maybe the right remediation is to put a stop sign in here.

SOLEIL KELLEY: Either a crosswalk, yeah.

CHAD JENNINGS: Or a crosswalk, yeah, good point. But what we've enabled here is our city planner can now scan the entire city using GIS, and now BigQuery, Google Maps, Street View, public data, all without leaving their chair. And that is pretty darn cool. Oh here, I'll take it out of the full screen mode. All right. Anyway, so thank you for going on that little journey. So we did that demo. We just finished this one. I skipped ahead in the slides a little bit. And--

SOLEIL KELLEY: Yeah, so this is obviously amazing things that you can do using this functionality-- tons more resources we wanted to arm you with. First of all, if you want to get started in BigQuery, you can just go right to the Cloud console. That's the first link there. That particular BigQuery Geo Viz application for just visualizing your SQL queries on that map is the-- there's the link there for you as well.

Tons of documentation, very thorough, all the functions that I went through, those are all detailed out one by one in the documentation, which is really helpful. We also have a link for all those public data sets that we host in BigQuery. That's in our GCP marketplace.

You can also, of course, go find any of the open data sets out there and bring them into your particular projects as well. And then too, we have a Stack Overflow topic here on Google BigQuery with the GIS particular tag. And this is where we'll post-- Chad will post all these queries sometime in the next few days.

CHAD JENNINGS: Yep.

SOLEIL KELLEY: So that you can play around with those. Again, you saw how the facility of just putting in a lat-long pair for that center point of your retail ring study. But you could conceivably do that at any location that you might want to explore.

CHAD JENNINGS: Let me speak specifically. If there are people watching who are either in retail, or in television, or radio ratings, then these queries are very readily extensible to census tracts, census blocks, or DMAs. So we didn't go to that extent because we just wanted to keep things simple here. But if that's your industry, then use those as your template. And you can do these analyses and show them on BigQuery Geo Viz.

SOLEIL KELLEY: Great. Thanks. Well, that's what we had to show for today. Thank you, Chad, for walking through those demos. Those are super cool. And folks, everybody stay tuned for live Q&A. We'll be back in just a couple of minutes to cover those. Thank you.

Great. Welcome back, everybody. So we're here now for the live Q&A portion of our webinar. And we got several great questions from the audience. So we'll just kind of dive into those and do one at a time. So first off, I have a few ESRI shapefiles I'd like to use. Super, super common. Can I use them with BigQuery GIS? The answer is yes, although you would need to convert that shapefile format into either well known text or some of the other formats that we can then bring in, right?

CHAD JENNINGS: Yeah. We got this question a lot right after we launched the alpha. And so we actually have-- and we'll put it up in the Stack Overflow topic-- we actually have a document that one of our colleagues wrote that details exactly how to do-- what's the right tool to use and how do you bring it in. But essentially Soleil is right. You have to convert the shapefile into the formats that we support. And then you can look at them, just like we did in the demos.
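
Illustrative sketch (not part of the video): the conversion step Chad describes ends with geometries in a text format such as well-known text (WKT). The code below is a minimal sketch of that last step only. It assumes the shapefile has already been read into GeoJSON-style dictionaries by some other tool, which is not shown and is not the tool referenced in the video, and it handles only 2D points, lines, and polygons.

```python
def _coords_to_text(points) -> str:
    """Format a sequence of (x, y) positions as a parenthesised WKT list."""
    return "(" + ", ".join(f"{p[0]} {p[1]}" for p in points) + ")"


def geometry_to_wkt(geom: dict) -> str:
    """Convert a GeoJSON-style geometry dict to a WKT string (2D only)."""
    kind = geom["type"]
    coords = geom["coordinates"]
    if kind == "Point":
        return f"POINT({coords[0]} {coords[1]})"
    if kind == "LineString":
        return "LINESTRING" + _coords_to_text(coords)
    if kind == "Polygon":
        # A polygon is a list of rings: the outer boundary, then any holes.
        rings = ", ".join(_coords_to_text(ring) for ring in coords)
        return f"POLYGON({rings})"
    raise ValueError(f"Unsupported geometry type: {kind}")


point_wkt = geometry_to_wkt({"type": "Point", "coordinates": [-73.98, 40.75]})
polygon_wkt = geometry_to_wkt(
    {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}
)
print(point_wkt)
print(polygon_wkt)
```

Once a table holds WKT strings like these, BigQuery can parse them into its geography type with `ST_GEOGFROMTEXT`.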

SOLEIL KELLEY: Next question, does BigQuery GIS have geocoding capabilities?

CHAD JENNINGS: Yeah, so we rely on the Google Maps APIs for this. So BigQuery itself does not, but in a different part of your program, you can call an API and augment the table that you're looking at with the geocoded data. Or you can call any external API as part of a Dataflow job. So you can read out of BigQuery column through a Dataflow job, then call that API, augment the data, and then write that back into BigQuery.
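
Illustrative sketch (not part of the video): the read, geocode, and write-back pattern Chad outlines can be sketched as a small enrichment function. Everything here is an assumption for illustration: the rows are plain dictionaries rather than a BigQuery table, and `fake_geocode` is a hypothetical stand-in for a real geocoding API call, which would need credentials, rate limiting, and error handling.

```python
from typing import Callable, Dict, List, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


def augment_with_coordinates(
    rows: List[Dict],
    geocode: Callable[[str], Optional[Coordinates]],
) -> List[Dict]:
    """Return copies of rows with 'lat' and 'lon' added from a geocoder."""
    augmented = []
    for row in rows:
        result = geocode(row["address"])
        # Addresses the geocoder cannot resolve keep empty coordinates.
        lat, lon = result if result is not None else (None, None)
        augmented.append({**row, "lat": lat, "lon": lon})
    return augmented


def fake_geocode(address: str) -> Optional[Coordinates]:
    """Hypothetical stand-in for an external geocoding API."""
    known = {"350 5th Ave, New York, NY": (40.7484, -73.9857)}
    return known.get(address)


rows = [
    {"id": 1, "address": "350 5th Ave, New York, NY"},
    {"id": 2, "address": "unknown place"},
]
enriched = augment_with_coordinates(rows, fake_geocode)
print(enriched)
```

In the workflow described in the video, the same three steps would run inside a Dataflow job, with the enriched rows written back to a BigQuery table.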

SOLEIL KELLEY: Got it. I see. Great. For visualizations and mapping, what other BI tools can I connect BigQuery to? So BigQuery has, I mean, a number of native connections to different BI tools, like Tableau, and Looker, and Qlik, and things like this, specifically for mapping and connecting to your BigQuery GIS functions. So long as those tools support custom SQL queries and so long as they can render geographic data types, they should be able to leverage this technology. And one thing as well is you probably want to bring that into a GeoJSON format, right, to do that.

CHAD JENNINGS: Yeah. So Tableau is an example of a Viz tool that you'll want to use the custom queries to leverage BigQuery GIS. And then Looker actually supports BigQuery GIS.

SOLEIL KELLEY: OK, wonderful. Cool. Next question, does BigQuery GIS support 3D geometries and measure values xym or xyzm? I'll let you take that.

CHAD JENNINGS: Oh, I actually don't know the answer to this one.

SOLEIL KELLEY: The answer is no.

CHAD JENNINGS: Oh.

SOLEIL KELLEY: We don't support the z measure.

CHAD JENNINGS: Thanks for that one.

SOLEIL KELLEY: Hey, these are questions that people are having, you know. Does BigQuery GIS come with geospatial statistical data? For instance, personal map data that I can use to join with my business data.

CHAD JENNINGS: Oh, that's a great question. So, not really. So BigQuery GIS comes with BigQuery. BigQuery comes with Google Cloud Platform. And inside Google Cloud Platform are a whole host of public data sets. And we went through some of these, or I showed you a very small subset of the list. If there is some public data sets that have some of that geospatial statistical data that you want-- like, I do happen to know that zip code land and water areas are there, things like that-- you might be able to find them in the public data sets. If not, then you're going to have to import them using the procedure that we'll publish in that Stack Overflow topic.

SOLEIL KELLEY: Next question looks like we've already addressed with respect to geocoding. For our final question there, for non-programmers working with large amounts of data, is there a resource for query language to facilitate pulling or geolocating data? So I would just for this, if you're not a programmer, I would just direct your attention to the BigQuery documentation for GIS. There are a ton of resources there. Again, it's all fully outlined, all of the different functions that are there. There are several tutorials as well. We'll post some links to the Stack Overflow there as well.

CHAD JENNINGS: Yeah, but--

SOLEIL KELLEY: And anything you want to just add to that?

CHAD JENNINGS: Yeah. Yeah, I do, definitely. So as we all know, right? We work with geospatial data. And for those that have know that there are data sets all over the place and in all sorts of different formats. And aside from shapefiles, which is I suppose a bit of an industry standard, but there's just like GDB, MDB. There's a whole lot of stuff out there. We don't have a SQL verb that says, like, go get this data set and bring it in. However, it's a really good idea, and we've already written it down.

But what you will have to do is if you're a non-programmer, have a look at the resources. Again, this article about pulling in other types of data formats into BigQuery, and then the process to copy that over requires a couple lines of code, but you can literally copy and paste from the article and put that into the console. And the directions there are clear enough that even I was able to get it right on the very first time.

SOLEIL KELLEY: Amazing.

CHAD JENNINGS: I was not the author, too, so I was actually testing.

SOLEIL KELLEY: Great. Well, thanks, everybody. That concludes our Q&A portion. Do stay tuned. We have another webinar following this one. It's called Visualize 2030. This is about a data storytelling contest that Google Cloud is hosting around the UN's sustainable development goals. And that'll be coming up in just a few moments live from New York City. Thanks again, everybody.

CHAD JENNINGS: Outstanding.

SOLEIL KELLEY: My name's Soleil.

CHAD JENNINGS: I'm Chad, and happy mapping.

SOLEIL KELLEY: Woohoo.

[MUSIC PLAYING]

Video 2: Andy Fang, DoorDash | AWS Summit New York 2019 (16:10)

The second video highlights a successful startup company built on Amazon Web Services (AWS). It presents an interview with Andy Fang, co-founder and CTO of DoorDash, recorded at the AWS Summit New York 2019.

Click here for a transcript of the Andy Fang, DoorDash | AWS Summit New York 2019 video.

NARRATOR: Live from New York, it's The Cube, covering AWS Global Summit 2019. Brought to you by Amazon Web Services.

STU MINIMAN: Welcome back. I'm Stu Miniman with my co-host, Cory Quinn. And we're here at the AWS Summit in New York City, where I'm really happy to welcome to the program a first-time guest but somebody that has an app that's on my phone. So Andy Fang, who's the CTO of DoorDash, gave a great presentation this morning. Thanks so much for joining us.

ANDY FANG: Absolutely. Happy to be here, guys.

STU MINIMAN: All right. So before we dig into your Amazon stack, bring us back. You talked about 2013. Your mission of the company to help empower local businesses.

ANDY FANG: Yes.

STU MINIMAN: I think most people know DoorDash [INAUDIBLE] delivery from my local businesses, whether that is a small place or Chipotle or the like there. And I love a little anecdote that you said. The founders actually did the first few hundred deliveries. But give us a little bit of the breadth, the scope of the business now.

ANDY FANG: Absolutely, yeah. When we started in 2013, we started out of a dorm on the Stanford campus. And like you said, we were doing the first couple hundred deliveries ourselves. But fast forwarding to today, we're obviously at a much, much different level of scale. And I think the one thing that I mentioned about my keynote is we've been trying to keep up pace and more than doubling as a business every year. And it's a really fascinating industry that we're in in the on-demand delivery space in particular. Dara, the CEO of Uber, himself said in May-- which is a month and a half ago-- he said that the food delivery industry may become bigger than the ride hailing industry someday.

STU MINIMAN: So just one quick question on food delivery. Because I think back when I was in college. I worked at a food truck. It was really well known on campus. And there are people that 20 years later, they're like, Stu, I remember you serving me these sandwiches, and I loved it. And in the community, we'd gather and we'd talk. Today on campus, nobody goes to that place anymore, because maybe I know my delivery person more than I know the person that's making it.

So I'm just curious about the relationship between the local businesses and the people, how that dynamic is changing in the gig economy. You guys are right in the thick of it.

ANDY FANG: It's a great question. I think for merchants, a lot of the things that we talk to them about is you're actually getting access to customers who wouldn't even have walked by your store in the first place. And I think that's something that they find to be very captivating. And it shows in the store sales data when they start partnering with DoorDash.

But we've also started building our products to really get customers to interact with the physical neighborhoods they're in. The most concrete example of that is we launched a product called the In-Store Pickup product where you can order online, skip the line, and can pick up the order yourself in the store.

And I think the way we can build the app experience around that-- you can actually start building a geospatial browse experience for customers with the DoorDash app, which means that they can get a little bit more familiarity with what's around them as opposed to just kind of looking at it on their phone themselves.

STU MINIMAN: All right. So the logistics of this are not trivial. You talked about 325% order growth. Your database is billions of rows, just a massive scale, massive transaction. Therefore, you're an app. And at the scale you're at, technology is pretty critical to your environment. So bring us inside that a little bit.

ANDY FANG: Yeah. We're fortunate enough-- and you and I were talking before the show-- we're kind of born on the cloud. And we started off actually on Heroku back in 2013. We adopted AWS back in 2015. And there's just so many different services that Amazon Web Services has been able to provide us. And they've added more over time. I think the one that I talked about was one that actually came out only in early 2018, which is the Aurora Postgres product. And we've been able to scale our databases, scale up our analytics infrastructure. We've also used AWS for things like real time data streaming. They have the CloudWatch product where it gives us a lot of insight into how our servers are behaving.

And so the AWS ecosystem in and of itself is kind of evolving. And we feel like we've grown with them, and they're growing with us. And so it's been a great synergy over the past couple of years, for sure.

STU MINIMAN: As you take a look at where you started and where you've wound up, can you use that to extrapolate a little bit further as far as what shortcomings are you seeing today that ideally would be better met by a cloud provider? Or at this point, is it such a simpatico relationship, as you just alluded to, where you just see, effectively, you're continuing to grow in similar directions just out of, I guess, happenstance?

ANDY FANG: Yeah. It's a good question. I think there are some shortcomings. For example, AWS just recently launched MSK, which is their in-house Kafka solution. We're looking for something that's a lot more vetted. So we're considering do we adopt AWS version, or do we try to do it in-house, or do we go with a third-party vendor that's--

STU MINIMAN: Confluent is hard to say no to these days.

ANDY FANG: Yeah, exactly. And I think we want to make sure that we are building our infrastructure in a way that we feel confident and can scale. With Aurora Postgres, it's done wonders for us. But we've also been one of the pioneers, AWS, for scaling up product. And I think we got kind of lucky in some ways there in terms of how it's been able to pan out.

But we want to make sure-- the stakes are a lot higher for us now. And so when we have issues, millions of people face issues. And so we want to make sure that we're being more thoughtful about it. AWS certainly has matured a lot over the past couple of years, but we're keeping our options open. And we want to do what's best for our customers. And AWS, more often than not, has a solution, but sometimes we have to consider other solutions and consider the fact that AWS may or may not solve some of the future problems we're facing.

CORY QUINN: Oh, yeah. I think that what's easy to overlook sometimes with something like a food delivery service-- it's easy to make jokes about it. Oh, what, you're too lazy to cook something? And sure, when I was younger, absolutely. Then I had a child. And when she wasn't going to sleep when she was a baby, and I only had one hand, how do I feed myself? There's an accessibility story. People aren't able to easily leave the house. So it's not just people aren't able to get their wings at the right time. This starts to become impacting for people. It's an important need.

ANDY FANG: Yeah. And I think it's been awesome to see just how quickly it's been adopted. And I think another thing about food delivery that people don't necessarily remember about today is it was primarily just a very dense urban area phenomenon. Obviously in Manhattan, where we are today, food delivery has existed forever. But the suburbs is where the vast, vast majority of the growth in the industry has been. And it's just awesome to see how this use case has flourished with all different kinds of people.

STU MINIMAN: I have to imagine there's a lot of analytics that are going on for some of these. As you said, in the rural areas, the suburban areas, you've got-- it's not as dense. And how do you make sure you optimize for the people that are doing so? So what are some of the challenges you're facing there, and how is technology helping?

ANDY FANG: Exactly, yeah. I mean, with our kind of a business, it's really important for us to get into the lowest levels of detail, right. Just because we're growing 325% year on year in 2019, maybe we're growing faster in certain parts of the United States and growing slower in others. And that's definitely the case.

And so one of the awesome things that we've been able to leverage from our cloud infrastructure is just the ability to support real time data access. And our business operators across Canada and the United States, they're constantly trying to figure out, hey, how are we performing relative to the market in our particular locality, meaning not just the state of New York, but Manhattan, which district in Manhattan? All that matters with a business like ours where it's such a hyper-local economy.

And so I think the real time infrastructure, particularly with things like with Aurora, the fast replicas, we're able to actually get a lot of read hits to these replicas because it's not affecting our write volume. So that's been really powerful, and it's allowed our business operators to just really run and sprint.
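
Illustrative sketch (not part of the video): the benefit Andy describes comes from sending read traffic to replicas so that it does not compete with writes on the primary. The router below is a minimal sketch of that idea. It is not DoorDash's code or an Aurora API, the endpoint names are hypothetical, and treating any statement that starts with SELECT as a read is a deliberate simplification.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Send reads to replicas in round-robin order and writes to the primary."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        if not replicas:
            raise ValueError("At least one replica endpoint is required")
        self.primary = primary
        self._replica_cycle = itertools.cycle(replicas)

    def endpoint_for(self, sql: str) -> str:
        """Pick the endpoint that should run this SQL statement."""
        # Simplification: only statements beginning with SELECT count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replica_cycle) if is_read else self.primary


# Hypothetical endpoint names, used for illustration only.
router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
targets = [
    router.endpoint_for("SELECT * FROM orders WHERE city = 'Manhattan'"),
    router.endpoint_for("SELECT COUNT(*) FROM orders"),
    router.endpoint_for("INSERT INTO orders (id) VALUES (42)"),
]
print(targets)
```

Dashboards used by business operators are read-heavy, so under this kind of split they can query as often as they like without slowing down order writes.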

STU MINIMAN: So Andy, I have to imagine data is one of the most important things of your business. How do you look at that as an asset? Is there new things that-- new services that you can be putting out there, both for the merchants as well for the customers?

ANDY FANG: Absolutely. I think one of the biggest ones we try to do is-- we never give merchant direct access to the customer data because we want to protect the customer's information. But we do give them insights as to how they can increase their sales and target customers that haven't used them before.

So one of the biggest programs we've launched over the past few years is what we call Try Me Free. So merchants can actually target customers who've never placed an order from their store before and offer them free delivery for their order from that store. So that's a great way for merchants to acquire new customers. And it's a simple concept for them to understand.

And over time, we definitely want to be able to personalize the ability to target these sorts of promotions. And so we have a lot of data to do that. And we also have data in terms of what customers like and what they don't like in terms of their order behavior, in terms of how they're rating the food, the restaurants. So that kind of dynamic is something that is a pretty interesting dataset for us to have. You look at other local companies out there, like Yelp and Google Maps. They don't actually have verified transaction information, whereas we do. So I think it's really powerful for merchants actually have that to make decisions off of.

CORY QUINN: It's a terrific customer experience. And it almost seems, to some extent, to be aligned with Amazon's professed customer obsession leadership principle to some extent. And the reason I bring that up is you mentioned you started on Heroku, and then in 2015, migrated off to AWS. Was it a difficult decision for you to decide first to effectively go all in on a single provider, and secondly, to pick AWS as that provider?

ANDY FANG: It wasn't a hard decision for us to go to a cloud provider that was ready for showtime. I'd say Heroku is more of a student project kind of scale at that time. I don't know what they're doing today. But I think AWS at that time was still very, very dominant. I think we were considering Azure, and GCP, I think, was kind of becoming a thing back then.

AWS was always the most mature. And they've done a great job of keeping their lead in this space. Google and Azure have cropped up. Obviously, Oracle Cloud is coming up too. We've considered the capabilities of something like Google Cloud. Their machine learning services are really powerful. They actually have a really sophisticated-- probably more so than AWS. The Kubernetes service is actually more sophisticated. I guess it's built in-house at Google, so that makes sense.

But yeah, we've considered the landscape out there. But AWS has served a lot of our needs up to this point. And I think it's going to be a very dynamic industry with the cloud space. And there's so much at stake for all these different companies, and it's fascinating to just be a part of it and kind of leverage it.

STU MINIMAN: Yeah. So Andy, I'm guessing when you look at some of your peers out there, and when a company files an S1, and everybody goes, oh, my god, look at their cloud bill, how do you look at that balance? You said in your keynote this morning you have, like, less than a handful of engineers working on the data infrastructure. So that line item of cloud, I'm guessing, is non-trivial from your standpoint. So how do you look at that internally, and how do you make sure you keep control and keep flexibility in your options, yet focus on your core business and not the infrastructure piece of it?

ANDY FANG: That is such a great question because it's something that we get-- we think about that tradeoff a lot. Obviously in the early days, what really mattered, ultimately, is do we have product market fit and do we have something that people care about. So optimizing around costs obviously was not prudent earlier on. Now we're at such a large scale, and obviously the bill is very big, that optimizing the cost is a very real thing.

And part of what keeps you satisfied with staying on one provider is the ease of the setup and what you already have configured there. And we've done optimizations over the years. We have folks on our finance team now who is basically looking at, hey, where are our areas where we're being extremely inefficient? Where are our areas where we [INAUDIBLE]?

And this is not just on AWS. This is on all our vendors. Obviously, AWS is one of our biggest, if not the biggest line item there. And we just take it from there. And there's always tradeoffs you have to make. But I know there's companies out there that are trying to sell the value proposition of being able to optimize your cloud spend. And that is definitely something that there's a lot of-- I'm sure there's a lot of places to cut costs in that we don't know about. And so yeah, I think that's something that we're being mindful of, I'd say.

STU MINIMAN: Yeah. The challenge, too, you see across the board is that there's a lot of things you can do programmatically with a blind assessment of the bill. But without business insight, it becomes increasingly challenging. And you spoke to it yourself where you're not going to succeed or fail as a business because the bill winds up getting too high unless you're doing something egregious.

It's a question of growth. It's about ramping. And you're not going to be able to cost optimize your way to your next milestone unless something is very strange with your business. So focusing on it in due course is almost always the right answer.

ANDY FANG: Yeah. When I think about increasing revenue or decreasing costs, nine times out of 10, we're trying to provide more value. So increasing revenues is usually the go-to option for us. But there are some times where it's obvious. Hey, there's a low-hanging fruit in cutting costs. And if it's relatively straightforward to do, then let's do it. And I think with all of the cloud infrastructure that we've been able to build on top of, we've been able to focus a lot of our energy and efforts on innovating, building new things, cementing our industry position. And yeah, I think it's been awesome to flourish on top of it.

STU MINIMAN: Want to give you the final word. Any interesting insights in your business? It's like, I like food, and I like eating out. And it feels like we've kind of flattened the world a lot. It's like-- I think it was, like, five, six years ago the first time I went to Hawaii. And I got introduced to poke. And everybody in California knows poke. But I live on the East Coast. Now I've got, like, three places within half an hour of me that I can get it.

So those kind of things-- what insights are you seeing? What's changing in the marketplace? What's exciting you these days?

ANDY FANG: Yeah. I mean, for us, we've definitely seen a phenomenon where different food brands percolate across different areas, and start in one region and then spread out across the entire United States or even to Canada. I would say-- I don't know. We try to have as much merchant selection on the platform as possible so that no matter what the new, hottest trend is, that more likely than not, we're going to have what you want on the platform.

And I think what's really exciting to us over the next couple years is-- last year, we actually started-- we started satisfying grocery delivery fulfillment. In fact, we power a lot of grocery delivery for Walmart today, which is exciting, and a lot of other grocers lined up as well. We're going to see how far we can take our logistics capabilities from that standpoint. But really, we want to have as many options as possible for our customers.

STU MINIMAN: Well, Andy Fang, thanks so much for joining us. Congratulations on the progress with DoorDash. For Cory Quinn, I'm Stu Miniman. We'll be back here with more coverage from AWS Summit in New York City, 2019. Thanks, as always, for watching The Cube.

Over the Edge, into the Fog

While the previous two examples focus on fairly conventional Cloud applications, where computation needs to be scalable and the datasets in play are big ones, the concept of Cloud computing has since evolved to include new flavors like Edge Computing and Fog Computing. Edge Computing is a response to new networked sensors that can ingest enormous amounts of data and may also drive real-time decisions that require rapid responses. Think about the computing you need in a self-driving car, for example - you need an array of sensors detecting everything around the car, and you need the ability to make very rapid decisions based on those input streams. There's no time in that setting to upload all the data to the Cloud and have it computed there - computing instead needs to happen more locally. It may also be beneficial for the sensors to share back to the Cloud only the most meaningful bits of the data they capture, rather than the entire streams. Cisco has also described something they call Fog Computing, which they attempt to explain here (if you read that and decide you can distinguish it from Edge Computing, you're better at this than I am!). The term is often used interchangeably with Edge Computing, and for what it's worth, despite Cisco's entreaties, the two are generally treated as not all that distinguishable from each other.
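
To make the idea of sharing back only the most meaningful data a little more concrete, here is a minimal Python sketch of an edge-side filter. It is a toy under stated assumptions, not how any particular vendor does it: the class name, window size, threshold, and sensor readings are all made up for illustration. The filter keeps a short rolling window of recent readings on the device and forwards a reading to the Cloud only when it deviates sharply from the recent average.

```python
from collections import deque


class EdgeFilter:
    """Decide locally which sensor readings are worth sending to the Cloud."""

    def __init__(self, window: int = 5, threshold: float = 10.0) -> None:
        self.recent = deque(maxlen=window)  # rolling window kept on the device
        self.threshold = threshold

    def should_forward(self, reading: float) -> bool:
        """Return True when a reading differs sharply from the recent average."""
        if len(self.recent) < self.recent.maxlen:
            # Warm-up: not enough history yet to judge, so keep it local.
            self.recent.append(reading)
            return False
        baseline = sum(self.recent) / len(self.recent)
        self.recent.append(reading)
        return abs(reading - baseline) > self.threshold


# Made-up temperature readings with one obvious spike.
readings = [20, 20, 20, 20, 20, 21, 45, 20]
edge = EdgeFilter(window=5, threshold=10.0)
forwarded = [r for r in readings if edge.should_forward(r)]
print(forwarded)
```

Only the spike leaves the device. The routine readings are handled, and then discarded, at the edge, which is the bandwidth saving that the paragraph above describes.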

To see an example of how Edge Computing works, here's a short video by GE describing how they use Edge Computing to manage wind farms:

Video 3: Sensors + Data = Edge Computing (2:53)

Click here for a transcript of the Sensors + Data = Edge Computing video.

ON SCREEN TEXT: What do driverless cars, rescue robots, and renewables have in common? Machines connected to the internet of things and the cloud will have to make accurate split-second decisions without help from humans. To do that successfully, machines will use edge computing. Edge computing turns sensors into mini data centers, which allows self-driving cars to safely re-route if an obstacle emerges, could allow robots to someday perform rescue missions after a disaster and allows wind turbines to communicate with others around them.

DANIELLE MERFELD: The cloud is like a lake. It's a sea of information that you can fish in, you can learn from. You can sift it in different ways and find new ways to look at your data. At the edge, it's a fast-flowing river. You will never build a reservoir big enough to catch all that data. Industrial data is just generated too fast, and it's too much. But you can make decisions based on what you see and change the behavior of your asset real-time.

A great example of where we use our edge computing and edge capabilities in the renewable world is with our digital wind farms.

By using edge analytics and edge computation, we can actually create a flock of wind turbines so they act as a team. And ultimately, we can get more power.

ON SCREEN TEXT: Integrating software with turbines can enable energy production increase by as much as 20%. As software identifies how wind speed and turbulence affect each turbine it makes adjustments to the turbines to optimize energy production.

ROGIER BLOM: For a wind farm operator, it's extremely important that they squeeze out the maximum performance out of their assets. And in order to do that, we really need to understand how the variability in the wind drives those limits.

DANIELLE MERFELD: So when the front turbine is receiving a lot of wind and it's blocking much of the wind for the turbines behind it, they might actually be more optimized if that first turbine pitches or yaws to allow more wind through so that the overall wind farm can provide more output. That kind of computation and decision making can only be done at the edge.

ON SCREEN TEXT: GE is developing a wind farm optimizer app that works at the edge. It chooses the best settings to balance the tradeoff between power output and turbine life.

ROGIER BLOM: Making that trade-off is extremely complicated because to do that, you really need to understand how the variability of the wind affects the turbine. And that's precisely where our controls at the edge comes in, leading to more power output for the wind farm operators.

Credit: GE Reports. "Sensors + Data = Edge Computing." YouTube. November 6, 2017.

If you find this all very interesting, the OGC has recently developed a white paper which goes over the Edge-Fog-Cloud nexus and its relationship to geospatial architectures - you can check that out here.

Joining the Discussion

Please find the Lesson 5 Technology Trends Discussion Assignment in Canvas for details on completing this activity.