Growth of the Technology Environment and the Growth of Information
Knowledge Management workers often divide information into two groups: implicit and explicit. Implicit information is thought of as existing only in the mind, while explicit information is recorded in some fashion. The invention of writing allowed us to think of information as a set of discrete objects, no longer limited to what the human brain could hold. Socrates, as recorded by his student Plato, warned that writing endangered teaching in ancient Greece, which depended on rote memorization. Ironically, if Plato had not written about Socrates, we might not even know about him. The advent of the Information Age allows us to see information everywhere, in locations as diverse as strands of DNA and the firing of the brain’s neurons, regardless of what those firings might mean. Now that we have such contrivances as Google and Wikipedia, we start to think of “information without end,” ubiquitous and easily accessed (this is actually a stated goal of Google, but more on this later).
How information is related speaks to the questions of where it is created, where it is stored and disseminated, and how the geospatial components contained therein should be visualized. But then, what exactly is information? Claude Shannon, a mathematician with Bell Labs, created the field of Information Theory in his 1948 paper, “A Mathematical Theory of Communication.” In this view, everything contains information encoded in some fashion as “bits,” and information is a measure of quantity, not meaning. Shannon decided that, as the fundamental unit of information, a bit would represent the amount of uncertainty that exists in the flipping of a fair coin: 1 or 0. Using the mathematics in the paper, a mathematician or engineer could quantify not just the number of symbols in a message, but also the relative probabilities of each symbol’s occurrence. Information, as defined by Shannon, became a measure of surprise, of uncertainty with an estimated probability. These are abstractions; a message is no longer a tangible piece of material like a piece of paper. Shannon’s goal was to explain the relationships involved in sending an intelligible message over a noisy transmission line. His theory has gone far beyond that and eventually permeated such divergent fields as computer science and molecular biology. The world as it exists today promises endless possibility, given the connectedness we see between the different sets of explicit knowledge.
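Shannon’s measure of surprise can be made concrete with a few lines of Python. What follows is only a minimal sketch of the standard entropy formula, H = -Σ p·log2(p); the function names and the sample message are assumptions chosen for illustration and do not come from Shannon’s paper.

```python
import math
from collections import Counter


def surprisal_bits(probability):
    """Information, in bits, carried by one outcome that occurs with this probability."""
    return -math.log2(probability)


def entropy_bits(probabilities):
    """Average information, in bits per symbol, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def message_entropy_bits(message):
    """Entropy per symbol, estimated from how often each symbol appears in the message."""
    counts = Counter(message)
    total = len(message)
    return entropy_bits(count / total for count in counts.values())


# A fair coin flip carries exactly one bit of uncertainty.
fair_coin = entropy_bits([0.5, 0.5])

# A heavily biased coin is less surprising, so each flip carries less than one bit.
biased_coin = entropy_bits([0.9, 0.1])

# Hypothetical message: two symbols used equally often, so one bit per symbol.
sample_message = message_entropy_bits("ABABABAB")
```

Note that the calculation looks only at probabilities; nothing in it depends on what the symbols mean, which is the sense in which information here is a quantity and not a meaning.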
One field seemed to get left out, and it is the most fundamental of all for our discussion: physics. Following World War II, the great area of science was the splitting of the atom and the harnessing of nuclear energy. Communications research was viewed as a field for electrical engineers. Particle physicists had quarks; they didn’t seem to need “bits.” The man who argued the contrary view was Rolf Landauer. Landauer escaped Germany in 1938 and eventually earned a PhD in physics from Harvard. His landmark paper, “Information is Physical,” was written while he was on staff at IBM. In it, he pointed out that bits are not abstract after all (or more precisely, not merely abstract). His point rests on the idea that information can’t exist without some physical medium to contain it, whether it’s a mark chiseled on a stone tablet, a hole punched in a punch card, or a subatomic particle with spin either up or down. According to Landauer, information is tied to the laws of physics and to the parts available to us in our real physical universe. Landauer and his colleague Charles H. Bennett joined the fields of information science and physics under what they called the thermodynamics of computation.
Landauer showed that most logical operations carry no inherent energy cost at all. When a bit flips from 0 to 1, or 1 to 0, no energy need be dissipated and the information is preserved. Bennett found that the one element of computing that must dissipate heat is the erasure of a bit. When an electronic computer clears a capacitor, it releases energy. For a bit to be lost, heat must be dissipated. Information might still seem a sort of abstraction; bits are, after all, binary choices. Coin flips, yes/no, 1/0, and on/off instinctively seem insubstantial. The question then becomes: how can they be as fundamental to physics as the building blocks of matter and energy? Earth, air, fire, and water are, in the end, all made of energy, but the different forms they take are determined by information. To do anything requires energy. To specify what is done requires information, and most information currently stored in databases contains a geospatial component. Understanding this is key. Most of the dialogue about “cyber” has been handed over from practitioners in the intel community to the IT community. Neither community is necessarily well versed in thinking geospatially, which is in a sense why some of the dialogue about “cyber” makes so little sense and why leadership continues to struggle with it.
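The heat cost of erasing a bit can be put into numbers. The text above does not give a formula; the sketch below assumes the bound usually attributed to Landauer, E = k·T·ln 2 per erased bit, where k is the Boltzmann constant and T is the temperature in kelvin. The function name and the sample temperature and data size are illustrative assumptions.

```python
import math

# Boltzmann constant in joules per kelvin (exact value in the SI since 2019).
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_limit_joules(temperature_kelvin, bits=1):
    """Minimum heat, in joules, dissipated when erasing `bits` bits (assumed bound: k*T*ln 2 each)."""
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be above absolute zero")
    return bits * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


# Assumed example: erasing one bit at roughly room temperature (300 K).
one_bit = landauer_limit_joules(300)

# Assumed example: erasing one gigabyte (8e9 bits) at the same temperature.
one_gigabyte = landauer_limit_joules(300, bits=8e9)
```

At room temperature the bound works out to a few zeptojoules per bit, far below what present-day electronics actually dissipate, so it is a floor set by physics and not a description of real hardware.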
Once we begin to examine both static information and dynamic flows of information, and begin to view them geospatially, that dialogue makes more sense.
After analyzing more than 10 million e-mails from Yahoo! mail, a team of researchers noticed a compelling fact: e-mails seem to flow much more frequently between countries with economic and cultural similarities. Among the factors that matter are GDP, trade, language, prior colonial relations, and four academic cultural metrics: power distance, individualism, masculinity, and uncertainty avoidance. The resulting paper, “The Mesh of Civilizations and International Email Flows,” was written by researchers at Stanford, Cornell, Yahoo!, and the Qatar Computing Research Institute. Countries with measurable real-life ties, such as a shared border, a significant number of international flights, or a serious binational trade relationship, tend to e-mail each other more. There are some discrepancies as well; for example, countries within the EU tend to e-mail each other less than the research model predicted. The real conclusion comes toward the end, when the paper posits the results as possible evidence for Samuel Huntington’s controversial “Clash of Civilizations” theory. From “The Mesh of Civilizations and International Email Flows”:
In this respect we cautiously assign a level of validity to Huntington’s contentions, with a few caveats. The first issue was already mentioned – overlap between civilizations and other factors contributing to countries’ level of association. Huntington’s thesis is clearly reflected in the graph presented…. The second limitation concerns the fact that we investigated a communication network. There is no necessary “clash” between countries that do not communicate, and Huntington’s thesis was concerned primarily with ethnic conflict.
Huntington’s ideas were controversial from the start and have been criticized frequently by social scientists, as in Stephen M. Walt’s critique, “Building Up New Bogeymen,” published in Foreign Policy, and Edward Said’s response, “The Clash of Ignorance,” published in The Nation. What the ICT research by Bogdan State, Patrick Park, Ingmar Weber, Yelena Mejova, and Michael Macy in “The Mesh of Civilizations and International Email Flows” seems to show, though, is that individuals have a greater tendency to communicate within broad groups that align with what Huntington identified as the different “civilizations,” between which he posited conflict would arise in the 21st century. The paper also seems to project a mapping of a dynamic flow of communication across the planet. Other mappings of the relationships between different views of information include the “Map of Science” from Los Alamos National Labs in Figure 14 (a view of how scholarly journals relate to one another, derived from usage logs), the database of databases seen in Figure 15, and a representation of the source articles from Wikipedia in Figure 16. All are attempts to show various distributions of knowledge or communication flows, but none is “complete,” in the sense that none describes more than a simple one-dimensional view of a knowledge-based relationship. The challenge increases when we start to add a temporal component, as in the case of software such as “Recorded Future.” More on this later, but first, we must outline some of the aspects of data ownership, the limits of geolocating by IP address, and the impact on privacy.
While it may seem that we’re living in a borderless world where ideas, goods, and people flow freely from nation to nation, “We’re not even close,” says Pankaj Ghemawat, a professor of strategic management at the IESE Business School in Spain. With great data (and a good survey), he argues that there’s a gap between perception and reality in a world that’s maybe not so interconnected at all. He posits that our world isn’t flat; it’s at best semi-globalized, with limited interactions between countries and economies, which would seem to corroborate the study above.
Listen to the Experts (Optional Talk)
This is an optional talk, filmed in Edinburgh in June 2012, in which Pankaj Ghemawat explains his position. His latest book, “World 3.0,” describes another kind of networked economy: the cross-border “geography” of Facebook and Twitter followers. The takeaway from his talk and his book is that while there is a huge amount of connectivity in the world today, there is room for a great deal more (over two-thirds of the planet), with all the unpredictability that accompanies it (17:03).
Other types of knowledge have been “mapped” as well. Maps of science have been created from citation data to visualize the structure of, and relationships between, various fields of scientific activity. With most scientific publications now accessed online, scholarly web portals record detailed log data at a scale that exceeds the number of all existing citations combined. Such log data is recorded immediately upon publication and keeps track of the sequences of user requests (clickstreams) issued by a variety of users across many different domains.
Over the course of 2007 and 2008, researchers from Los Alamos National Labs collected nearly 1 billion user interactions recorded by the scholarly web portals of some of the most significant publishers, aggregators, and institutional consortia. The resulting reference dataset covers a large part of the worldwide use of scholarly web portals from 2006, and spans the humanities, social sciences, and natural sciences. A journal clickstream model, i.e., a first-order Markov chain, was extracted from the sequences of user interactions in the logs. The clickstream model was validated by comparing it to the Getty Research Institute’s Art & Architecture Thesaurus. The resulting model, shown in Figure 14, was visualized as a journal network that outlines the relationships between various scientific domains and clarifies the connection of the social sciences and humanities to the natural sciences.
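To make the idea of a first-order Markov chain over clickstreams concrete, here is a minimal sketch in Python. It is not the Los Alamos team’s code: the journal labels and sessions are made up, and the function name is an assumption. Each session is the ordered list of journals one user visited, and the model records how often a visit to one journal is immediately followed by a visit to another.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate P(next journal | current journal) from ordered click sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair each click with the click that immediately follows it.
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    # Normalize each row of counts so that its probabilities sum to 1.
    return {
        source: {target: n / sum(targets.values()) for target, n in targets.items()}
        for source, targets in counts.items()
    }


# Hypothetical sessions; the journal labels are placeholders, not real log data.
sessions = [
    ["physics_journal", "biology_journal", "sociology_journal"],
    ["physics_journal", "biology_journal"],
    ["biology_journal", "sociology_journal"],
    ["physics_journal", "art_journal"],
]

model = transition_probabilities(sessions)
# In this toy data, two of the three clicks leaving physics_journal go to biology_journal.
```

The resulting table of probabilities can be read as a weighted, directed network of journals, which is the kind of structure a map such as the one in Figure 14 would then lay out visually. "First-order" means each step depends only on the journal currently being viewed, not on the earlier history of the session.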
The site in Figure 15, the Data Hub, runs a powerful piece of open-source data cataloging software called CKAN, written and maintained by the Open Knowledge Foundation. Each “dataset” record on CKAN contains a description of the data and other useful information, such as what formats it is available in, who owns it, whether it is freely available, and what subject areas it covers. Other users can improve or add to this information (CKAN keeps a fully versioned history). Most of the data indexed at the Data Hub is openly licensed, meaning anyone is free to use or reuse it however they like. Perhaps someone will take that nice dataset of a city’s public art that you found and add it to a tourist map, or even make a neat app for your phone that helps you find artworks when you visit the city.
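As a rough illustration of what such a catalog record holds, the sketch below models a dataset record in Python and filters a small catalog for openly licensed data. The field names, the helper function, and the sample records are all assumptions made for this example; they are not CKAN’s actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetRecord:
    """Illustrative catalog entry; the field names are hypothetical, not CKAN's schema."""
    title: str
    description: str
    owner: str
    formats: List[str]
    openly_licensed: bool
    subjects: List[str] = field(default_factory=list)


def find_open_datasets(records, subject, file_format):
    """Return the openly licensed records that cover a subject in a given format."""
    return [
        record for record in records
        if record.openly_licensed
        and subject in record.subjects
        and file_format in record.formats
    ]


# Made-up sample records, for illustration only.
catalog = [
    DatasetRecord("City public art", "Locations of public artworks",
                  "City council", ["CSV", "JSON"], True, ["art", "tourism"]),
    DatasetRecord("Gallery sales", "Private sales ledger",
                  "Gallery owner", ["CSV"], False, ["art"]),
    DatasetRecord("Bus stops", "Transit stop coordinates",
                  "Transit agency", ["CSV"], True, ["transport"]),
]

reusable_art_data = find_open_datasets(catalog, "art", "CSV")
```

In this toy catalog only the public art dataset passes all three checks, which is the sort of record someone could then place on a tourist map.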