This is an introduction to Open Source Intelligence or, as it is called, OSINT.
Open Source Intelligence means useful information gleaned from public sources, such as scientific articles, newspapers, phone books and price lists. Open Source Intelligence processing involves finding, selecting, and acquiring information from publicly available sources and analyzing it. In the Intelligence Community (IC), the term "open" refers to overt, publicly available sources (as opposed to covert or classified sources); it is not related to open-source software. I recommend that you skim the following:
Registered students are welcome to post comments or questions to The Conference Room discussion forum. (That forum can be accessed at any time in Canvas by clicking on the Discussions link.)
What is the difference between search and discovery? Many feel they mean the same thing, but they are actually quite different. One way to contrast the two is to classify them by what you know and don't know. That is, you search for what you know and discover what you don't know.
When you search, you have a target in mind. The task is to formulate a query that maximizes the chances of a match. Keywords in the query tend to be more descriptive so as to qualify exactly what you are looking for. Discovery is exploratory in nature and driven by a goal. A search engine becomes a discovery engine when the query is used as a starting point from which to learn more about a particular topic. Just as hyperlinks within web documents facilitate quick navigation through related topics, a discovery engine provides various facets of the result set in the form of navigational links. These links represent different dimensions of the result set and allow you to drill down or sideways depending on the facet. Discovery is best done with social software, directories, professional organizations, libraries, wikis, or clustering search engines such as Clusty (http://clusty.com/ [4]).
First, search engines cover only 1/1000 of the Internet. They do not include:
Most people, when faced with searching in Google or any other search engine, will simply put in words describing what they are looking for and accept any and all results that appear. There are ways of telling Google what to do in order to obtain different, and hopefully more reliable, accurate, and relevant search results. To illustrate this we will do some more activities.
Open up a web browser and go to Google. Google is used in this example because these tips work extremely well with that site. You can certainly try using these tips in other search engines, but I am not sure if it will change the results.
By typing the command "site:.gov", you tell Google to search ONLY those sites whose addresses end in .gov. This tip works with any website address suffix, such as .edu or .org. Try out different searches and compare the results; it will change the way you use Google when you are searching for specific types of information or data.
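As a minimal sketch of the tip above, the following Python snippet builds a Google search URL that restricts results with the "site:" operator. The function name build_site_query and the example keywords are illustrative assumptions, not part of the lesson; only the "site:" operator itself comes from the text.

```python
from urllib.parse import urlencode


def build_site_query(keywords: str, suffix: str) -> str:
    """Return a Google search URL restricted to one address suffix.

    `suffix` is a domain ending such as ".gov" or ".edu".
    This helper is hypothetical and exists only for illustration.
    """
    # The site: operator is simply appended to the ordinary keywords.
    query = f"{keywords} site:{suffix}"
    # urlencode escapes the spaces and the colon so the URL can be pasted
    # into a browser as-is.
    return "https://www.google.com/search?" + urlencode({"q": query})


# Compare the same keywords across several suffixes.
for suffix in (".gov", ".edu", ".org"):
    print(build_site_query("geospatial intelligence", suffix))
```

Opening each printed URL in a browser is one way to compare how the result lists differ by suffix.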
Another tip for use in searching Google is that word order and word repetition matter. Try this:
While for this particular search the results did not change dramatically, they did change. What does this mean? It simply means that the algorithm used by Google 'reads' the words you enter and will 'think' you want more of whatever word you type first.
The same situation exists for word repetition. Try the above example again, but instead of rearranging the word order, double the words so you are doing a search first for 'geospatial geospatial intelligence' and then another for 'geospatial intelligence intelligence'. Again, the results in this particular search will not change dramatically, but there will be a change. If you use this tip in other searches, the results will often be different and sometimes very different.
Another thing to keep in mind when doing searches online: the first page of results usually contains the most clicked-on links, and possibly the most relevant results for your search terms, but you should always look past it. Go to at least the third page to make sure you are covering all of the ground you need to cover and are not leaving anything out.
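If you want to try the word-order and word-repetition tips systematically, the variants can be generated rather than typed by hand. The sketch below is only an illustration: the function names order_variants and repetition_variants are made up for this example, and the snippet produces query strings without running any searches.

```python
from itertools import permutations


def order_variants(words):
    """Return every ordering of the search words as a query string."""
    return [" ".join(p) for p in permutations(words)]


def repetition_variants(words):
    """Return one query per word, with that word doubled in place."""
    words = list(words)
    variants = []
    for i, word in enumerate(words):
        doubled = words[:i] + [word, word] + words[i + 1:]
        variants.append(" ".join(doubled))
    return variants


words = ["geospatial", "intelligence"]
print(order_variants(words))
print(repetition_variants(words))
```

Each generated string can then be pasted into the search box and the result lists compared side by side.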
This information is for your use when you are searching for information online, to let you know there are better ways of searching on the Internet than simply typing in the words for what you are searching and accepting the results that appear.
About.com [5] provides a nice summary of how to go about this. In brief:
People: addresses, phone numbers, search for maps, and more.
Public Records: All of the following sites relate to the US.
Social Networks: A social network is a social structure made of individuals or organizations which are connected by friendship, kinship, financial exchange, dislike, sexual relationships, or relationships of beliefs, knowledge or prestige.
There are a large number of agencies, bureaus, and offices from which you can find data produced by the United States government. Many of the data suppliers make great efforts to supply and keep up-to-date the data they provide.
Searching for government information in the United States can sometimes be confusing and difficult. If you use what is called the agency approach, finding information can be a much easier process. When you are faced with a question about where to find data within the U.S. government, the first question you should ask yourself is "which agency or department is most likely to produce this information?" If it is environmental data, it would most likely be the EPA; if it is housing information, it could be either the Department of Housing and Urban Development or the Census Bureau; if it is education data, it would be the Department of Education. This gives you an access point for finding the data instead of running random searches in search engines or library catalogs and coming up with irrelevant results.
While the website for each agency, bureau, or organization is different, some elements that are useful to researchers will always be the same. There will always be an "About" section, which you should review if you are unfamiliar with the type of information and data produced within that agency. There will also be a section of the website where publications or data are made available, and this is typically where you will find what you need. Some agencies and departments make it easy for you to locate the information you need, and some websites are a little more challenging. Do not get discouraged if you are having difficulty finding the information you need; you can either contact the agency directly or contact your local librarian. Both will provide the help you seek.
Demographic data is usually collected by a national census or by some other national-level survey, such as the American Community Survey conducted by the U.S. Census Bureau. The U.S. Census Bureau is the main collector and provider of demographic data in the United States. The Census Bureau collects this data in two main ways: the Decennial Census and the American Community Survey. The Decennial Census has been taken every ten years since 1790. The first census was short and consisted of the following questions:
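The agency approach can be sketched as a simple lookup: decide on the topic, find the agency most likely to produce the data, and restrict your search to that agency's website. In the Python sketch below, the topic-to-agency pairs come from the paragraph above, while the domain names, the dictionary AGENCY_BY_TOPIC, and the function agency_queries are assumptions added for illustration.

```python
# Topic-to-agency pairs follow the examples in the text. The domain names
# are assumptions added for this illustration.
AGENCY_BY_TOPIC = {
    "environment": [("Environmental Protection Agency", "epa.gov")],
    "housing": [
        ("Department of Housing and Urban Development", "hud.gov"),
        ("Census Bureau", "census.gov"),
    ],
    "education": [("Department of Education", "ed.gov")],
}


def agency_queries(topic, keywords):
    """Return one site-restricted query per agency likely to hold the data."""
    agencies = AGENCY_BY_TOPIC.get(topic.lower(), [])
    return [f"{keywords} site:{domain}" for _name, domain in agencies]


print(agency_queries("housing", "median home value"))
```

An unknown topic returns an empty list, which is a cue to go back to the question "which agency is most likely to produce this information?" before searching further.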
The number of each of the following in every household:
The questions changed over time and the amount of data the Census Bureau collects on modern census questionnaires allows for detailed analysis of the population in the United States.
Recognizing that the population changes dramatically in the ten years between each decennial census, the Census Bureau established the American Community Survey in 2005. In 2009, for example, it is very difficult to make a reliable map based on statistics collected in 2000, so the American Community Survey collects data on a yearly basis from a select sample of the population in an effort to provide more up-to-date demographic data for the United States. It should be understood that, like some parts of the census, this is sample data only and should not be treated as an absolute.
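One practical way to avoid treating sample data as absolute is to work with a range rather than a single number. The Census Bureau publishes a margin of error alongside each American Community Survey estimate, so the sketch below turns an estimate and its margin of error into a range and checks whether two ranges overlap. All figures are hypothetical, and the function names are made up for this example; the overlap check is only a rough screen, not a formal statistical test.

```python
def estimate_range(estimate, margin_of_error):
    """Return the (low, high) bounds implied by an estimate and its margin of error."""
    return estimate - margin_of_error, estimate + margin_of_error


def ranges_overlap(a, b):
    """Return True when two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical estimates for the same area in two different survey years.
year_a = estimate_range(40_000, 2_500)
year_b = estimate_range(42_000, 2_800)

print(year_a, year_b)
print("ranges overlap:", ranges_overlap(year_a, year_b))
```

When the ranges overlap, as they do for these made-up numbers, the apparent change between the two years may be nothing more than sampling variation.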
To access the data from either a Decennial Census or from the American Community Survey, you would use the American FactFinder on the U.S. Census website.
Here is an activity for you to familiarize yourself with using the American FactFinder:
Note that there was a population change for both whites and blacks in Camden, NJ, over the seven-year period. While this is not a significant increase in population overall, for what has often been called the poorest city in the United States, any change in population can be significant. This data can also be used to make some assumptions about where people are moving by looking at some factors in the city of Philadelphia, right across the river from Camden.
To do this, let us take a look at housing price changes in Philadelphia over this seven-year period.
Through this process of data comparison, you will start to get an overall picture of what is happening socially in different areas of the country and the world (if you are using international data). This exercise is valuable not only because you become more familiar with using the Census Bureau website but because you start to think about different variables that could possibly be affecting demographic data shifts over time. The data sets you worked with are just a part of all the data available via the Census Bureau so you are encouraged to look around and experiment with making tables, maps and downloading the data.
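When comparing two tables from different years, a percent change is often easier to interpret than the raw difference. The following Python sketch shows the arithmetic; the function name percent_change and the population figures are hypothetical and are not values from the Census Bureau.

```python
def percent_change(old, new):
    """Return the change from `old` to `new` as a percentage of `old`."""
    if old == 0:
        raise ValueError("percent change is undefined when the starting value is 0")
    return (new - old) * 100 / old


# Hypothetical population counts at the start and end of a comparison period.
population_start = 80_000
population_end = 82_000

print(percent_change(population_start, population_end))  # prints 2.5
```

The same function works for any pair of values you pull from two tables, such as median housing prices, provided both values measure the same thing for the same area.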
Another rich resource from the United States Government is the Environmental Protection Agency. The EPA collects many different types of data regarding the environment; most often, data retrieved from the EPA has to do with emissions and environmental atrocities committed by large industrial corporations. While this information is available from the EPA, and the agency has a gleaming new website design, this is one of the instances where finding the data on the agency's website is somewhat challenging.
The EPA has an interesting data tool called the Envirofacts Data Warehouse. In this data warehouse you can find data on waste, water, toxics, air, land, radiation, and compliance, and also make some maps of your local area displaying data found within the warehouse.
For this activity there will be little guidance; it will teach you to work through a website to find the data you need.
The CIA World Factbook (https://www.cia.gov/library/publications/the-world-factbook/index.html [29]) is an online almanac of country information provided by the Central Intelligence Agency. It is the primary resource used for locating general country information. Each entry contains information on the Geography, People, Government, Economy, Communications, Transportation, Military, and Transnational Issues for all of the countries in the world. The CIA Factbook is also a rich resource for reference maps of the world available for free download in pdf format.
The United States Census Bureau also provides access to valuable international population data via their International Database (IDB). Data in this resource goes as far back as 1950 with projections into the future until 2050. It is an interesting tool to use when trying to get a sense of what is happening in any given country as far as population numbers are concerned.
From here you will do a little study of population growth in countries. First we will look at country rankings.
Next, we are going to look at fertility rates for the countries found in the rankings.
To view this population data in a more interesting format, click on the Country Summary link on the main IDB page.
These sites are by no means all that is available to you online. They are just a small sample of the places where you can find information that will help you build a solid intelligence case using open source information. There are hundreds of other sites with very valuable data available at no cost. You are encouraged to continue searching and mastering the different methods provided on each site to extract data.
The United Nations is an organization of member countries that work together to promote human rights, worldwide peacekeeping, and environmental rights, and to fight drug trafficking, landmine use, and HIV/AIDS. The UN has its headquarters in New York City but also has locations all over the world, most notably the International Court of Justice at The Hague, in the Netherlands. The United Nations has fifteen specialized agencies, which include but are not limited to the World Health Organization (WHO), the Food and Agriculture Organization (FAO), the World Bank Group, and the International Monetary Fund (IMF).
The amount of data and information available from the United Nations and the specialized agencies is staggering. You can access a majority of this information via the UN Library online at http://www.un.org/Depts/dhl/ [30]. Data and information for the specialized agencies are available on their respective websites and in some databases provided by various academic libraries across the country. Resources from the UN are available in a number of different languages. The UN Library offers some very well-made reference maps, major documents from the General Assembly, voting records, and speeches given at the UN. You can also search for your local United Nations depository library to obtain UN materials in print. There is no activity for this section because a whole class could be given on UN documents alone, and there is not sufficient time or space here to do them justice. You are encouraged to explore the online library and should feel free to ask any questions regarding UN information.
Searching for data and information on the Internet using a basic search engine such as Google or Yahoo often does not provide you with results from the wealth of information that can be found in libraries. As a student you have access to any of the resources provided by the University Libraries; however, once you separate from the University, those resources become limited. Google provides a way to search for professional and scholarly information via its Google Scholar search.
Google Scholar indexes scholarly materials by publishers and libraries who have agreed to work with Google Scholar on this project. When you are searching Google Scholar you are searching an index of journal articles, professional societies, websites of faculty, books as well as peer-reviewed papers amongst other types of scholarly information. In order to see library materials in your search results, you must set your preferences in Google Scholar. To do that follow these steps:
If you go back to the Google Scholar search screen and run a search for "geospatial intelligence," a list of results will appear. Note that on the results page you can choose between "All Articles" and "Recent Articles", a good tool if you are looking for the most recent papers published on a particular topic. As you scroll through the results, you will see links to PDF files and websites of academic institutions and, if you have your library settings in place, links to holdings in whichever library you chose.
Because I had Penn State as my library setting, my results include links that go directly to that resource at the University Libraries. This resource is a great way to search the broader academic community to find out what is being published in different disciplines. One of the other advantages of searching for this type of information in Google Scholar is that it also searches open access journals, many of which are not accessible through libraries. While it does serve a critical need in the realm of information searching, Google Scholar is in no way a replacement for searching in your local library or in Penn State's library collections.
Use Google Scholar to find three articles on geospatial intelligence. Don't take the top three either; find ones you think are interesting and informative.
Google Hacks: Tips and Tools for Finding the World's Information by Rael Dornfest
It could be useful to build a custom search engine. Building a custom search engine allows you to search only websites you select. To build one, visit http://www.google.com/coop/cse/ [40].
Throughout this lesson, we stressed that data is one of the most important parts of geospatial intelligence. Being able to locate data is a critical skill to learn when you become involved in any kind of intelligence-gathering activity. This lesson serves to provide you with a basic understanding of where to look. There are a number of places online where you can find freely available data for download and use in mapping projects. The data fall into two main categories: national and international. Within each of these categories, you can find data that are governmental and non-governmental. Typically, governmental data are easier to find than non-governmental data unless you are dealing with large multi-national organizations, such as the Organisation for Economic Co-operation and Development (OECD). The websites discussed in this lesson were:
Because all of these organizations are run by different countries, boards, individuals, etc., they do not all provide the same easy access to their data. Often it takes some time searching through a website to determine where you can find the data you need, how to download it, and then how to use it. Sometimes it is a frustrating task; sometimes it is a breeze. In either case, you should always allow plenty of time for doing research and locating data because it is not a given that the data you are seeking are out there already and easy to find.
In this lesson you learned the basics of how to perform more sophisticated searches online, using mostly Google. You became familiar with some of the larger governmental and non-governmental organizations that provide large quantities of useful data. Along the way you came across some activities that will help develop these skills and illustrate some of the points being made. You should do these searches whenever they are provided because the best way to learn these skills is to practice them. Once you get a handle on the different types of searches, you should experiment with finding some of your own data to test your success.
Before you go on to Lesson 2, double-check the Lesson 1 Checklist [41] to make sure you have completed all of the activities listed there.
Links
[1] http://en.wikipedia.org/wiki/Open_Source_Intelligence
[2] http://www.oss.net/dynamaster/file_archive/030201/ca5fb66734f540fbb4f8f6ef759b258c/NATO%20OSINT%20Handbook%20v1.2%20-%20Jan%202002.pdf
[3] http://www.oss.net/dynamaster/file_archive/030201/254633082e785f8fe44f546bf5c9f1ed/NATO%20OSINT%20Reader%20FINAL%2011OCT02.pdf
[4] http://clusty.com/
[5] http://About.com
[6] http://websearch.about.com/od/peoplesearch/tp/googlepeoplesearch.htm
[7] http://websearch.about.com/od/peoplesearch/a/zoominfo.htm
[8] http://websearch.about.com/od/peoplesearch/tp/free-people-search-engines.htm
[9] http://websearch.about.com/od/peoplesearch/a/zabasearch.htm
[10] http://websearch.about.com/od/peoplesearch/qt/spock.htm
[11] http://websearch.about.com/od/peoplesearch/f/birthday.htm
[12] http://websearch.about.com/od/peoplesearch/qt/yahoopeople.htm
[13] http://websearch.about.com/od/governmentpubliclegal/ht/obituaries.htm
[14] http://websearch.about.com/od/peoplesearch/a/militarysearch.htm
[15] http://websearch.about.com/od/wendyssearchpicks/a/find_people.htm
[16] http://websearch.about.com/od/peoplesearch/tp/find-someone.htm
[17] http://websearch.about.com/od/governmentpubliclegal/a/firstgov.htm
[18] http://websearch.about.com/od/dailywebsearchtips/qt/dnt0724.htm
[19] http://websearch.about.com/od/governmentpubliclegal/tp/publicrecords.htm
[20] http://websearch.about.com/od/dailywebsearchtips/qt/dnt0428.htm
[21] http://websearch.about.com/b/a/181479.htm
[22] http://websearch.about.com/b/2008/04/08/see-how-easily-you-can-track-your-friends-with-spokeo.htm
[23] http://websearch.about.com/od/dailywebsearchtips/qt/dnt0720.htm
[24] http://websearch.about.com/b/a/185873.htm
[25] http://websearch.about.com/od/blogsforumssocialsites/qt/twitter.htm
[26] http://websearch.about.com/b/a/181682.htm
[27] http://websearch.about.com/b/a/217557.htm
[28] http://www.census.gov
[29] https://www.cia.gov/library/publications/the-world-factbook/index.html
[30] http://www.un.org/Depts/dhl/
[31] http://scholar.google.com/
[32] http://www.eia.doe.gov/
[33] http://nucleus.iaea.org/NUCLEUS/nucleus/Content/index.jsp
[34] http://trade.gov/index.asp
[35] http://www.oecd.org/home/0,2987,en_2649_201185_1_1_1_1_1,00.html
[36] http://www.ngdc.noaa.gov/hazard/hazards.shtml
[37] http://www.wri.org/
[38] http://www.who.int/en/
[39] http://www.noaa.gov/
[40] http://www.google.com/coop/cse/
[41] https://www.e-education.psu.edu/geog594b/l2_p2.html