This page is intended for PC users. If you work on a Mac, skip this page.
This short tutorial is meant to familiarize you with a small number of unix commands for manipulating big text files of data. It is not a substitute for a complete understanding of unix or linux, or even an exhaustive list of useful commands, but I hope that if you follow along, you'll learn enough simple file manipulation skills to save yourself some time.
If you are on a PC running Windows, you can emulate a unix/linux command window environment by running "cygwin." To reiterate: you aren't running linux, but it looks like you are.
To download it, go to the home of the cygwin project.
I also need to mention the following caveat: I'm not a PC user! When I was in grad school I had a Sun workstation that I used for everything, including typesetting. I didn't even write in Word, I used LaTeX. Now I use a Mac. I also had to relax my principles on avoiding Word or face a lifetime of really cranky collaborators, but that's another story. The upshot here is that I am probably less of a complete doofus than your grandparents when it comes to PCs but . . . okay you get the picture. So when I made the screencasts of me attempting to present a tutorial of cygwin, I borrowed a PC and figured it out on the fly. And it basically worked okay; I give cygwin my thumbs up.
Part 1: navigating your file structure from the terminal
Commands in Part 1: whoami pwd mkdir ls man
The first thing we'll want to do is open a terminal window: double-click the cygwin icon. When the window is active there will be a blinking cursor where you can start typing. Unix commands are all typed at the prompt, and by default the output of any command you type goes to the screen in the terminal window where you are typing.
Here are some commands to try:
- Type whoami at the prompt. The response should be your username.
- Type pwd at the prompt. This stands for "print working directory" and it tells you where you are in your computer's file structure.
- You can create new folders and files, too. Type mkdir junk at the prompt in your terminal window. You have just made a new folder called "junk". You can make more than one directory at a time with just one command. If you type mkdir junk1 junk2 junk3 at the prompt you will make three directories all at the same level with those names.
- Type ls at the prompt (the first character in the ls command is a lowercase L, not a capital i). This command should give you a listing of all the subfolders and files inside the folder where you are. Do you recognize the list? It probably looks slightly unfamiliar to you because it is just a plain text listing as opposed to a pretty list with little icons to tell you the type of folder or file each item is. Furthermore, alphabetizing in unix can be case sensitive (depending on your language settings), so if some of your folders and files are capitalized and some aren't, you might find that your list is in a different order than you are used to seeing.
- Now is a good time to introduce the man command. man prints the manual page for any command to the terminal screen. Try it with the other commands in this section by typing man whoami, man ls, etc. at the prompt. The man page will give you more details than are in my little tutorial about any and all unix commands, including the various options that go with each command. To page through the man page for a command, hit the space bar. To get out of the manual page and back to your terminal prompt, type q.
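Here is a minimal sketch of the commands above, run one after another. To keep it from cluttering your real folders it works inside a throwaway directory; the mktemp command that creates it is my own addition and not part of this tutorial, and cd is explained in Part 2.

```shell
# Work in a throwaway directory so nothing in your real folders is touched.
# (mktemp is an assumption; it is not one of this tutorial's commands.)
scratch=$(mktemp -d)
cd "$scratch"

whoami                    # prints your username
pwd                       # prints the directory you are in now

mkdir junk                # make one directory
mkdir junk1 junk2 junk3   # make three more with a single command

ls                        # lists junk, junk1, junk2, and junk3
```

At a real prompt you would simply type each command and hit return; the text after each # is a comment and is ignored.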
Part 2: navigating the file structure from the terminal
Commands in Part 2: cd
In Part 2, we'll see how to use unix commands to change our location in the computer's file structure. It's analogous to clicking through the various discs and folders from the windows launched when you double-click "my computer" except that it involves no mouse clicks, only typing.
The command of interest here is called cd. The way it works is that you type "cd pathname" at the prompt, and then you will go there (you have to type the actual path, not the word "pathname"). Nested folders have to be separated by forward slashes "/". You can verify that you are where you think you are by typing pwd or by navigating to the same address via windows and noting that the folder contents are the same.
- To move one directory upwards, type cd .. at the prompt.
- You don't need to move along one directory at a time, either. Instead of typing cd /cygdrive, hitting return, and then typing cd d at the next prompt, you can type cd /cygdrive/d to go straight to the place you want. No mouse clicks!
- If you type cd with no arguments after it, you will be returned to your home directory (the default location where you started when you opened up cygwin).
Note that in order to move between the C and D drives, you'll have to start the address with "/cygdrive". For example, the command cd /cygdrive/d will take you to the uppermost level of drive D and cd /cygdrive/c takes you to the uppermost level of drive C. Watch the little movie below to see this in action.
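The following is a rough sketch of moving down and up through nested folders with cd. The folder names (top, middle, bottom) are made up for illustration, and the mktemp command that creates a throwaway starting directory is an assumption of mine rather than something covered here. It deliberately avoids /cygdrive paths so that it behaves the same on any unix-like system.

```shell
# Hypothetical folder names inside a throwaway directory.
scratch=$(mktemp -d)      # mktemp is an assumption, not a tutorial command
cd "$scratch"
start=$(pwd)              # remember where we began

mkdir top
mkdir top/middle
mkdir top/middle/bottom

cd top/middle/bottom      # go down several levels with one command
pwd                       # the path now ends in top/middle/bottom

cd ..                     # up one level, into middle
pwd

cd ../..                  # up two levels at once, back to the start
pwd
```

Checking with pwd after each move is the typed equivalent of glancing at the address bar in a Windows folder window.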
Now try this!
1. Go to your original directory.
2. Make a directory called "earth597".
3. Go to earth597 and inside it make two directories called "data1" and "data2".
4. Go into "data1".
5. Go from "data1" into "data2" with just one command.
First try it yourself and then you can check how I accomplished steps 1-5 above.
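If you get stuck, here is one possible sequence, offered as a sketch rather than as the only right answer. So that it can be run safely anywhere, a throwaway directory (made with mktemp, which is my addition) stands in for your original directory; at a real cygwin prompt the first step would just be cd with no arguments.

```shell
# A scratch directory stands in for your original (home) directory here.
home_standin=$(mktemp -d)   # assumption: mktemp is not a tutorial command
cd "$home_standin"          # step 1: go to the original directory

mkdir earth597              # step 2: make earth597
cd earth597                 # step 3: go into it ...
mkdir data1 data2           #         ... and make both subdirectories
cd data1                    # step 4: go into data1
cd ../data2                 # step 5: up one level and down into data2
pwd                         # the path now ends in earth597/data2
```

The trick in the last step is that .. can be used as part of a longer path, so "up one and over" fits in a single command.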
Part 3: copying, moving, and deleting files; redirecting output
Commands in Part 3: cat less cp mv rm head tail >
In Part 3 of this tutorial we'll do some simple things to files using unix commands. Watch the movies and then check out the written example below.
Now let's take a text file and mess around with it using unix commands in the terminal window. Here is a link to a plain text file of ten days of aftershocks following the 4 April 2010 Baja California earthquake. Put it in the new directory you called earth597/data1. Go to earth597/data1 and type ls to verify the file is there. Type less baja_neic.txt (in which baja_neic.txt is the actual name of the text file). The file should look like the screenshot below. If your terminal window is too small to show the whole file at once, you will get a black bar at the bottom that tells you what percentage of the file you are seeing. Hit the spacebar and you'll see another chunk of the file. Continue to hit the spacebar until you've seen the whole file and you are back at the terminal prompt (if you reach the end and are not returned to the prompt, type q). Alternatively, if you type cat baja_neic.txt the entire file will scroll by and leave you at the prompt when it's done.
The command head baja_neic.txt shows you exactly the first ten lines of the file. Try it. You can also modify the head command like this:
head -5 baja_neic.txt
The -5 tells head to show the first 5 lines. Showing ten lines is the default when head is given no line count, so the following two commands are equivalent:
head baja_neic.txt
head -10 baja_neic.txt
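To see the default in action without the aftershock file, here is a small sketch that uses a made-up twelve-line file called sample.txt. The printf and mktemp commands used to set it up are my own additions. The sketch spells the count as -n 5, which is the POSIX form of the -5 shorthand.

```shell
# Build a 12-line stand-in file; sample.txt is a hypothetical name,
# not the real aftershock file.
scratch=$(mktemp -d)      # assumption: mktemp is not a tutorial command
cd "$scratch"
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > sample.txt

head sample.txt           # first ten lines (the default)
head -n 10 sample.txt     # same thing, with the count spelled out
head -n 5 sample.txt      # first five lines only
```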
The command tail is similar to head but works on the end of the file instead of the beginning. The commands less, cat, head, and tail return their output to the screen by default but you can also have them create a new file and put their results in it instead. The way to do this is to redirect the output with the > symbol.
For example, do this:
head -5 baja_neic.txt > newfile.txt
and you will create a new text file called "newfile.txt" which contains exactly the first five lines of the original file baja_neic.txt. It is important to note here that performing this command has not changed the original file in any way. You can type ls to verify that you now have two files in your data1 directory. One of them is the original baja_neic.txt and the other one is called newfile.txt and it is a copy of the first five lines of baja_neic.txt. Use the less command to look at your newfile.txt file. Did you get what you were expecting? When the head command counts lines of a file, blank lines are counted just like lines that have text characters in them, so that's why newfile.txt looks the way it does. At this point, if you have been following along, the following three commands should give you identical output:
head -5 baja_neic.txt
less newfile.txt
cat newfile.txt
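Here is a brief sketch of tail and of redirecting with >, again using a made-up stand-in file. The names sample.txt and first5.txt are hypothetical, and printf and mktemp are setup helpers that this tutorial does not cover.

```shell
# Stand-in data in a throwaway directory (file names are hypothetical).
scratch=$(mktemp -d)      # assumption: mktemp is not a tutorial command
cd "$scratch"
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > sample.txt

tail -n 3 sample.txt               # last three lines, printed to the screen
head -n 5 sample.txt > first5.txt  # first five lines go into a new file instead
cat first5.txt                     # the same five lines, read back from the file
ls                                 # both first5.txt and sample.txt are listed
```

Notice that the redirected head command prints nothing to the screen: its output all went into first5.txt, and sample.txt still has all twelve lines.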
Okay, on to the next command of interest. The cp command copies one file to another but instead of using > you just specify the other filename. So, these two commands are equivalent ways of copying the entire file baja_neic.txt to a new file called baja_neic_copy.txt:
cp baja_neic.txt baja_neic_copy.txt
cat baja_neic.txt > baja_neic_copy.txt
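If you would like to convince yourself that the two approaches really produce the same thing, the sketch below makes one copy each way and compares them. It assumes a made-up file named sample.txt and uses cmp, a standard comparison command that this tutorial does not cover.

```shell
# Stand-in data in a throwaway directory (file names are hypothetical).
scratch=$(mktemp -d)      # assumption: mktemp is not a tutorial command
cd "$scratch"
printf 'line %s\n' 1 2 3 4 5 6 > sample.txt

cp sample.txt copy_cp.txt        # copy by naming the destination
cat sample.txt > copy_cat.txt    # copy by redirecting cat's output

# cmp prints nothing when two files are identical
cmp copy_cp.txt copy_cat.txt
cmp sample.txt copy_cp.txt
```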
If you want to rename a file without changing its contents, use mv. Like cp, mv requires two filenames, the previous one and the new one.
mv newfile.txt baja_neic_five.txt
The above command renames the file "newfile.txt" to "baja_neic_five.txt". You can also use mv to change the location of a file. Try typing
mv baja_neic_copy.txt ../data2/baja.txt
This command takes the file "baja_neic_copy.txt" and moves it from the folder data1 to the folder data2 and renames it baja.txt. You can go to data2 (remember how?) and verify there is now a file in there called baja.txt and that it is a duplicate of baja_neic.txt.
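Both uses of mv can be tried safely on scratch files first. This is only a sketch: the directory layout copies the data1 and data2 idea from above, but the file names are invented and the mktemp and printf setup lines are my own additions.

```shell
# Throwaway data1 and data2 directories with one small made-up file.
scratch=$(mktemp -d)      # assumption: mktemp is not a tutorial command
cd "$scratch"
mkdir data1 data2
cd data1
printf 'line %s\n' 1 2 3 > old_name.txt

mv old_name.txt new_name.txt          # rename in place
ls                                    # only new_name.txt is listed

mv new_name.txt ../data2/moved.txt    # move to data2 and rename in one step
ls ../data2                           # moved.txt is listed
```

Unlike cp, mv does not leave the original behind, so after the last command data1 is empty.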
Another cool use of the cat command is to stick two or more files together and make one file. So,
cat baja_neic.txt baja_neic_five.txt > baja2.txt
will make a file called baja2.txt which is a copy of baja_neic.txt with a copy of baja_neic_five.txt (the file we renamed from "newfile.txt" earlier) appended to the bottom.
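The order of the filenames given to cat is the order in which they end up in the combined file. Here is a tiny sketch with two invented files, a.txt and b.txt, created with printf (a setup helper not covered in this tutorial).

```shell
# Two small made-up files in a throwaway directory.
scratch=$(mktemp -d)      # assumption: mktemp is not a tutorial command
cd "$scratch"
printf 'A%s\n' 1 2 3 > a.txt     # three lines: A1, A2, A3
printf 'B%s\n' 1 2 > b.txt       # two lines: B1, B2

cat a.txt b.txt > joined.txt     # a.txt first, then b.txt
cat joined.txt                   # A1 A2 A3 B1 B2, one per line
```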
Now try this!
Make a new file that is composed of the last ten lines followed by the first ten lines of baja_neic.txt.
Try it yourself first and then you can check how I made the new file according to my instructions above.
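If you want a hint, here is one possible approach that sticks to commands from this page. It is a sketch run on a made-up 25-line file called sample.txt rather than on baja_neic.txt, all of the file names are hypothetical, and there are other ways to get the same result.

```shell
# A 25-line stand-in file in a throwaway directory.
scratch=$(mktemp -d)      # assumption: mktemp is not a tutorial command
cd "$scratch"
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 > sample.txt

tail sample.txt > last10.txt      # last ten lines (lines 16 to 25)
head sample.txt > first10.txt     # first ten lines (lines 1 to 10)

# Stick the two pieces together, last ten first.
cat last10.txt first10.txt > tail_then_head.txt
```

To adapt it to the exercise, put baja_neic.txt in place of sample.txt.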