Unix Tutorial #3: Reading Text Files
Topics covered: File manipulation, redirection, streams, stdin, stdout, stderr
Commands used: cat, less, head, wc
The command line is useful for both viewing and manipulating text files. Manipulation means editing text: for example, replacing words in a text file, or appending text from the command line to the end of a file (which is done through redirection). These techniques are useful for creating scripts, which are text files containing one or more commands that are run consecutively. In later tutorials, you will use them to automate your analyses, which can save enormous amounts of time.
You can display the contents of a file using the cat command, which stands for concatenate. Let’s say we have a file on our Desktop called myFile.txt, which contains the words one through fifteen (i.e., one, two, three…fifteen), with each number on a separate line. Use the command line to navigate to the Desktop, and then type
cat myFile.txt
This will print the contents of the file to your command line. This is the same idea as using the GUI to double-click on the text file to see its contents.
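If you do not yet have myFile.txt, the following is one possible way to create it from the command line before running cat. It is only a sketch under two assumptions: it works in a temporary scratch directory rather than on the Desktop (so that no existing file is overwritten), and it uses printf together with the > operator, which is explained later in this tutorial.

```shell
# Assumption: work in a throwaway directory instead of the Desktop
cd "$(mktemp -d)"

# Write the words one through fifteen, one per line, into myFile.txt
printf '%s\n' one two three four five six seven eight nine ten \
  eleven twelve thirteen fourteen fifteen > myFile.txt

# Print the whole file to the Terminal
cat myFile.txt
```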
We refer to the output from this command as stdout, or standard output. The text that you type into the Terminal is called stdin, or standard input. This touches on the concept of streams, or the flow of information into and out of the command line, and we will use these ideas to give us more flexibility in manipulating text files. For now, think of stdin as anything you type into the Terminal, and stdout as what is returned if the command is run without any errors. If the command that you type does result in an error (for example, because the command was misspelled or because not enough arguments were provided), the error message that appears in the Terminal is sent through a separate stream called stderr, or standard error.
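One way to see that stdout and stderr really are separate streams is to send each one to its own file. The sketch below goes slightly beyond the text: the 2> operator (which redirects stderr) is not covered in this tutorial, and the file names out.txt, empty.txt, err.txt, and noSuchFile.txt are made up for illustration.

```shell
# Assumption: work in a throwaway directory with a small sample file
cd "$(mktemp -d)"
printf 'one\ntwo\nthree\n' > myFile.txt

# A successful command writes its result to stdout; > captures that stream
cat myFile.txt > out.txt

# A failing command writes its message to stderr; 2> captures that stream.
# Nothing arrives on stdout, so empty.txt is created but stays empty.
# "|| true" only keeps the example going after cat reports the failure.
cat noSuchFile.txt > empty.txt 2> err.txt || true

# Show what ended up in each file
echo "stdout went to out.txt:"
cat out.txt
echo "stderr went to err.txt:"
cat err.txt
```

The exact wording of the error message in err.txt depends on your system, so it is not shown here.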
The cat command is useful for viewing the contents of smaller files, but if the file contains hundreds of lines of text, it is overwhelming to have everything printed to the Terminal at once. To see only a part of the file, we can use the commands head and tail to see the first few or the last few lines of the file, respectively. Using myFile.txt as an example, typing
head -5 myFile.txt
Would return the first five lines; whereas typing
tail -5 myFile.txt
Would return the last five lines. If you leave out the number, both commands return ten lines by default, and the number after the dash lets you display any number of lines that you choose. For example,
head -10 myFile.txt
tail -10 myFile.txt
Would return the first ten lines and the last ten lines, which matches the default. Try these out yourself, changing the number of lines that are displayed.
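The behavior of head and tail can be checked with a short, self-contained sketch. It recreates myFile.txt in a scratch directory (an assumption, so that nothing on the Desktop is touched) and uses the -n form of the option, which is an equivalent way of writing the line count. The variable names are only for illustration.

```shell
# Assumption: work in a throwaway directory with a fresh copy of myFile.txt
cd "$(mktemp -d)"
printf '%s\n' one two three four five six seven eight nine ten \
  eleven twelve thirteen fourteen fifteen > myFile.txt

# First five and last five lines of the file
first_five=$(head -n 5 myFile.txt)
last_five=$(tail -n 5 myFile.txt)

# With no number given, head returns ten lines; wc -l counts them
default_count=$(head myFile.txt | wc -l)

echo "First five lines:"
echo "$first_five"
echo "Last five lines:"
echo "$last_five"
echo "Lines returned by default:" $default_count
```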
In addition to being displayed in the Terminal, stdout can be sent to a file, either replacing the file's contents or appending to them, a concept known as redirection. For example, if you type
echo sixteen > tmp.txt
The word “sixteen” goes into the file tmp.txt instead of being displayed in the Terminal. Notice that this creates the file tmp.txt if it doesn’t already exist. However, if we try that again with another string, for example,
echo seventeen > tmp.txt
It will overwrite the file with whatever we printed to standard output. If you want to append standard output to the end of a file without overwriting the other data in the file, use two greater-than signs. For example, type
echo eighteen >> tmp.txt
If you now type
cat tmp.txt
You will see both seventeen and eighteen.
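The three redirection commands above can be run in sequence to watch the file change at each step. This is a minimal sketch that assumes a scratch directory, so that any tmp.txt you already have is left alone.

```shell
# Assumption: work in a throwaway directory so no existing tmp.txt is touched
cd "$(mktemp -d)"

# > creates tmp.txt (it does not exist yet) and writes "sixteen" into it
echo sixteen > tmp.txt

# > again replaces the previous contents, so "sixteen" is gone
echo seventeen > tmp.txt

# >> adds a new line to the end instead of overwriting
echo eighteen >> tmp.txt

# The file now holds seventeen and eighteen, one per line
cat tmp.txt
```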
Although these examples are trivial, redirection is invaluable for quickly editing text files and for writing scripts, which allow you to run analyses for hundreds or thousands of subjects with only a few lines of code.
Click here for a video walkthrough of commands for reading text files. This video will also show you how to read help files using the less command and a paging window.
- Create a new file called “tmp.txt” and type whatever you want into the file. Use cat to string together both the myFile.txt and tmp.txt files, and redirect the output to create a new file. Print the contents of the new file to stdout.
- If you have AFNI installed on your machine, use less on the command 3dcalc to find strings matching “Example.” Now try it using the less command with an option to ignore whether the letters in the string are upper case or lower case. Hint: To find this option, search for the string “case” in the help text for less. (If you have FSL installed instead of AFNI, try the same exercise with one of the FSL commands.)
- Unix has a built-in command called sort, which will sort text numerically or alphabetically. What happens when you use myFile.txt as an argument for sort? What about typing this command:
cat myFile.txt | sort
In your own words, explain the difference between the two methods.