ls to list the contents of the directory you are in
pwd to print your working directory
mkdir shell_excercises to make a directory called shell_excercises
cd shell_excercises to change directory into the shell_excercises directory
pwd again to confirm the directory you are in
cd .. to go up one directory from your current directory
ls to navigate around your university home directory and find your files
cd ~ to get back to your home directory (where you started when you first logged in)
Before you start, change directory to the shell_excercises directory you created in the exercises above.
touch my_example_file.txt to create an empty file
nano my_example_file.txt to open the file in the text editor
cat my_example_file.txt to show the contents of the example file at the prompt
cp my_example_file.txt second_file to create a copy of the file
cat second_file to confirm the contents are the same
mv second_file second_file.txt to move (rename) the file
rm second_file.txt to remove the file; it's a duplicate anyway
wget http://www.soton.ac.uk/~pm5c08/unix/student_marks.tsv to download a marks spreadsheet
wc student_marks.tsv to see the number of lines, words and characters in the file
grep Wheeler student_marks.tsv to find the marks for Patrick McSweeney. Think about other patterns you could use to produce the same result
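As a sketch of alternative patterns, here are a few grep variants run against a small made-up marks file (the usernames and the two-column layout are assumptions; the real student_marks.tsv may differ):

```shell
# Made-up sample data: username, then mark, tab-separated
printf 'pw1g08\t62\npm5c08\t71\nab1c08\t55\n' > sample_marks.tsv

grep 'pm5c08' sample_marks.tsv      # plain substring match
grep -i 'PM5C08' sample_marks.tsv   # -i ignores case
grep '^pm5c08' sample_marks.tsv     # ^ anchors the match to the line start
```

All three print the same line here, but the anchored form is the safest: a plain substring could also match inside a different username or another column.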
find ~ to list every file in your home tree. What would you do to list every file in /tmp?
find ~ | grep doc to list all the Word documents in your home tree. Think about what the shortcomings of this pattern might be
man grep to get some ideas for how you might improve the pipeline in question 3
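One possible improvement, sketched on a made-up directory tree rather than your real home tree: match on the filename pattern with find's -name test instead of grepping the whole path, since `grep doc` also matches names like doctor_who.txt or any directory with "doc" in its path.

```shell
# Made-up example tree to demonstrate the difference
mkdir -p demo_tree/notes
touch demo_tree/report.doc demo_tree/notes/thesis.docx demo_tree/doctor_who.txt

# Matches only files ending in .doc or .docx, not doctor_who.txt
find demo_tree -name '*.doc' -o -name '*.docx'
```

Running `find demo_tree | grep doc` on the same tree would also list doctor_who.txt, which is the shortcoming the exercise hints at.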
head -n30 student_marks.tsv to see the top 29 students' marks. Now use
head -n30 student_marks.tsv > 29students.tsv to make a .tsv of just the top 29 students. Why is it only the top 29 students?
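The answer hinges on head counting the header row too, so 30 lines hold only 29 students. This can be seen at a smaller scale on a made-up file (the column layout is an assumption about student_marks.tsv):

```shell
# Made-up file: a header row plus five students
printf 'username\tmark\n' > demo.tsv
printf 'user1\t10\nuser2\t20\nuser3\t30\nuser4\t40\nuser5\t50\n' >> demo.tsv

head -n3 demo.tsv    # prints 3 lines: the header plus only 2 students
```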
cut -f1 student_marks.tsv to see a list of all the students' usernames. From this starting point, add sort and uniq to the pipeline to create a list of usernames sorted alphabetically with the duplicates removed.
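A sketch of the finished pipeline, run on a small made-up file with a duplicate row (uniq only removes adjacent duplicates, which is why sort must come first):

```shell
# Made-up marks file containing a duplicate entry for alice
printf 'carol\t60\nalice\t70\nbob\t65\nalice\t70\n' > dup_marks.tsv

cut -f1 dup_marks.tsv | sort | uniq
# prints:
# alice
# bob
# carol
```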
Two PhD demonstrators have been marking some coursework. They have been sharing a combined marks spreadsheet for the class by email rather than using a shared drive. At some point they became confused and have both been adding to different spreadsheets that started from the same email. To add to the confusion, one of them sorted the spreadsheet by username while the other left it in its original order. This is now your problem to fix. Write some shell scripts to create a copy of the original spreadsheet they both started from and a copy of each demonstrator's own marks. The spreadsheets can be found at http://www.soton.ac.uk/~pm5c08/unix/demonstratorA.tsv and http://www.soton.ac.uk/~pm5c08/unix/demonstratorB.tsv
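One possible approach, sketched on made-up files rather than the real demonstratorA.tsv and demonstratorB.tsv: once both files are sorted, comm splits them into the lines common to both (the original spreadsheet) and the lines unique to each file (each demonstrator's own marks).

```shell
# Made-up stand-ins for the two demonstrators' spreadsheets
printf 'alice\t70\nbob\t65\ndave\t50\n' > a.tsv   # A's copy, with dave added
printf 'alice\t70\nbob\t65\nerin\t80\n' > b.tsv   # B's copy, with erin added

# comm requires sorted input
sort a.tsv > a.sorted
sort b.tsv > b.sorted

comm -12 a.sorted b.sorted > original.tsv   # lines in both files
comm -23 a.sorted b.sorted > a_only.tsv     # lines only in A's copy
comm -13 a.sorted b.sorted > b_only.tsv     # lines only in B's copy
```

Sorting both inputs also sidesteps the fact that one demonstrator re-sorted their copy: comm compares the sets of lines, not their original order.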
Write a Python script that reads TSV data from the standard input and calculates the average of the marks fed into it. It should print the average to the standard output.
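A minimal sketch of such a script, assuming the mark sits in the second tab-separated column (adjust the index for the real spreadsheet layout); header rows and malformed lines are skipped:

```python
#!/usr/bin/env python3
"""Read TSV lines from stdin and print the average of column 2."""
import sys


def average(lines):
    """Return the mean of the numeric values in the second tab field."""
    marks = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        try:
            marks.append(float(fields[1]))
        except ValueError:
            continue  # skip header rows and non-numeric marks
    return sum(marks) / len(marks) if marks else 0.0


if __name__ == "__main__":
    print(average(sys.stdin))
```

It would be used at the end of a pipeline, e.g. `cat student_marks.tsv | python3 average.py` (the script name is made up).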
Create a pipeline which takes every spreadsheet (there should be 6) now in your shell_excercises directory and computes the class average. The script should take into consideration that some marks will be duplicated across different files. You should then also write a pipeline which tells you how many marks were used to calculate the average. Finally, write a pipeline to spot students whose work has been marked twice by accident. How many students does this affect?
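The deduplication and double-marking parts can be sketched on made-up files (two files instead of six, and an assumed username-then-mark layout): concatenate everything, drop exact duplicate rows with sort -u, then look for usernames that still appear more than once.

```shell
# Made-up stand-ins for the marks spreadsheets; bob's row is duplicated
# across files, while carol has two different marks (marked twice)
printf 'alice\t70\nbob\t65\n' > s1.tsv
printf 'bob\t65\ncarol\t60\ncarol\t58\n' > s2.tsv

cat s1.tsv s2.tsv | sort -u > all_marks.tsv   # unique mark rows
wc -l < all_marks.tsv                         # how many marks were used
cut -f1 all_marks.tsv | sort | uniq -d        # usernames marked twice
```

The class average would come from piping all_marks.tsv into the Python averaging script from the previous exercise.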