New Mexico Tech
Earth and Environmental Science

ERTH 401 / GEOP 501 - Computational Methods & Tools

Lab 10: Unix II--awk, grep, and sed

You will practice using awk, grep, and sed command line tools in today's lab.

Task 1: Basic awk exercises using the command line

We will work with the file called event_list.txt (a list of earthquakes around the world over the last month) for the basic exercises in Task 1. Copy your command line for each part of Task 1 into a new file called lab10_ex1.txt and turn this file in as part of this lab so we can check that your commands work.

1a) Reformat the file so that it contains only the following columns, ordered as:

time, latitude, longitude, depth, magnitude

1b) Print columns 1, 2, and 4 of the original file using "/" as a field separator.

1c) Print the record number and length of each line.

1d) Print time (column 6) if rms (measure of location error in column 3) < 0.2.

1e) Using only awk, pull out the events that are located within this region of Oklahoma:

max_lat = 37.
max_lon = -96.
min_lat = 34.5
min_lon = -100.
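
The awk features used in Task 1 (field selection, separators, built-in variables, and conditions) can be sketched on a small toy file. This is only an illustration under stated assumptions: the station data below is made up and has nothing to do with event_list.txt, and the column numbers and thresholds are hypothetical.

```shell
#!/bin/sh
# Hypothetical toy data (NOT event_list.txt): station, elevation_m, temp_C
demo=$(mktemp)
cat > "$demo" <<'EOF'
SOC,1400,21.5
ABQ,1620,19.0
TAO,2120,12.3
EOF

# -F sets the input field separator; fields can be printed in any order.
reordered=$(awk -F, '{print $3, $1}' "$demo")
echo "$reordered"

# OFS is the separator awk puts between comma-separated print arguments.
slashed=$(awk -F, -v OFS="/" '{print $1, $2}' "$demo")
echo "$slashed"

# NR is the current record number; length($0) is the length of the whole line.
lengths=$(awk '{print NR, length($0)}' "$demo")
echo "$lengths"

# A condition in front of the action selects lines; -v passes values into awk.
# Conditions can be combined with && (and) and || (or).
in_range=$(awk -F, -v lo=1500 -v hi=2000 '$2 >= lo && $2 <= hi {print $1}' "$demo")
echo "$in_range"

rm -f "$demo"
```

Passing limits in with -v keeps the numbers out of the awk program itself, which makes a command easier to reuse with a different range.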

Task 2: Using grep and sed

Copy your command line for each part of Task 2 into a new file called lab10_ex2.txt and turn this file in as part of this lab so we can check that your commands work.

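2a) Use the file CDVV.pbo.nam08.csv and find all lines that contain -9999. Note that grep will interpret a pattern that starts with "-" as a command line option ("-" also indicates a range of values inside bracket expressions in the regular expressions that grep uses), so you'll have to figure out how to "escape" it.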

2b) Use sed to replace all occurrences of -9999 in that same file with NaN and save the result to a new file. Test with grep that the replacement was actually made, such that you get:

2005-11-28,0.55, NaN, 9.21, 1.54, 1.41, 5.88, repro,
2005-12-11,0.01, -0.49, NaN, 1.05, 0.96, 4.01, repro,
2006-01-07,NaN, NaN, NaN, NaN, NaN, NaN, repro,
2006-04-10,NaN, -2.58, 10.65, NaN, 1.49, 6.11, repro,
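
The replace-then-verify pattern from 2b can be sketched on toy data. This is a minimal illustration, not the solution: the file contents are made up, and "N/A" is a hypothetical missing-value flag standing in for the one in the real file.

```shell
#!/bin/sh
# Hypothetical toy file (NOT CDVV.pbo.nam08.csv); "N/A" is a made-up flag.
demo=$(mktemp)
cleaned=$(mktemp)
cat > "$demo" <<'EOF'
2020-01-01,0.5, N/A, 1.2
2020-01-02,N/A, 0.3, N/A
2020-01-03,0.1, 0.2, 0.3
EOF

# s|old|new|g replaces every occurrence on each line; without the trailing g
# only the first occurrence per line is replaced. Using | as the delimiter
# avoids having to escape the / inside "N/A".
sed 's|N/A|NaN|g' "$demo" > "$cleaned"

# grep -c counts matching lines. It exits non-zero when the count is 0,
# so "|| true" keeps that case from looking like a failure.
before=$(grep -c 'N/A' "$demo")
after=$(grep -c 'N/A' "$cleaned" || true)
nan_lines=$(grep -c 'NaN' "$cleaned")
echo "lines with N/A before: $before, after: $after; lines with NaN: $nan_lines"

rm -f "$demo" "$cleaned"
```

Writing to a new file and leaving the original untouched makes it easy to compare the two with grep afterwards.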

Plot both files using the Python script plot_gps_data.py as a basis. You will note that this is a finished version of the script we started last Monday in class during the coding session. We'd like you to explain the line:

dates    = [datetime.strptime(x, '%Y-%m-%d') for x in data['date']]

in your solution. What's happening here? You may have to look up list comprehensions.

Turn in both plots for this section in addition to your lab10_ex2.txt file.

Task 3)

Imagine now that you have a glacio-isostatic modeling code called TABOO, which is run with taboo.sh on the command line. Each time you run it, it asks you for parameter settings on the command line (a lot of older codes work that way). In your experiments you're interested in crustal uplift and how it changes depending on mantle viscosity and elastic plate thickness. You know the elastic parameters (Young's modulus, Poisson ratio) quite well, so those stay the same. Now, imagine each run of this code takes hours, the weekend is coming up, and you're going to a conference soon. Do you really want to spend the weekend in the lab just so you can restart the simulation with new parameter values? (Sorry, you can't occupy all the machines on campus with this; it's installed on only one dedicated machine. You could run a few jobs in parallel, but not all of them.) Well, you could write a parameter file that sets up the code:

[denali:computing_tools/labs/lab10] rn% taboo.sh < taboo.template
This really is just a dummy script to simulate an actual simulation code. 
Let's start with reading the input:
Set Start Routine: 
Set Young's Modulus (GPa): 
Get Elastic Plate Thickness (km): 
Set Poisson Ratio: 
Get Mantle Viscosity (^19 Pa/s): 
Save Logs? (0/1):  

--------------------------

starting the run with: 
start_routine      = Make_Model
youngs_modulus     = 70
poisson_ratio      = 0.25
elastic_thickness  = THICKNESS
mantle_viscosity   = VISCOSITY
save_logs          = 1

Here, I call this taboo.template. You'll notice that THICKNESS and VISCOSITY are placeholders for the actual values. So, your job is to write a shell script that fills in these placeholders and runs the code for each combination of values.

Running your script should produce the following folders and files:

[denali:computing_tools/labs/lab10] rn% ls taboo/
Alaska05_25_1   Alaska05_25_300 Alaska05_30_30  Alaska05_35_3   Alaska05_40_100 Alaska05_45_10  Alaska05_50_1   Alaska05_50_300 Alaska05_55_30  Alaska05_60_3
Alaska05_25_10  Alaska05_30_1   Alaska05_30_300 Alaska05_35_30  Alaska05_40_3   Alaska05_45_100 Alaska05_50_10  Alaska05_55_1   Alaska05_55_300 Alaska05_60_30
Alaska05_25_100 Alaska05_30_10  Alaska05_35_1   Alaska05_35_300 Alaska05_40_30  Alaska05_45_3   Alaska05_50_100 Alaska05_55_10  Alaska05_60_1   Alaska05_60_300
Alaska05_25_3   Alaska05_30_100 Alaska05_35_10  Alaska05_40_1   Alaska05_40_300 Alaska05_45_30  Alaska05_50_3   Alaska05_55_100 Alaska05_60_10
Alaska05_25_30  Alaska05_30_3   Alaska05_35_100 Alaska05_40_10  Alaska05_45_1   Alaska05_45_300 Alaska05_50_30  Alaska05_55_3   Alaska05_60_100

[denali:computing_tools/labs/lab10] rn% ls taboo/Alaska05_25_1
taboo.input  taboo.output

[denali:computing_tools/labs/lab10] rn% cat taboo/Alaska05_25_1/taboo.input 
Make_Model
70
25
0.25
1
1

[denali:computing_tools/labs/lab10] rn% cat taboo/Alaska05_25_1/taboo.output 
This really is just a dummy script to simulate an actual simulation code.
Let's start with reading the input:
Set Start Routine: 
Set Young's Modulus (GPa): 
Get Elastic Plate Thickness (km): 
Set Poisson Ratio: 
Get Mantle Viscosity (^19 Pa/s): 
Save Logs? (0/1):  

--------------------------

starting the run with: 
start_routine      = Make_Model
youngs_modulus     = 70
poisson_ratio      = 0.25
elastic_thickness  = 25
mantle_viscosity   = 1
save_logs          = 1

Turn in only the script you produced for this task. We expect it to behave as described above and to run without any command line parameters.
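
The general idea of filling placeholders in a template and running a program once per parameter combination can be sketched as follows. Everything here is a hypothetical stand-in: demo.template, the ALPHA and BETA placeholders, the run_* folder names, and the parameter values are made up, and cat takes the place of a real simulation code that reads its settings from standard input.

```shell
#!/bin/sh
# Hypothetical names throughout; `cat` stands in for a real simulation code.
work=$(mktemp -d)
cat > "$work/demo.template" <<'EOF'
Make_Model
ALPHA
BETA
EOF

for a in 1 2; do
    for b in 10 20; do
        run="$work/run_${a}_${b}"
        mkdir -p "$run"
        # Two -e expressions fill in both placeholders in one pass.
        # Double quotes let the shell expand $a and $b inside the sed program.
        sed -e "s/ALPHA/$a/" -e "s/BETA/$b/" "$work/demo.template" > "$run/demo.input"
        # Feed the filled-in file to the program on stdin and save its output.
        cat < "$run/demo.input" > "$run/demo.output"
    done
done

n_runs=$(ls "$work" | grep -c '^run_')
second_line=$(sed -n '2p' "$work/run_2_10/demo.input")
echo "$n_runs runs; line 2 of run_2_10/demo.input is $second_line"

rm -rf "$work"
```

Keeping each run in its own folder means the input that produced a given output is always sitting next to it.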

rg <at> nmt <dot> edu | Last modified: October 30 2017 19:39.