Notes on Configuring BINDER_EW: Earthworm's Phase Associator

by Lynn Dietz, July 2002
  Given a list of P-wave arrival times, a set of station locations and a 
simple velocity model, binder's job is to identify the smallest set of 
hypocenters that could have produced the current list of arrival times.  
With each new pick, binder tries to:

  1) associate that arrival to an active event, or 

  2) identify a new event by stacking the back-projections of P-wave 
     travel times on a space-time grid to find a likely source (hypocenter) 
     for recent unassociated picks.  

  Sounds simple, right?  Well, there are a lot of configurable parameters
that control exactly how binder will go about this task.  I'll try
to explain the basic steps I go thru when I'm setting up binder for the
first time in a new region of interest.  And I'll give a few details 
about how binder goes about its tasks (refer to this flowchart as we go).


Getting Started: Earthworm Setup & Network Definition
-----------------------------------------------------
The three most basic things binder needs to know are:
1. Where should it get P-wave arrivals (picks) to work on?
2. What stations are being picked and where are they?
3. What's the velocity structure like?

1. P-wave arrivals:
  Two commands in binder's configuration file tell it where it should get
the P-wave arrivals to work on.  The "RingName" command specifies which 
Earthworm transport ring (a shared memory region) to read from.  Make sure
that this is the same ring into which pick_ew is writing its output.  Also, 
if you're importing picks from another source and want binder to use them, 
make sure you are placing them in this ring as well.  
  The "GetPicksFrom" command specifies the installation id and module id of 
the TYPE_PICK2K messages that binder will operate on.  Specific id's or 
wildcards can be used for either or both of these arguments.  Note: both 
binder_ew and eqproc should be set to listen to the same pick source(s)! 
  
2. Station List:
  Binder expects the network's station list to be in a file in Hypoinverse
station format; it gets the name of the file from the "site_file" command.
Of all the fields in this file, the only ones used by binder are the 
station codes (5-letter site code, 2-letter network code, 3-letter component
code) and the station latitude and longitude.  A quirk of this file is that
all lats and lons are listed as positive numbers; a letter is used to denote 
N,S,E,W, with blanks interpreted as N and W.  As binder reads the file, it
converts lat and lon to decimal degrees with N and E being positive.  If a 
site has multiple components, each component should be listed on its own line 
in the station file. As of Earthworm v4.0, binder treats each component 
independently, so if multiple components at a site are picked, that site 
will be weighted more heavily in stacks and locations than a single-component 
site. 
  Large networks (over 1000 channels) will have to use the "maxsite" 
command to increase the number of stations that binder will input and use.
  Since binder works only on arrival times, it doesn't need to know anything
about instrument response or gain.
  
3. Velocity Model:
  Binder uses one simple velocity model to calculate travel times for the 
entire region of interest.  If the velocity structure varies greatly in
your region, make sure you give binder a decent average model.  This 
velocity model is specified with a series of "lay" commands in the 
configuration file, listed in order of increasing depth.  The velocities 
in each layer are constant, and should increase with depth (no low velocity 
zones, please).  Pay attention to the order of the arguments of the 
lay command.  It should be "lay depth(km) velocity(km/s)".  If you get
depth and velocity swapped, binder won't do much of anything.  
  Binder also uses one P/S velocity ratio; the default is 1.72, but it can 
be changed by using the "psratio" command in the configuration file.


Stacking - The Guts of Binder
-----------------------------
  How does binder identify new earthquakes, or what does "stacking" mean? 
Binder needs to find the likely origin, in space and time, of a group of 
P-wave arrivals. Given a single P-wave arrival (call it the "initiating pick"), 
binder considers the entire volume of the earth is a potential hypocenter. 
The addition of a second P-wave arrival gives binder information on relative 
timing; so it can narrow the field of possible hypocenters to a 3-dimensional 
surface. To determine this surface, binder essentially takes the P-wave 
travel-time curve, puts the curve's origin at the station's location, flips 
the curve upside down (about a horizontal axis), shifts it in depth 
considering the relative timing between the two picks, and spins the curve 
around a vertical axis.  This creates a conic-like surface (somewhat like a 
hollyhock flower opening down) that passes through all possible hypocenters 
which could have produced that second arrival, given the timing of the first. 
Binder calculates the "hollyhock" of potential hypocenters for each subsequent 
pick in the same manner, always using to the initiating pick's time as the 
reference time.  When the "hollyhocks" from two arrivals overlap, you get a 
line of potential hypocenters that is consistent with both picks and the 
initiating pick. The point where three or more individual "hollyhocks" 
intersect is a unique hypocenter (stacking location) that can explain all 
of those arrivals, including the initiating pick.  At this stage, with four 
P-wave arrival times, binder has successfully "stacked" a new "active event."  
It associates (or binds) these four picks into an event and passes those 
picks and the stacking location to its simple L1 locator for further 
processing.

  Instead of considering an infinite number of possible hypocenters, binder
discretizes the problem by using a stacking grid.  Within its boundaries,
the grid contains uniformly-sized cubic cells.  Only the center point of 
each grid cell is considered a possible hypocenter in the stacking process 
described above. Below I'll describe the parameters that are used to define 
the stacking grid and control the stacking process.

  While setting up the stacking grid, I find it very useful to look at a 
map of the seismic network and historic seismicity in the region of interest.
The things I look for are:

  + seismic network: overall distribution & station spacing
  + seismicity: latitude, longitude & depth ranges


Stacking Grid Boundaries:
The boundaries of the stacking grid are set with three commands: 
  "grdlat" (latitude range in decimal degrees, positive = north), 
  "grdlon" (longitude range in decimal degrees, positive = east), 
  "grdz"   (depth range in kilometers, positive = down).  
Reviewing the map of historic seismicity may help in setting the stacking 
grid boundaries.  Binder can only identify new hypocenters that lie within  
these boundaries, so they should include the entire volume of interest (ie, 
wherever you expect or hope to see earthquakes).  Also, all of your seismic 
stations should lie within the grid boundaries.

Stacking Grid Cell Size:
  Now you must decide how big each cell within the grid will be. Each cell
is a cube, and its side-length (in kilometers) is set with the "dspace" 
command. That map may be useful again; hopefully, you can set dspace so that
only one station falls within each grid cell. The more stations that fall 
within one cell, the more trouble binder will have in "focusing" the stack 
(when 2 stations are in the same grid cell, their "hollyhocks" will overlap 
100%, which doesn't do much good for resolving potential hypocenters). Then 
again, you don't want to make the cell size too small, or binder will have 
a lot more computations to do.  So there's a tradeoff here.  FYI, the 
Northern California Seismic Network uses a grid cell size of 4 km.

Stacking Distance Constraints:
  When stacking each additional pick, binder considers possible hypocenters 
for that arrival within a limited distance of the station, not the entire 
stacking grid.  The stacking distance is controlled by the "rstack" command.  
Binder computes each pick's "hollyhock" within its stacking volume. The
stacking volume includes the entire depth range of the stacking grid, but in
map view the stacking volume is a square centered on the station location, 
extending rstack kilometers in each horizontal direction (the square's sides 
are 2*rstack km long). To avoid roundoff problems, rstack should be a multiple 
of the grid-cell dimension, dspace.  
  Once again, the network map will be handy for choosing an appropriate rstack. 
Consider your network's station spacing and distribution - phases from stations 
that are more than 2*rstack km apart will not stack together other because 
their stacking volumes will not overlap at all.  Also consider your expected
seismicity patterns - binder can only identify new hypocenters where the 
stacking volumes of 4 or more stations overlap.  Make sure that your expected 
seismicity falls within such a region, increasing rstack if necessary.  After
an event is stacked, binder's L1 locator may move the hypocenter beyond the
stacking volume, and possibly even outside the stacking grid boundaries.
But, you have to stack it first!

Stacking and Time Resolution:
  Binder includes a time-resolution factor ("tstack" command) in calculating 
the set of all possible hypocenters ("the hollyhock") for each pick.  This 
time factor controls the thickness of the hollyhock petals. Increasing tstack 
could increase the number of grid cells each pick's hollyhock. As a first 
approximation, tstack should be at least the number of seconds it takes for 
a P-wave to cross a grid cell of dspace km.  

Pre-Stack Glitch Filtering:
  A glitch is a group of coincident arrivals often caused by noisy telemetry.  
For the purposes of this filter, a glitch is defined as a certain number of 
picks (m) occurring in a given number of seconds (x). The values of both m 
and x are configurable with the "define_glitch" command (the default definition 
is 4 picks within 0.035 seconds).  This glitch filter was added to the stacking 
portion of binder to prevent it from creating new events from a group of picks 
having nearly identical arrival times. Picks that are flagged as belonging to 
a "glitch" are not used in summing the stack, but may later be associated with 
an event.  If you want to turn off pre-stack glitch filtering (ie, to make 
binder stack every pick), set m to zero (example "define_glitch 0 0.0"). 

Summing the Stack - Stacking Weights & Event Thresholds:
  OK, here we go. Binder gets a new pick and it doesn't associate with any
of its currently active hypocenters.  Binder will now attempt a stack with
that new pick as the "initiating pick".  First it gathers up all recent
unassociated picks (binder keeps track of the last 1000 picks it has 
processed, but it only tries to stack with a limited number, set with the
"stack" command).  Then it reviews that list of unassociated picks to see
if there are any glitches (see section above). Any picks that belong to a 
glitch will be eliminated from the stacking process. If the initiating pick 
itself belongs to a glitch, binder will not attempt a stack at all.
  Now let's talk about stacking weights. The stacking weight of each
pick is based on the quality assigned to that pick by the pick_ew process. 
By default, binder stacks only quality=0 picks with a weight=1. Using "grid_wt" 
commands, you can include other quality picks in the stacking process. You could 
also give high-quality picks greater weight in the stack than low-quality 
picks. This differential weighting is especially useful for large networks like 
NCSN; we want to avoid generating events based on a few low-quality picks, but 
will accept an event that's based on a larger number of low-quality picks (or 
a combination of high-and low-quality picks). For small networks (10s of stations), 
I would suggest using equal stacking weight for all quality picks.  Just be
sure to use a "grid_wt" command for each pick quality (including 0) that you 
want binder to include in the stack. 
  The "grid_wt_instid" is a new twist (June 2002) to the stack weighting scheme.
It allows you to set the stacking weight of a pick based on its pick quality 
and its installation_id.  For example, this allows a user to weight a quality 
2 pick from one installation differently than a quality 2 pick from a different 
installation.  This is only useful for networks that exchange pick data with 
other institutions. 
  To sum the stack, each grid cell is assigned a "hit count" to keep track of 
how many picks that cell is a potential hypocenter for (how many "hollyhocks" 
that cell is a member of). During the stack, binder only keeps track of the 
grid cells within the initiating pick's stacking volume. At the beginning of a
stack, the hit count of every cell in the initiating pick's stacking volume is
set to initiating pick's stack weight.  Then binder calculates the hollyhock
for each additional pick; for each grid cell that lies within the new pick's 
hollyhock and the initiating pick's stacking volume, the hit count in 
incremented by the stack weight of the new pick.
  After binder sums the stack using all picks in its short list, it reviews 
the hit counts of all the grid cells. Binder prepares a list of cells that 
share the maximum hit value and the same minimum distance to the initiating 
phase. Binder declares a new hypocenter from the stack only if the maximum 
hit count is greater than or equal to an event threshold value (set with the 
"thresh" command) and the number of cells in the list is less than an
event focus value (set with the "focus" command). The trial hypocenter for 
the declared event is calculated by averaging the center positions of all 
the cells in the maximum-hit-count list.  Each pick that contributed to that
maximum hit count is associated with the new event.
  It's very important that at least 4 phases contribute to declaration of a
newly stacked earthquake.  If all picks are being weighted equally (no 
"grid_wt" commands were used), the event threshold ("thresh" command)should 
be set to 4 or higher. If differential stacking weights are being used, the 
event threshold should be greater than 3 times the maximum stacking weight 
specified by "grid_wt" commands. 
  Finding the optimum value for the event focus value ("focus" command) 
may require considerable experimentation for any specific network. The 
optimal value will depend somewhat on the stacking grid granularity ("dspace" 
command). Since the trial hypocenter is calculated by averaging the center 
positions of all the cells in the maximum-count list, a larger focus threshold
will increase the probability of a poor starting location (which has its own 
attendant problems). A smaller focus value, however, will tend to reject 
distant events. 
       
Stacking and Eventids:
  Every time binder identifies a new event with a stack, it assigns
an eventid to it.  Binder uses this eventid in its internal bookkeeping 
to know which picks are associated with which events.  It also writes
the eventid in the TYPE_LINK and TYPE_QUAKE2K messages that it outputs,
so that downstream modules (eqproc, eqprelim) can also do their own 
bookkeeping. 
  Binder keeps track of the next available eventid in a file called 
"quake_id.d" that is written in the EW_PARAMS directory.  The actual 
command written to the file is the "next_id quakeid" command. This file 
is read once on startup (actually, the first eventid used after startup 
is quakeid+1), and is updated after every new event is stacked.  To ensure 
sequential, non-duplicated eventids between runs, binder's configuration 
file must contain the command "@quake_id.d". If this command is omitted, 
event sequence numbers will be issued beginning at 1 each time binder 
is restarted.  
  You should never see the same eventid twice, but you may see gaps. 
On startup, binder always skips one eventid (just in case it died in the
middle of writing the quake_id.d file), and it may kill events for various
reasons.  Also, the eventids may appear downstream out of numerical order. 
Sometimes a smaller event that occured after (thus had a higher eventid) 
a large one will finish first.


Associating New Phases with Active Earthquakes
----------------------------------------------
  Whenever binder retreives a new pick, the first thing it does is try
to associate the arrival time with one of its currently active earthquakes.
Even though the Earthworm pick_ew module supposedly only picks P-wave 
arrivals, binder assumes the pick is an unknown phase during the association 
process. Knowing each event's origin time and location (binder keeps track of 
the most recent 100 hypocenters), the station location, and the velocity 
structure, binder calculates the arrival times of each event's possible 
phases (P, Pg, Pn, S, Sg, Sn) at that site. Then it compares the actual 
arrival time to the calculated times to find the residuals. Any phase with 
a residual within the configured residual-tolerance qualifies that pick to 
be associated with that event as that phase. The phase having the lowest 
residual value (within the residual-tolerance) wins!  Binder associates
or links the pick with the event with the winning phase, then it hands that
event to the L1 locator for further processing.  If no phase has a residual
within the residual-tolerance, binder performs a stack with the new pick as 
the initiating pick (see stacking section above).

Association and Time Constraints:
  As I mentioned above, binder keeps track of the 100 most recent hypocenters.
When it tries to associate a new pick, it doesn't actually consider all of 
those hypocenters as possible sources for the pick.  Instead, it considers
only those hypocenters where the pick time is within a given time range of
the origin time.  The "t_dif tmin tmax" command defines the acceptable time
range (tmin, tmax are in seconds). If the pick time is earlier than 
(origintime + tmin) or later than (origintime + tmax) for a given event, 
binder will not attempt to associate that pick with that event. 

Association and Distance Constraints:
   Binder keeps track of the average epicentral distance (rAvg) of all phases 
associated with each active hypocenter in its list. Before binder attempts 
to associate a new pick with a given hypocenter, it compares the epicentral 
distance of that station with a cutoff distance based on the rAvg value for
that hypocenter. If the new pick's epicentral distance is greater than the 
cutoff distance, binder will not attempt to associate the new pick with 
that hypocenter. The cutoff distance is calculated by rAvg*rAvg_Factor, 
where rAvg_Factor is a constant (default value = 10.0) that can be set with 
the "rAvg_Factor" command.  This distance cutoff is important in cases where
two earthquakes occur with seconds of one another in different parts of the 
network.  It also keeps an event from being contaminated by distant spurious 
noise picks which happen to match the event's travel time curve.

Configuring the Travel-time Residual Tolerance Function:
  Whether or not a given phase is associated with, culled from, or scavenged
for a given event is determined by how closely its arrival time fits the
predicted arrival time for the event (ie, how large its travel-time residual
is). The overall maximum-allowed travel-time residual (residual tolerance
or restpr) for a pick to be associated with a given active hypocenter is 
composed of three terms:  

   restpr = dist-taper + res1_OT + res2_OT

The first term (dist-taper) is based on the distance of the current pick to 
the hypocenter (controlled with the "taper" commands) and two remaining 
terms (res1_OT and res2_OT, controlled with the "taper_OT" command) are based 
on the quality of the event's location.  All of these terms are configurable
to allow you to widen the residual tolerance for more distant picks and/or
for more poorly located earthquakes.

Residual Tolerance Distance Term, dist-taper:
  The distance-dependent term of residual tolerance, dist-taper, is controlled 
with the "taper r resmax" command, where r is the epicentral distance (km) and 
resmax is the value of the dist-taper term (seconds). Multiple taper commands 
(up to 100) can be used, in order of increasing r, to define the dist-taper 
term of the tolerance function. Between the distances specified, the value of 
the dist-taper term changes linearly between corresponding resmax values. For 
distances less than the lowest value specified, the dist-taper term is constant 
at the resmax value for the smallest distance; for distances exceeding the 
maximum specified, the dist-taper term is constant at the resmax value set 
for the maximum distance. If no "taper" commands are supplied, the dist-taper 
term is a constant 1.0 sec for all distances. 
  The dist-taper function defined by the "taper" commands also plays a role
in how a pick is weighted in the relocation process (see the distance-weight
factor in the "Relocation and Pick Weighting" section below).

Residual Tolerance Hypocentral Quality Terms, res1_OT and res2_OT:
  The "taper_OT OTconst1 OTconst2" sets two constants that are used to 
calculate the hypocenter-quality terms of the maximum-allowed travel-time 
residual, restpr. The thinking is, if your location is poor, you'll want to 
accept a phase with a higher travel-time residual.  If no taper_OT command
is listed, the default values are OTconst1 = OTconst2 = 2.0.  To eliminate
the two hypocentral quality terms from the residual tolerance function, 
set both constants to 0.0.

  The res1_OT term is based on event's rms residual and the number of phases 
associated with the event.  Binder uses the first constant, OTconst1, in 
calculating res1_OT for each hypocenter as follows:

   res1_OT = OTconst1 * rms * Npick_factor 

   where  Npick_factor = 3.0                     when npick <= 4
                       = SQRT( npick/(npick-4) ) when npick  > 4

When the number of picks is small, the rms of the location is often quite
low, so the Npick_factor and OTconst1 will dominate the res1_OT term.
As the number of picks associated with the event increases, the
Npick_factor approaches 1, so the event rms and OTconst1 will dominate 
the res1_OT term.

  The res2_OT term is based on the hypocentral distance to the nearest 
station and the second constant, OTconst2.  Given the hypocentral (slant) 
distance to the nearest station calculated by:

   s = SQRT( z**2 + dmin**2 )  where    z = hypocentral depth
                                     dmin = epicentral distance to
                                            nearest station

binder calculates the res2_OT tolerance term as follows:

   res2_OT = 0.0                     when s  < 1.0
           = OTconst2 * log10( s )   when s >= 1.0

The larger the minimum slant distance, the more unreliable the location,
and the larger this second residual taper term becomes. 

  OK, so that describes the form of all three terms in the residual tolerance
function, but it doesn't really give you a clue as to what values to use for 
your network. I think it's most important to set a reasonable distance taper 
for your network based on how well your velocity model fits your arrival-time
data.  Hopefully, you already have some idea about that. The residual tolerance 
can then be widened if necessary using the origin uncertainty terms. The 
values of all three terms are printed in binder's logfile, so you may be 
able to empirically settle on some parameters by studying those.


Binder's Simple Event Locator
-----------------------------
  After binder declares a new event from a stack, or after is associates
a new pick with an active event, it passes the event to its simple L1 locator
for further processing.  The locator takes a starting location and all the
phase data for the event, determines a weight factor for each phase in the 
relocation, and performs a number of iterations to arrive at a new location. 
Again, there are quite a few configurable parameters available to control the 
locator, and I'll describe them below.

Intermittent Relocation:
  While binder's locator is simple, it can still be quite a CPU hog, especially
when a large number of phases are associated with the event.  Also, as the
number of associated phases grows, the hypocenter becomes more stable, and the 
improvement in the location with each additional phase shrinks. Therefore, 
we added the configuration command "locate_eq npick interval" to allow 
intermittent relocation of an event. Instead of relocating after every single 
new phase is associated, an earthquake with more than "npick" picks will 
relocate with every additional "interval" picks associated. Up to 10 "locate_eq" 
commands can be supplied (in order of increasing "npick" values) to set multiple 
npick-break-points and location intervals.  An event with fewer picks than 
the smallest "npick" value will locate after each additional pick is associated.
If no "locate_eq" commands are used, every event will be relocated after
each addition pick is associated. 

Relocation and Pick Weighting - Phase Type, Pick Quality, and Distance:
  The overall weight of a pick in the location process is the product of three 
weighting factors: a phase-weight, a pick-quality-weight and a distance-weight.

     wt = phase-weight * pick-quality-weight * distance-weight

  The phase-weight is based on the phase assigned to the pick by binder, in 
conjuction with the "ph" commands listed in the configuration file. Use 
one "ph phase weight" command to assign a "weight" (any number from 0.0 to
1.0, inclusive) for each "phase" (any of the strings P,S,Pg,Sg,Pn,Pg) that 
you want binder to use in the location.  If at least one "ph" command is 
supplied, other phases that were not listed in a "ph" command receive a 
phase-weight of 0.0 (ie, they won't be weighted in the location process).  
If no "ph" commands are used, all phases receive a phase-weight of 1.0.
  The pick-quality-weight is based on the pick-quality assigned by pick_ew,
in conjuction with the "wt" commands listed in binder's configuration file.
Use one "wt quality weight" command to assign a "weight" (any number from 0.0
to 1.0, inclusive) to any "quality" (0,1,2,3 or 4) of pick that you want
binder to use in the location.  If at least one "wt" command is supplied, 
other pick-qualities that were not listed in a "wt" command receive a pick-
quality-weight of 0.0 (ie, they won't be weighted in the location process).
If no "wt" commands are used, all quality picks receive a pick-quality-weight 
of 1.0.
  The distance-weight is based on the epicentral distance of the station
from the current starting location and the "r_max" and "taper" commands 
listed in the configuration file.  The "r_max" command is used to set the
maximum epicentral distance for weighting picks in the location process.
Any pick at a distance greater than r_max km from the starting epicenter
will not be used in relocating the event.  For picks within r_max km of the
epicenter, the distance-weight is a function of the current distance, r, and
the distance-taper function set with the "taper" commands (see the "Residual 
Tolerance Distance Term, dist-taper" section above):

  distance-weight = ((min dist-taper value)/(dist-taper value at r))**2

  The overall weight of a pick in the relocation process will be a number 
between 0.0 and 1.0, inclusive. 
  The total number of phases binder can use in relocating an event is 
limited using the "maxpix" command.  The default value (256) should be 
plenty for most networks.

Relocation and Iteration Control:
  Binder hands its locator a starting location and a list of phases. The
locator determines the pick weights as above, then calculates the event's
average weighted travel-time residual (rms) to use as a reference for
testing divergence.
  For most earthquakes, the first iteration performed by the locator will 
be a "Free" solution. It will invert for all four hypocentral parameters: 
latitude, longitude, depth, and origin time. However, if an event has no 
nearby picks associated, the locator will perform "Fixed Depth" solution 
instead. This feature is controlled with the "FixDepth dmin trialz" 
command (default dmin=50.0, trialz=8.0). If an earthquake has no picks 
closer than "dmin" km, binder sets its depth to "trialz" km, then inverts 
for only latitude, longitude, and origin time. This prevents events with 
no depth resolution from flailing.  
  The result of the locator's inversion is the "step-vector." This is the 
amount by which the iteration will change all 4 hypocentral components.
Binder places limits on the horizontal (xystep) and vertical (zstep) 
components of the step-vector in any given iteration. Those limits are set 
with the "MaxStep xystep zstep" command (default xystep=10.0 km, zstep=2.0 km).
If either step-length is exceeded in an interation, all 4 dimensions 
(x, y, z and time) of the step-vector are equally damped, preserving the 
direction of the vector, such that each dimension is within its configured
limit. 
  Binder also place limits on the valid range for computed hypocentral depth. 
This range, specified with the "zrange zmin zmax" command (default range 
is 2.0 - 20.0 km), should span the depth range in which you expect to detect 
earthquakes. It must also span the entire depth range of the stacking 
grid ("grdz" command); otherwise, events may stack outside of the zrange 
limit and this could cause locator logic errors. If during an iteration, 
the step-vector causes the depth to fall outside of the zmin-zmax range, 
that step vector is abandoned. The locator then "restarts" the iteration;
beginning with the previous starting hypocenter, the locator performs a 
fixed-depth solution.  This feature is useful in controlling locations 
from network "glitches" where all picks are coincident and the apparent 
high phase velocity would try to locate the quake at the center of the Earth. 
  At the end of each iteration, binder recalculates the event's average
weighted residual (rms) and performs a test for divergence.  If the new 
rms is greater than the previous rms*"constant" (set with "MaxDeltaRms"
command, default = 1.01), then the solution is diverging. If the location
is converging, the locator goes on to the next iteration. If a location
is diverging, binder removes half of the step-vector in all 4 dimensions 
(x, y, z and time), backing the location up toward the starting location
from that iteration, and then checks the rms again. If the solution now
converges, the locator goes to the next iteration. If the solution is 
still diverging, it backs the location up farther. Up to 3 backwards steps
will be made (resulting in a step-vector of 1/8 the original size) if 
the solution continues to diverge.
  Binder's locator will perform a maximum number of iterations every time
that it's called. The maximum iteration count can be set with the "MaxIter"
command (default = 4). However, the locator may perform fewer than MaxIter 
iterations if the change in the location (in X-Y-Z km) in any given 
iteration is less than a minimum step-length (set with the "MinXYZstep" 
command; default = 0.1 km).
  

Post-Relocation Event Review:
Pick Culling, Pick Scavenging, and Killing Events:
--------------------------------------------------
  Binder has just relocated an earthquake. Presumably, the location
is better contrained that it was before, and the arrival time fits
may have changed as well.  Binder now recalculates the travel-time 
residuals for all phases associated with the event, regardless of
whether the phase was weighted in the relocation or not.  It also
calculates the event's average weighted residual (rms), and performs
a series of tests.

Pick Culling:
  For each associated pick, binder compares the new travel-time 
residual to the new residual tolerance value for that station (see
the "Configuring the Travel-time Residual Tolerance Function" section 
above).  If the new residual is larger than the tolerance, binder
deassociates or culls the phase from the earthquake.  That pick is
now free to associate with another event, or to stack with other picks 
to identify a new earthquake.  If any picks are culled from an event, 
that event is sent back to the locator once again.

Pick Scavenging:
  After an event is relocated, binder also looks through its list 
of recent picks to see if there are any others which may fit the
travel-time curves of the new hypocenter.  Essentially, it does
the association process (comparing the travel-time residuals of 
various phases to the residual tolerance at that station) on picks 
that are in already its memory; binder is scavenging.  Binder is 
allowed to scavenge two classes of picks in its list: those that are 
not associated with any active event (also known as "waifs") and those 
that are associated with events that have fewer phases than the 
scavenging event.  So big events are allowed to steal phases from 
smaller events, as long as the travel-time residuals fit.  Since a pick
can only belong to one event at a time, this scavenging feature makes 
it possible for two events to merge into one.  This is useful in 
avoiding "split events", where a large event is originally identified 
as two events due to the order in which its arrivals were processed.
  If an event successfully scavenges one or more new phases, the 
event is sent back to the locator once again.  Also, if any of the
scavenged phases had been associated with a second event, that 
second event is also sent back to the locator.

Killing Events:
  In reviewing events, binder sometimes decides that an event is
no longer valid, either because its location parameters are poor,
or because it doesn't have enough supporting phases.  If after a 
relocation, an event's rms is larger than a cutoff value (configured
with the "rmscut" command, default = 1.0 sec), binder marks that
event as destroyed or killed.  Or, if after a round of scavenging,
an event has fewer than 4 weighted phases (too few to relocate the
event), binder kills that event as well.
  When binder kills an event, it deassociates or unlinks all of 
its phases. Those picks are then free to associate with other active 
events as waifs, or to stack with other picks to identify new events.  
To notify downstream modules that the event was killed, binder issues 
a TYPE_QUAKE2K message for that eventid, setting the "number of 
associated phases" field to zero.


Quality Control: Pick-group Assessment 
---------------------------------------
  Occassionally one of binder's newly stacked events will include an
outlying pick which falls at a "leverage point" for the hypocenter
produced by binder's locator.  This bad pick has undue influence 
over the hypocenter, causing a poor location which in turn leads 
to the mis-association of other picks.  Binder uses a test called
"pick-group assessment" to identify and eliminate these outlying
picks, and to produce a trial hypocenter for the locator.  Given a 
group of supposedly related picks, pick-group assessment tests whether 
all of the arrivals are indeed consistent with a single earthquake.
  Pick-group assessment is most efficient when it is performed on an
event with relatively few associated phases. The configuration 
command "assess_pk minpk maxpk" controls when the test is run; when
an event has between minpk and maxpk (inclusive) picks associated with 
it, its pick set will be assessed before the event is handed to the
locator (defaults are minpk=6, maxpk=12). If an event stacks with more 
than maxpk picks, its pick set will be assessed at least once.  To 
turn off pick-assessing totally, set minpk and maxpk to zero. 
   Only P-wave arrivals are used in pick-group assessment. First, the 
set of picks is sorted in increasing time order and is examined for 
glitches.  Glitch picks are eliminated from further testing (see the 
"define_glitch" command described in the "Pre-Stack Glitch Filtering" 
section above).  The remainder of the test involves a number of 
resampling trials.  In each trial, four picks are chosen from the 
pick set and the exact four-station hypocenter is calculated using a 
uniform velocity halfspace (configured with the "v_halfspace" command, 
default = 5.0 km/s). Next, binder calculates the travel-time residual 
with respect to that hypocenter for every pick in the pick set (again 
using the uniform velocity halfspace).  You can specify the maximum 
pick quality (assigned by pick_ew) to use in this resampling process 
with the "maxwt" command (valid values are 0-4, default = 3).  Picks 
with qualities higher than maxwt (poorer quality) may be rejected as 
glitches, but will not be used in resampling trials.
   The number of trials used in the pick-group assessment is controlled 
with the "maxtrial" command (default = 500).  Given n picks sampled 4 
at a time, the total number of unique combinations is n!/(4!*(n-4)!). 
If this number is less than maxtrial, all possible combinations will be 
used, with no duplication. If the total is greater than maxtrial, then 
maxtrial random samples of 4 picks will be taken, and duplication is 
possible.
  Statistics on the residuals from all of the trials are used to 
determine if each pick should be accepted or rejected. If a pick is 
an outlier, it will have very large residuals for any of the hypocenters 
determined from 4 other picks. Also, the scatter in its residuals 
will be large.  For each pick, binder calculates the median residual (med), 
its absolute value (absmed), and the median of the absolute deviations 
from the median residual (mad). If either the absmed or mad is greater 
than a configured cutoff threshold, the pick is rejected as an outlier.  
The cutoff values are configured with the "residual_cut maxabsmed maxmad" 
command (default maxabsmed = 3.0 sec, maxmad = 10.0 sec).
  Pick-group assessment is a multiple pass process. The first pass samples 
from the earliest (up to 25) P-phase picks which meet the testing criteria; 
this pass is used to winnow the data to arrivals with median residuals 
below a cutoff value. A second pass resamples the winnowed data to 
determine the new starting hypocenter from the average of all the 4-pick 
hypocenters.  This new starting hypocenter and the winnowed list of picks
are handed to binder's locator for further processing.  
  If any picks were rejected as glitches or outliers by the pick-group 
assessment, they are deassociated from the event and are free to restack 
or reassociate.  Occassionally, binder rejects enough picks that the
event no longer has enough valid phases to relocate it.  In this case,
the event is killed, and all of its picks are deassociated.


Examples from Binder's Log File
-------------------------------
1. Stack: five picks stack to create a new event.

2. Glitch Detection in Stacking Phase: a series of picks stack, until 
   the last pick identifies the group as a glitch.

3. Association: two examples: a pick associates with an active event and a 
   pick fails to associate or stack.



Trouble-Shooting
----------------
Binder's not producing any earthquakes - what could be wrong?

1. Is it seeing any picks?  
   Check the log file, you should see some lines like these:
 10  6  3 2189 MWB  NCVHZ D2  19990930005310.08     431     672     584
19990930_UTC_00:53:15 grid_stack, mhit = 0
   The first line is a TYPE_PICK2K message, the second is the log of an
   stacking attempt.

2. Is the velocity model specified with "lay depth velocity"?

3. Is the station file in the correct format?  
   All lats & lons should be positive in degrees & minutes, 
   with N,S,E,W denoted by letters. If there are no letters, 
   N and W are assumed.

4. Are the stacking grid boundaries set properly?
   Lats & lons are in decimal degree with positive = N, E.



Module Index | Binder_ew Commands

Questions? Issues? Subscribe to the Earthworm List (earthw).