Notes on Configuring BINDER_EW: Earthworm's Phase Associator
by Lynn Dietz, July 2002
Given a list of P-wave arrival times, a set of station locations and a
simple velocity model, binder's job is to identify the smallest set of
hypocenters that could have produced the current list of arrival times.
With each new pick, binder tries to:
1) associate that arrival to an active event, or
2) identify a new event by stacking the back-projections of P-wave
travel times on a space-time grid to find a likely source (hypocenter)
for recent unassociated picks.
Sounds simple, right? Well, there are a lot of configurable parameters
that control exactly how binder will go about this task. I'll try
to explain the basic steps I go thru when I'm setting up binder for the
first time in a new region of interest. And I'll give a few details
about how binder goes about its tasks (refer to this flowchart as we go).
Getting Started: Earthworm Setup & Network Definition
-----------------------------------------------------
The three most basic things binder needs to know are:
1. Where should it get P-wave arrivals (picks) to work on?
2. What stations are being picked and where are they?
3. What's the velocity structure like?
1. P-wave arrivals:
Two commands in binder's configuration file tell it where it should get
the P-wave arrivals to work on. The "RingName" command specifies which
Earthworm transport ring (a shared memory region) to read from. Make sure
that this is the same ring into which pick_ew is writing its output. Also,
if you're importing picks from another source and want binder to use them,
make sure you are placing them in this ring as well.
The "GetPicksFrom" command specifies the installation id and module id of
the TYPE_PICK2K messages that binder will operate on. Specific id's or
wildcards can be used for either or both of these arguments. Note: both
binder_ew and eqproc should be set to listen to the same pick source(s)!
2. Station List:
Binder expects the network's station list to be in a file in Hypoinverse
station format; it gets the name of the file from the "site_file" command.
Of all the fields in this file, the only ones used by binder are the
station codes (5-letter site code, 2-letter network code, 3-letter component
code) and the station latitude and longitude. A quirk of this file is that
all lats and lons are listed as positive numbers; a letter is used to denote
N,S,E,W, with blanks interpreted as N and W. As binder reads the file, it
converts lat and lon to decimal degrees with N and E being positive. If a
site has multiple components, each component should be listed on its own line
in the station file. As of Earthworm v4.0, binder treats each component
independently, so if multiple components at a site are picked, that site
will be weighted more heavily in stacks and locations than a single-component
site.
Large networks (over 1000 channels) will have to use the "maxsite"
command to increase the number of stations that binder will input and use.
Since binder works only on arrival times, it doesn't need to know anything
about instrument response or gain.
3. Velocity Model:
Binder uses one simple velocity model to calculate travel times for the
entire region of interest. If the velocity structure varies greatly in
your region, make sure you give binder a decent average model. This
velocity model is specified with a series of "lay" commands in the
configuration file, listed in order of increasing depth. The velocities
in each layer are constant, and should increase with depth (no low velocity
zones, please). Pay attention to the order of the arguments of the
lay command. It should be "lay depth(km) velocity(km/s)". If you get
depth and velocity swapped, binder won't do much of anything.
Binder also uses one P/S velocity ratio; the default is 1.72, but it can
be changed by using the "psratio" command in the configuration file.
Stacking - The Guts of Binder
-----------------------------
How does binder identify new earthquakes, or what does "stacking" mean?
Binder needs to find the likely origin, in space and time, of a group of
P-wave arrivals. Given a single P-wave arrival (call it the "initiating pick"),
binder considers the entire volume of the earth is a potential hypocenter.
The addition of a second P-wave arrival gives binder information on relative
timing; so it can narrow the field of possible hypocenters to a 3-dimensional
surface. To determine this surface, binder essentially takes the P-wave
travel-time curve, puts the curve's origin at the station's location, flips
the curve upside down (about a horizontal axis), shifts it in depth
considering the relative timing between the two picks, and spins the curve
around a vertical axis. This creates a conic-like surface (somewhat like a
hollyhock flower opening down) that passes through all possible hypocenters
which could have produced that second arrival, given the timing of the first.
Binder calculates the "hollyhock" of potential hypocenters for each subsequent
pick in the same manner, always using to the initiating pick's time as the
reference time. When the "hollyhocks" from two arrivals overlap, you get a
line of potential hypocenters that is consistent with both picks and the
initiating pick. The point where three or more individual "hollyhocks"
intersect is a unique hypocenter (stacking location) that can explain all
of those arrivals, including the initiating pick. At this stage, with four
P-wave arrival times, binder has successfully "stacked" a new "active event."
It associates (or binds) these four picks into an event and passes those
picks and the stacking location to its simple L1 locator for further
processing.
Instead of considering an infinite number of possible hypocenters, binder
discretizes the problem by using a stacking grid. Within its boundaries,
the grid contains uniformly-sized cubic cells. Only the center point of
each grid cell is considered a possible hypocenter in the stacking process
described above. Below I'll describe the parameters that are used to define
the stacking grid and control the stacking process.
While setting up the stacking grid, I find it very useful to look at a
map of the seismic network and historic seismicity in the region of interest.
The things I look for are:
+ seismic network: overall distribution & station spacing
+ seismicity: latitude, longitude & depth ranges
Stacking Grid Boundaries:
The boundaries of the stacking grid are set with three commands:
"grdlat" (latitude range in decimal degrees, positive = north),
"grdlon" (longitude range in decimal degrees, positive = east),
"grdz" (depth range in kilometers, positive = down).
Reviewing the map of historic seismicity may help in setting the stacking
grid boundaries. Binder can only identify new hypocenters that lie within
these boundaries, so they should include the entire volume of interest (ie,
wherever you expect or hope to see earthquakes). Also, all of your seismic
stations should lie within the grid boundaries.
Stacking Grid Cell Size:
Now you must decide how big each cell within the grid will be. Each cell
is a cube, and its side-length (in kilometers) is set with the "dspace"
command. That map may be useful again; hopefully, you can set dspace so that
only one station falls within each grid cell. The more stations that fall
within one cell, the more trouble binder will have in "focusing" the stack
(when 2 stations are in the same grid cell, their "hollyhocks" will overlap
100%, which doesn't do much good for resolving potential hypocenters). Then
again, you don't want to make the cell size too small, or binder will have
a lot more computations to do. So there's a tradeoff here. FYI, the
Northern California Seismic Network uses a grid cell size of 4 km.
Stacking Distance Constraints:
When stacking each additional pick, binder considers possible hypocenters
for that arrival within a limited distance of the station, not the entire
stacking grid. The stacking distance is controlled by the "rstack" command.
Binder computes each pick's "hollyhock" within its stacking volume. The
stacking volume includes the entire depth range of the stacking grid, but in
map view the stacking volume is a square centered on the station location,
extending rstack kilometers in each horizontal direction (the square's sides
are 2*rstack km long). To avoid roundoff problems, rstack should be a multiple
of the grid-cell dimension, dspace.
Once again, the network map will be handy for choosing an appropriate rstack.
Consider your network's station spacing and distribution - phases from stations
that are more than 2*rstack km apart will not stack together other because
their stacking volumes will not overlap at all. Also consider your expected
seismicity patterns - binder can only identify new hypocenters where the
stacking volumes of 4 or more stations overlap. Make sure that your expected
seismicity falls within such a region, increasing rstack if necessary. After
an event is stacked, binder's L1 locator may move the hypocenter beyond the
stacking volume, and possibly even outside the stacking grid boundaries.
But, you have to stack it first!
Stacking and Time Resolution:
Binder includes a time-resolution factor ("tstack" command) in calculating
the set of all possible hypocenters ("the hollyhock") for each pick. This
time factor controls the thickness of the hollyhock petals. Increasing tstack
could increase the number of grid cells each pick's hollyhock. As a first
approximation, tstack should be at least the number of seconds it takes for
a P-wave to cross a grid cell of dspace km.
Pre-Stack Glitch Filtering:
A glitch is a group of coincident arrivals often caused by noisy telemetry.
For the purposes of this filter, a glitch is defined as a certain number of
picks (m) occurring in a given number of seconds (x). The values of both m
and x are configurable with the "define_glitch" command (the default definition
is 4 picks within 0.035 seconds). This glitch filter was added to the stacking
portion of binder to prevent it from creating new events from a group of picks
having nearly identical arrival times. Picks that are flagged as belonging to
a "glitch" are not used in summing the stack, but may later be associated with
an event. If you want to turn off pre-stack glitch filtering (ie, to make
binder stack every pick), set m to zero (example "define_glitch 0 0.0").
Summing the Stack - Stacking Weights & Event Thresholds:
OK, here we go. Binder gets a new pick and it doesn't associate with any
of its currently active hypocenters. Binder will now attempt a stack with
that new pick as the "initiating pick". First it gathers up all recent
unassociated picks (binder keeps track of the last 1000 picks it has
processed, but it only tries to stack with a limited number, set with the
"stack" command). Then it reviews that list of unassociated picks to see
if there are any glitches (see section above). Any picks that belong to a
glitch will be eliminated from the stacking process. If the initiating pick
itself belongs to a glitch, binder will not attempt a stack at all.
Now let's talk about stacking weights. The stacking weight of each
pick is based on the quality assigned to that pick by the pick_ew process.
By default, binder stacks only quality=0 picks with a weight=1. Using "grid_wt"
commands, you can include other quality picks in the stacking process. You could
also give high-quality picks greater weight in the stack than low-quality
picks. This differential weighting is especially useful for large networks like
NCSN; we want to avoid generating events based on a few low-quality picks, but
will accept an event that's based on a larger number of low-quality picks (or
a combination of high-and low-quality picks). For small networks (10s of stations),
I would suggest using equal stacking weight for all quality picks. Just be
sure to use a "grid_wt" command for each pick quality (including 0) that you
want binder to include in the stack.
The "grid_wt_instid" is a new twist (June 2002) to the stack weighting scheme.
It allows you to set the stacking weight of a pick based on its pick quality
and its installation_id. For example, this allows a user to weight a quality
2 pick from one installation differently than a quality 2 pick from a different
installation. This is only useful for networks that exchange pick data with
other institutions.
To sum the stack, each grid cell is assigned a "hit count" to keep track of
how many picks that cell is a potential hypocenter for (how many "hollyhocks"
that cell is a member of). During the stack, binder only keeps track of the
grid cells within the initiating pick's stacking volume. At the beginning of a
stack, the hit count of every cell in the initiating pick's stacking volume is
set to initiating pick's stack weight. Then binder calculates the hollyhock
for each additional pick; for each grid cell that lies within the new pick's
hollyhock and the initiating pick's stacking volume, the hit count in
incremented by the stack weight of the new pick.
After binder sums the stack using all picks in its short list, it reviews
the hit counts of all the grid cells. Binder prepares a list of cells that
share the maximum hit value and the same minimum distance to the initiating
phase. Binder declares a new hypocenter from the stack only if the maximum
hit count is greater than or equal to an event threshold value (set with the
"thresh" command) and the number of cells in the list is less than an
event focus value (set with the "focus" command). The trial hypocenter for
the declared event is calculated by averaging the center positions of all
the cells in the maximum-hit-count list. Each pick that contributed to that
maximum hit count is associated with the new event.
It's very important that at least 4 phases contribute to declaration of a
newly stacked earthquake. If all picks are being weighted equally (no
"grid_wt" commands were used), the event threshold ("thresh" command)should
be set to 4 or higher. If differential stacking weights are being used, the
event threshold should be greater than 3 times the maximum stacking weight
specified by "grid_wt" commands.
Finding the optimum value for the event focus value ("focus" command)
may require considerable experimentation for any specific network. The
optimal value will depend somewhat on the stacking grid granularity ("dspace"
command). Since the trial hypocenter is calculated by averaging the center
positions of all the cells in the maximum-count list, a larger focus threshold
will increase the probability of a poor starting location (which has its own
attendant problems). A smaller focus value, however, will tend to reject
distant events.
Stacking and Eventids:
Every time binder identifies a new event with a stack, it assigns
an eventid to it. Binder uses this eventid in its internal bookkeeping
to know which picks are associated with which events. It also writes
the eventid in the TYPE_LINK and TYPE_QUAKE2K messages that it outputs,
so that downstream modules (eqproc, eqprelim) can also do their own
bookkeeping.
Binder keeps track of the next available eventid in a file called
"quake_id.d" that is written in the EW_PARAMS directory. The actual
command written to the file is the "next_id quakeid" command. This file
is read once on startup (actually, the first eventid used after startup
is quakeid+1), and is updated after every new event is stacked. To ensure
sequential, non-duplicated eventids between runs, binder's configuration
file must contain the command "@quake_id.d". If this command is omitted,
event sequence numbers will be issued beginning at 1 each time binder
is restarted.
You should never see the same eventid twice, but you may see gaps.
On startup, binder always skips one eventid (just in case it died in the
middle of writing the quake_id.d file), and it may kill events for various
reasons. Also, the eventids may appear downstream out of numerical order.
Sometimes a smaller event that occured after (thus had a higher eventid)
a large one will finish first.
Associating New Phases with Active Earthquakes
----------------------------------------------
Whenever binder retreives a new pick, the first thing it does is try
to associate the arrival time with one of its currently active earthquakes.
Even though the Earthworm pick_ew module supposedly only picks P-wave
arrivals, binder assumes the pick is an unknown phase during the association
process. Knowing each event's origin time and location (binder keeps track of
the most recent 100 hypocenters), the station location, and the velocity
structure, binder calculates the arrival times of each event's possible
phases (P, Pg, Pn, S, Sg, Sn) at that site. Then it compares the actual
arrival time to the calculated times to find the residuals. Any phase with
a residual within the configured residual-tolerance qualifies that pick to
be associated with that event as that phase. The phase having the lowest
residual value (within the residual-tolerance) wins! Binder associates
or links the pick with the event with the winning phase, then it hands that
event to the L1 locator for further processing. If no phase has a residual
within the residual-tolerance, binder performs a stack with the new pick as
the initiating pick (see stacking section above).
Association and Time Constraints:
As I mentioned above, binder keeps track of the 100 most recent hypocenters.
When it tries to associate a new pick, it doesn't actually consider all of
those hypocenters as possible sources for the pick. Instead, it considers
only those hypocenters where the pick time is within a given time range of
the origin time. The "t_dif tmin tmax" command defines the acceptable time
range (tmin, tmax are in seconds). If the pick time is earlier than
(origintime + tmin) or later than (origintime + tmax) for a given event,
binder will not attempt to associate that pick with that event.
Association and Distance Constraints:
Binder keeps track of the average epicentral distance (rAvg) of all phases
associated with each active hypocenter in its list. Before binder attempts
to associate a new pick with a given hypocenter, it compares the epicentral
distance of that station with a cutoff distance based on the rAvg value for
that hypocenter. If the new pick's epicentral distance is greater than the
cutoff distance, binder will not attempt to associate the new pick with
that hypocenter. The cutoff distance is calculated by rAvg*rAvg_Factor,
where rAvg_Factor is a constant (default value = 10.0) that can be set with
the "rAvg_Factor" command. This distance cutoff is important in cases where
two earthquakes occur with seconds of one another in different parts of the
network. It also keeps an event from being contaminated by distant spurious
noise picks which happen to match the event's travel time curve.
Configuring the Travel-time Residual Tolerance Function:
Whether or not a given phase is associated with, culled from, or scavenged
for a given event is determined by how closely its arrival time fits the
predicted arrival time for the event (ie, how large its travel-time residual
is). The overall maximum-allowed travel-time residual (residual tolerance
or restpr) for a pick to be associated with a given active hypocenter is
composed of three terms:
restpr = dist-taper + res1_OT + res2_OT
The first term (dist-taper) is based on the distance of the current pick to
the hypocenter (controlled with the "taper" commands) and two remaining
terms (res1_OT and res2_OT, controlled with the "taper_OT" command) are based
on the quality of the event's location. All of these terms are configurable
to allow you to widen the residual tolerance for more distant picks and/or
for more poorly located earthquakes.
Residual Tolerance Distance Term, dist-taper:
The distance-dependent term of residual tolerance, dist-taper, is controlled
with the "taper r resmax" command, where r is the epicentral distance (km) and
resmax is the value of the dist-taper term (seconds). Multiple taper commands
(up to 100) can be used, in order of increasing r, to define the dist-taper
term of the tolerance function. Between the distances specified, the value of
the dist-taper term changes linearly between corresponding resmax values. For
distances less than the lowest value specified, the dist-taper term is constant
at the resmax value for the smallest distance; for distances exceeding the
maximum specified, the dist-taper term is constant at the resmax value set
for the maximum distance. If no "taper" commands are supplied, the dist-taper
term is a constant 1.0 sec for all distances.
The dist-taper function defined by the "taper" commands also plays a role
in how a pick is weighted in the relocation process (see the distance-weight
factor in the "Relocation and Pick Weighting" section below).
Residual Tolerance Hypocentral Quality Terms, res1_OT and res2_OT:
The "taper_OT OTconst1 OTconst2" sets two constants that are used to
calculate the hypocenter-quality terms of the maximum-allowed travel-time
residual, restpr. The thinking is, if your location is poor, you'll want to
accept a phase with a higher travel-time residual. If no taper_OT command
is listed, the default values are OTconst1 = OTconst2 = 2.0. To eliminate
the two hypocentral quality terms from the residual tolerance function,
set both constants to 0.0.
The res1_OT term is based on event's rms residual and the number of phases
associated with the event. Binder uses the first constant, OTconst1, in
calculating res1_OT for each hypocenter as follows:
res1_OT = OTconst1 * rms * Npick_factor
where Npick_factor = 3.0 when npick <= 4
= SQRT( npick/(npick-4) ) when npick > 4
When the number of picks is small, the rms of the location is often quite
low, so the Npick_factor and OTconst1 will dominate the res1_OT term.
As the number of picks associated with the event increases, the
Npick_factor approaches 1, so the event rms and OTconst1 will dominate
the res1_OT term.
The res2_OT term is based on the hypocentral distance to the nearest
station and the second constant, OTconst2. Given the hypocentral (slant)
distance to the nearest station calculated by:
s = SQRT( z**2 + dmin**2 ) where z = hypocentral depth
dmin = epicentral distance to
nearest station
binder calculates the res2_OT tolerance term as follows:
res2_OT = 0.0 when s < 1.0
= OTconst2 * log10( s ) when s >= 1.0
The larger the minimum slant distance, the more unreliable the location,
and the larger this second residual taper term becomes.
OK, so that describes the form of all three terms in the residual tolerance
function, but it doesn't really give you a clue as to what values to use for
your network. I think it's most important to set a reasonable distance taper
for your network based on how well your velocity model fits your arrival-time
data. Hopefully, you already have some idea about that. The residual tolerance
can then be widened if necessary using the origin uncertainty terms. The
values of all three terms are printed in binder's logfile, so you may be
able to empirically settle on some parameters by studying those.
Binder's Simple Event Locator
-----------------------------
After binder declares a new event from a stack, or after is associates
a new pick with an active event, it passes the event to its simple L1 locator
for further processing. The locator takes a starting location and all the
phase data for the event, determines a weight factor for each phase in the
relocation, and performs a number of iterations to arrive at a new location.
Again, there are quite a few configurable parameters available to control the
locator, and I'll describe them below.
Intermittent Relocation:
While binder's locator is simple, it can still be quite a CPU hog, especially
when a large number of phases are associated with the event. Also, as the
number of associated phases grows, the hypocenter becomes more stable, and the
improvement in the location with each additional phase shrinks. Therefore,
we added the configuration command "locate_eq npick interval" to allow
intermittent relocation of an event. Instead of relocating after every single
new phase is associated, an earthquake with more than "npick" picks will
relocate with every additional "interval" picks associated. Up to 10 "locate_eq"
commands can be supplied (in order of increasing "npick" values) to set multiple
npick-break-points and location intervals. An event with fewer picks than
the smallest "npick" value will locate after each additional pick is associated.
If no "locate_eq" commands are used, every event will be relocated after
each addition pick is associated.
Relocation and Pick Weighting - Phase Type, Pick Quality, and Distance:
The overall weight of a pick in the location process is the product of three
weighting factors: a phase-weight, a pick-quality-weight and a distance-weight.
wt = phase-weight * pick-quality-weight * distance-weight
The phase-weight is based on the phase assigned to the pick by binder, in
conjuction with the "ph" commands listed in the configuration file. Use
one "ph phase weight" command to assign a "weight" (any number from 0.0 to
1.0, inclusive) for each "phase" (any of the strings P,S,Pg,Sg,Pn,Pg) that
you want binder to use in the location. If at least one "ph" command is
supplied, other phases that were not listed in a "ph" command receive a
phase-weight of 0.0 (ie, they won't be weighted in the location process).
If no "ph" commands are used, all phases receive a phase-weight of 1.0.
The pick-quality-weight is based on the pick-quality assigned by pick_ew,
in conjuction with the "wt" commands listed in binder's configuration file.
Use one "wt quality weight" command to assign a "weight" (any number from 0.0
to 1.0, inclusive) to any "quality" (0,1,2,3 or 4) of pick that you want
binder to use in the location. If at least one "wt" command is supplied,
other pick-qualities that were not listed in a "wt" command receive a pick-
quality-weight of 0.0 (ie, they won't be weighted in the location process).
If no "wt" commands are used, all quality picks receive a pick-quality-weight
of 1.0.
The distance-weight is based on the epicentral distance of the station
from the current starting location and the "r_max" and "taper" commands
listed in the configuration file. The "r_max" command is used to set the
maximum epicentral distance for weighting picks in the location process.
Any pick at a distance greater than r_max km from the starting epicenter
will not be used in relocating the event. For picks within r_max km of the
epicenter, the distance-weight is a function of the current distance, r, and
the distance-taper function set with the "taper" commands (see the "Residual
Tolerance Distance Term, dist-taper" section above):
distance-weight = ((min dist-taper value)/(dist-taper value at r))**2
The overall weight of a pick in the relocation process will be a number
between 0.0 and 1.0, inclusive.
The total number of phases binder can use in relocating an event is
limited using the "maxpix" command. The default value (256) should be
plenty for most networks.
Relocation and Iteration Control:
Binder hands its locator a starting location and a list of phases. The
locator determines the pick weights as above, then calculates the event's
average weighted travel-time residual (rms) to use as a reference for
testing divergence.
For most earthquakes, the first iteration performed by the locator will
be a "Free" solution. It will invert for all four hypocentral parameters:
latitude, longitude, depth, and origin time. However, if an event has no
nearby picks associated, the locator will perform "Fixed Depth" solution
instead. This feature is controlled with the "FixDepth dmin trialz"
command (default dmin=50.0, trialz=8.0). If an earthquake has no picks
closer than "dmin" km, binder sets its depth to "trialz" km, then inverts
for only latitude, longitude, and origin time. This prevents events with
no depth resolution from flailing.
The result of the locator's inversion is the "step-vector." This is the
amount by which the iteration will change all 4 hypocentral components.
Binder places limits on the horizontal (xystep) and vertical (zstep)
components of the step-vector in any given iteration. Those limits are set
with the "MaxStep xystep zstep" command (default xystep=10.0 km, zstep=2.0 km).
If either step-length is exceeded in an interation, all 4 dimensions
(x, y, z and time) of the step-vector are equally damped, preserving the
direction of the vector, such that each dimension is within its configured
limit.
Binder also place limits on the valid range for computed hypocentral depth.
This range, specified with the "zrange zmin zmax" command (default range
is 2.0 - 20.0 km), should span the depth range in which you expect to detect
earthquakes. It must also span the entire depth range of the stacking
grid ("grdz" command); otherwise, events may stack outside of the zrange
limit and this could cause locator logic errors. If during an iteration,
the step-vector causes the depth to fall outside of the zmin-zmax range,
that step vector is abandoned. The locator then "restarts" the iteration;
beginning with the previous starting hypocenter, the locator performs a
fixed-depth solution. This feature is useful in controlling locations
from network "glitches" where all picks are coincident and the apparent
high phase velocity would try to locate the quake at the center of the Earth.
At the end of each iteration, binder recalculates the event's average
weighted residual (rms) and performs a test for divergence. If the new
rms is greater than the previous rms*"constant" (set with "MaxDeltaRms"
command, default = 1.01), then the solution is diverging. If the location
is converging, the locator goes on to the next iteration. If a location
is diverging, binder removes half of the step-vector in all 4 dimensions
(x, y, z and time), backing the location up toward the starting location
from that iteration, and then checks the rms again. If the solution now
converges, the locator goes to the next iteration. If the solution is
still diverging, it backs the location up farther. Up to 3 backwards steps
will be made (resulting in a step-vector of 1/8 the original size) if
the solution continues to diverge.
Binder's locator will perform a maximum number of iterations every time
that it's called. The maximum iteration count can be set with the "MaxIter"
command (default = 4). However, the locator may perform fewer than MaxIter
iterations if the change in the location (in X-Y-Z km) in any given
iteration is less than a minimum step-length (set with the "MinXYZstep"
command; default = 0.1 km).
Post-Relocation Event Review:
Pick Culling, Pick Scavenging, and Killing Events:
--------------------------------------------------
Binder has just relocated an earthquake. Presumably, the location
is better contrained that it was before, and the arrival time fits
may have changed as well. Binder now recalculates the travel-time
residuals for all phases associated with the event, regardless of
whether the phase was weighted in the relocation or not. It also
calculates the event's average weighted residual (rms), and performs
a series of tests.
Pick Culling:
For each associated pick, binder compares the new travel-time
residual to the new residual tolerance value for that station (see
the "Configuring the Travel-time Residual Tolerance Function" section
above). If the new residual is larger than the tolerance, binder
deassociates or culls the phase from the earthquake. That pick is
now free to associate with another event, or to stack with other picks
to identify a new earthquake. If any picks are culled from an event,
that event is sent back to the locator once again.
Pick Scavenging:
After an event is relocated, binder also looks through its list
of recent picks to see if there are any others which may fit the
travel-time curves of the new hypocenter. Essentially, it does
the association process (comparing the travel-time residuals of
various phases to the residual tolerance at that station) on picks
that are in already its memory; binder is scavenging. Binder is
allowed to scavenge two classes of picks in its list: those that are
not associated with any active event (also known as "waifs") and those
that are associated with events that have fewer phases than the
scavenging event. So big events are allowed to steal phases from
smaller events, as long as the travel-time residuals fit. Since a pick
can only belong to one event at a time, this scavenging feature makes
it possible for two events to merge into one. This is useful in
avoiding "split events", where a large event is originally identified
as two events due to the order in which its arrivals were processed.
If an event successfully scavenges one or more new phases, the
event is sent back to the locator once again. Also, if any of the
scavenged phases had been associated with a second event, that
second event is also sent back to the locator.
Killing Events:
In reviewing events, binder sometimes decides that an event is
no longer valid, either because its location parameters are poor,
or because it doesn't have enough supporting phases. If after a
relocation, an event's rms is larger than a cutoff value (configured
with the "rmscut" command, default = 1.0 sec), binder marks that
event as destroyed or killed. Or, if after a round of scavenging,
an event has fewer than 4 weighted phases (too few to relocate the
event), binder kills that event as well.
When binder kills an event, it deassociates or unlinks all of
its phases. Those picks are then free to associate with other active
events as waifs, or to stack with other picks to identify new events.
To notify downstream modules that the event was killed, binder issues
a TYPE_QUAKE2K message for that eventid, setting the "number of
associated phases" field to zero.
Quality Control: Pick-group Assessment
---------------------------------------
Occassionally one of binder's newly stacked events will include an
outlying pick which falls at a "leverage point" for the hypocenter
produced by binder's locator. This bad pick has undue influence
over the hypocenter, causing a poor location which in turn leads
to the mis-association of other picks. Binder uses a test called
"pick-group assessment" to identify and eliminate these outlying
picks, and to produce a trial hypocenter for the locator. Given a
group of supposedly related picks, pick-group assessment tests whether
all of the arrivals are indeed consistent with a single earthquake.
Pick-group assessment is most efficient when it is performed on an
event with relatively few associated phases. The configuration
command "assess_pk minpk maxpk" controls when the test is run; when
an event has between minpk and maxpk (inclusive) picks associated with
it, its pick set will be assessed before the event is handed to the
locator (defaults are minpk=6, maxpk=12). If an event stacks with more
than maxpk picks, its pick set will be assessed at least once. To
turn off pick-assessing totally, set minpk and maxpk to zero.
Only P-wave arrivals are used in pick-group assessment. First, the
set of picks is sorted in increasing time order and is examined for
glitches. Glitch picks are eliminated from further testing (see the
"define_glitch" command described in the "Pre-Stack Glitch Filtering"
section above). The remainder of the test involves a number of
resampling trials. In each trial, four picks are chosen from the
pick set and the exact four-station hypocenter is calculated using a
uniform velocity halfspace (configured with the "v_halfspace" command,
default = 5.0 km/s). Next, binder calculates the travel-time residual
with respect to that hypocenter for every pick in the pick set (again
using the uniform velocity halfspace). You can specify the maximum
pick quality (assigned by pick_ew) to use in this resampling process
with the "maxwt" command (valid values are 0-4, default = 3). Picks
with qualities higher than maxwt (poorer quality) may be rejected as
glitches, but will not be used in resampling trials.
The number of trials used in the pick-group assessment is controlled
with the "maxtrial" command (default = 500). Given n picks sampled 4
at a time, the total number of unique combinations is n!/(4!*(n-4)!).
If this number is less than maxtrial, all possible combinations will be
used, with no duplication. If the total is greater than maxtrial, then
maxtrial random samples of 4 picks will be taken, and duplication is
possible.
Statistics on the residuals from all of the trials are used to
determine if each pick should be accepted or rejected. If a pick is
an outlier, it will have very large residuals for any of the hypocenters
determined from 4 other picks. Also, the scatter in its residuals
will be large. For each pick, binder calculates the median residual (med),
its absolute value (absmed), and the median of the absolute deviations
from the median residual (mad). If either the absmed or mad is greater
than a configured cutoff threshold, the pick is rejected as an outlier.
The cutoff values are configured with the "residual_cut maxabsmed maxmad"
command (default maxabsmed = 3.0 sec, maxmad = 10.0 sec).
Pick-group assessment is a multiple pass process. The first pass samples
from the earliest (up to 25) P-phase picks which meet the testing criteria;
this pass is used to winnow the data to arrivals with median residuals
below a cutoff value. A second pass resamples the winnowed data to
determine the new starting hypocenter from the average of all the 4-pick
hypocenters. This new starting hypocenter and the winnowed list of picks
are handed to binder's locator for further processing.
If any picks were rejected as glitches or outliers by the pick-group
assessment, they are deassociated from the event and are free to restack
or reassociate. Occassionally, binder rejects enough picks that the
event no longer has enough valid phases to relocate it. In this case,
the event is killed, and all of its picks are deassociated.
Examples from Binder's Log File
-------------------------------
1. Stack: five picks stack to create a new event.
2. Glitch Detection in Stacking Phase: a series of picks stack, until
the last pick identifies the group as a glitch.
3. Association: two examples: a pick associates with an active event and a
pick fails to associate or stack.
Trouble-Shooting
----------------
Binder's not producing any earthquakes - what could be wrong?
1. Is it seeing any picks?
Check the log file, you should see some lines like these:
10 6 3 2189 MWB NCVHZ D2 19990930005310.08 431 672 584
19990930_UTC_00:53:15 grid_stack, mhit = 0
The first line is a TYPE_PICK2K message, the second is the log of an
stacking attempt.
2. Is the velocity model specified with "lay depth velocity"?
3. Is the station file in the correct format?
All lats & lons should be positive in degrees & minutes,
with N,S,E,W denoted by letters. If there are no letters,
N and W are assumed.
4. Are the stacking grid boundaries set properly?
Lats & lons are in decimal degree with positive = N, E.
Module Index | Binder_ew Commands
Questions? Issues? Subscribe to the Earthworm Google Groups List.