CONCIERGE

CONTENTS

OVERVIEW
WARNINGS
ORA_TRACE_REQ
ORA_TRACE_FETCH
DETERMINING RETRIEVAL TIMES / SCHEDULING
INSTALLING CONCIERGE
CONFIGURING CONCIERGE
KNOWN ISSUES
BLAME


OVERVIEW

HISTORY
PURPOSE
OUTPUTS
SUB-TOPICS

HISTORY

Concierge is the name given to a project for archiving waveform data to the Earthworm Database. Concierge is the latest waveform archiving scheme, and the latest evolution of the XXX_trace_save family of modules.

There have been many XXX_trace_save modules. The first incarnation of an XXX_trace_save module related to Oracle was db_trace_save written by Lynn Dietz.

Then came ora_trace_save, a mutated version of db_trace_save possibly written by Alex B.

Then came ora_trace_save2, a mutated version of ora_trace_save modified by David K. ora_trace_save2 was later renamed ora_trace_save.

Now, there is finally Concierge. Concierge represents the ultimate evolution of XXX_trace_save (at least until the next one comes along).

The goal of all of the XXX_trace_save modules has been to select and archive (into some medium) waveform data that appears to be of interest. The XXX_trace_save modules have worked primarily off of Earthworm trigger messages. The trigger message is a simple chunk of data that indicates that some sort of eventish thing occurred, and that as a result someone wants to save data for certain channels for certain periods of time.

The db related XXX_trace_save modules have focused on archiving the trace data from a wave_server into an Earthworm Database.

PURPOSE

The latest incarnation of XXX_trace_save for the db is called Concierge. Concierge is the name given to a new method of waveform archiving. The point of Concierge is to break up the task of "archiving selected waveforms" into its two logical subtasks: 1) the selecting of waveforms to archive, and 2) the actual archiving of the selected waveforms to the DB. All previous versions of XXX_trace_save had coupled the two subtasks into a single module. This single module ends up being an imperfect retriever, because it can make only one attempt at the retrieval, and is thus subject to incomplete data retrievals due to large snippets, networking problems, and wave_server errors.

By subdividing the main task into two, Concierge is able to make multiple attempts at archiving the data, and can provide incremental results that improve over time.

OUTPUTS

There are three tangible outputs of the Concierge project:
  1. ora_trace_req - a program that turns EW trigger messages into snippet requests that it records in the database.
    (see ora_trace_req documentation for more info)

  2. ora_trace_fetch - a program that reads snippet requests from the database and attempts to retrieve waveform data from wave_servers and archive the retrieved data into the database.
    (see ora_trace_fetch documentation for more info)

  3. EWDB_WaveformRequest API - This is a collection of EWDB_API library code used by ora_trace_req, ora_trace_fetch, and potentially other programs to read/write/update snippet requests in the database. Application developers wanting to request snippets via the Concierge system can utilize this API.
    (see EWDB Waveform Request API documentation for more info)

SUB-TOPICS

There are three sub-topics of interest when trying to understand how the Concierge works, and how it can be used. Those three topics are: 1)How you specify snippets to request; 2)How the snippet requests get fulfilled; and 3)When the snippet requests get fulfilled.
  1. How you specify snippets to request (ora_trace_req)
    Literally, snippet requests are specified by calling an EWDB API function that records a request in the database. For most folks who aren't looking to write their own specialty programs every time they want to do something, the default solution is to use ora_trace_req. ora_trace_req is an Earthworm module that converts Earthworm trigger messages (from a ring) to snippet requests, and stores those snippet requests in the database. This leaves you with the requirement of converting your "desired snippets" list into one or more Earthworm trigger messages. This can be done with an editor, or you can teach automated programs to write trigger messages. (See Earthworm Trigger Message documentation for more info.) If you generate a trigger message by hand, then you can insert it into an Earthworm ring using the "file2ring" utility. Carltrig and HypoInverse/arc2trigII already generate trigger messages.


  2. How the snippet requests get fulfilled (ora_trace_fetch)
    Snippet requests are fulfilled by an Earthworm module (ora_trace_fetch) that periodically queries the given database for a list of requests. For each request, it queries wave servers for data that matches the request, and then stores the retrieved data in the database. Lastly it updates or deletes the request based upon the results of its data retrieval.


  3. When the snippet requests get fulfilled (Scheduling/Retrieval)
    The fulfillment of the snippet requests is a 3-layer ordeal; see the detailed section on Scheduling/Retrieval for more information on scheduling and retrieval times.


WARNINGS

WARNING!!! As many ora_trace_req and ora_trace_fetch modules as desired may be run, but beware of duplicates.
Note: Currently, if the same trigger message is submitted to multiple ora_trace_req modules writing to the same database, then multiple copies of each request will be generated, and thus multiple copies of the same snippet will be generated.



MODULES

ORA_TRACE_REQ

OVERVIEW
ora_trace_req is an Earthworm module that converts Earthworm trigger messages to waveform snippet requests. The program utilizes Earthworm library code to read trigger messages off of an Earthworm Ring and parse them. It then uses EWDB API code to store the contents of the trigger messages in an EW Database.

DETAILS
ora_trace_req utilizes Earthworm "transport layer" library code to read (TYPE_TRIGLIST2K) messages from an Earthworm ring. It then uses "trigger parsing" Earthworm library code (parse_trig) to parse the trigger messages. It obtains the message author and the Event ID. It then verifies/creates an Event in the EW Database, associated with the trigger's Author/EventID combination. (For more information on Authors, please see documentation on EW Trigger Messages and Authors).

ora_trace_req then parses each request line in the trigger message, again via the parse_trig library and converts each line to a snippet request. It then inserts each snippet request into the EW Database via the Concierge (Waveform Request) EWDB API library code.

All inserted snippet_requests are flagged, so that when the actual snippet is retrieved(by ora_trace_fetch), that snippet will be associated with the trigger's DBEvent.

ora_trace_req records that a snippet is requested, and can optionally associate it with a Request Group, however ora_trace_req does not schedule the request for retrieval. This is done by the retrieval module itself. (Since requests generated by ora_trace_req are new, they default to being scheduled for immediate retrieval.)

Note: ora_trace_req formerly controlled the scheduling of the snippet request for retrieval. Scheduling is now handled exclusively within the retrieval program (ora_trace_fetch), which supports multiple and potentially programmable retrieval strategies.

ORA_TRACE_FETCH

Overview
Details
Partial Success
Retrieval Modes

OVERVIEW
ora_trace_fetch is the back half of Concierge. It is an Earthworm module that reads snippet requests from the database, attempts to retrieve the requested snippets from wave servers, and then archives the snippets into an EW database.

DETAILS
ora_trace_fetch is an Earthworm module that attempts to fulfill snippet requests. It reads snippet requests from the database, attempts to retrieve waveforms from wave servers, and then archives the waveforms into the database.
ora_trace_fetch utilizes Concierge (EWDB Waveform Request) API code to read snippet requests from an EW Database. It then utilizes Earthworm wave_server client (ws_clientIII) routines to attempt to retrieve data from one or more wave servers(wave_serverV). It then records the results of its attempt via EWDB Concierge API code and EWDB Waveform API code.

ora_trace_fetch now controls the scheduling of retrieval attempts for snippet requests. ora_trace_fetch contains a source file(schedule.c) that provides potentially multiple scheduling algorithms. An algorithm is chosen via the config file(RetrievalSchedulingMethod command).

There is some trickery about how often ora_trace_fetch reads the table of requests, and when/how often it attempts to fulfill each request; this depends on the retrieval mode and the scheduling algorithm, both described below.

PARTIAL SUCCESS
The beauty of ora_trace_fetch and Concierge, when compared to the previous XXX_trace_saves for the DBMS, is that the requested snippets can be processed multiple times, and snippets can be built incrementally. When ora_trace_fetch experiences a partial success in the fulfillment of a request, it writes the waveform data that it WAS able to obtain to the database as a Snippet, and then sets a link from the request to that Snippet. It then adjusts the request so that it contains only that portion of the original request that was not obtained previously.

Example: ora_trace_fetch retrieves a snippet request from the database, asking for data for AAA,EHZ,XX for the time range (0 - 100).
ora_trace_fetch attempts to retrieve the requested data. It obtains data from 0 - 20. It creates a snippet(123) in the EW DB for AAA,EHZ,XX (0 - 20). It then updates the request so that it is: (AAA,EHZ,XX : 20 - 100, w/Snippet 123). ora_trace_fetch reschedules the request for processing at a later time.

At a later time, ora_trace_fetch retrieves the snippet request from the database again. Now the request asks for data for AAA,EHZ,XX for the time range (20 - 100), and indicates that there is an existing snippet(123).
ora_trace_fetch obtains data from 20 - 45 for the channel from a wave_server. ora_trace_fetch retrieves the existing snippet(123) from the database, and updates it so that it contains data for (AAA,EHZ,XX (0 - 45)). ora_trace_fetch also updates the request so that it is: (AAA,EHZ,XX : 45 - 100, w/Snippet 123). ora_trace_fetch reschedules the request for processing at a later time.

At a later time, ora_trace_fetch retrieves the snippet request from the database again. Now the request asks for data for AAA,EHZ,XX for the time range (45 - 100), and indicates that there is an existing snippet(123).
ora_trace_fetch is unable to obtain new data for this request(maybe the network was down).
ora_trace_fetch reschedules the request for processing at a later time.

At a later time, ora_trace_fetch retrieves the snippet request from the database again. Now the request asks for data for AAA,EHZ,XX for the time range (45 - 100), and indicates that there is an existing snippet(123).
ora_trace_fetch obtains data from 45 - 100 for the channel from a wave_server. ora_trace_fetch retrieves the existing snippet(123) from the database, and updates it so that it contains data for (AAA,EHZ,XX (0 - 100)). ora_trace_fetch deletes the request because it has been completely fulfilled.
(END OF STORY)
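The bookkeeping in the story above can be sketched in a few lines of Python. This is an illustrative model only: the class and function below are invented for the sketch and are not the real EWDB Waveform Request API, and the actual merging of waveform bytes into the snippet happens in the database.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SnippetRequest:
    """Hypothetical stand-in for a snippet request row in the EW DB."""
    sta: str
    chan: str
    net: str
    t_start: float                    # start of the still-unfulfilled range
    t_end: float                      # end of the requested range
    snippet_id: Optional[int] = None  # snippet built so far, if any

def apply_fetch_result(req: SnippetRequest,
                       got_end: Optional[float],
                       new_snippet_id: Optional[int] = None) -> bool:
    """Record one retrieval attempt.  got_end is the end of the contiguous
    data obtained from the wave_server (None if nothing was obtained).
    Returns True when the request is completely fulfilled and can be
    deleted; False means it should be rescheduled for a later attempt."""
    if got_end is None:            # total failure: leave the request as-is
        return False
    if req.snippet_id is None:     # first success creates the snippet
        req.snippet_id = new_snippet_id
    req.t_start = got_end          # only the remainder stays in the request
    return req.t_start >= req.t_end
```

Replaying the AAA,EHZ,XX story: a request for (0 - 100) is trimmed by successive fetches to (20 - 100), then (45 - 100), and finally fulfilled.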

Overriding the Scheduling Algorithm
This is no longer applicable, as all scheduling is done within ora_trace_fetch. Check the config file and find the algorithm that best suits your needs. If none work for you, write your own!

RETRIEVAL MODES

ora_trace_fetch mainly follows the retrieval schedule set forth by its scheduling algorithm. However, like all good software (sure....), there are some exceptions.
ora_trace_fetch can be run in three modes:
  1. STANDARD_DATA_MODE
    The program runs as efficiently as possible in terms of CPU and DB resources. The program will only check the list of snippet_requests in the database when it thinks it is time to process one. It figures out when there will be a request based upon its knowledge of what is already in the database.
  2. ALWAYS_CHECK_LIST
    The program will always check the list of snippet_requests in the database, whether or not it expects to find any scheduled for retrieval. This mode is DB license intensive, but is not very CPU or DB intensive, and is no more wave_server intensive than the standard mode.
  3. ALWAYS_GET_SNIPPETS
    The program will attempt to process all snippet_requests in the database, whether they are scheduled for processing or not. This mode is CPU intensive, DB intensive, and wave_server intensive.

These three modes describe how ora_trace_fetch behaves each time it runs through the main processing loop. When we say that in one mode the program will always check the list of requests, or always process any existing requests, we mean that the program behaves in the described way on each pass through the main processing loop.
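The per-cycle decisions described above can be sketched as follows. The mode names come from this document; the two functions are invented for illustration and are not the real ora_trace_fetch internals.

```python
from enum import Enum

class Mode(Enum):
    STANDARD_DATA_MODE = 1
    ALWAYS_CHECK_LIST = 2
    ALWAYS_GET_SNIPPETS = 3

def should_query_request_list(mode: Mode, now: float,
                              t_next_expected: float,
                              trigger_seen: bool) -> bool:
    """Decide whether to read the snippet_request list this cycle."""
    if mode is Mode.STANDARD_DATA_MODE:
        # only query when a request is believed due, or a trigger
        # message just arrived to wake the program up
        return trigger_seen or now >= t_next_expected
    return True            # the other two modes query every cycle

def process_even_if_not_due(mode: Mode) -> bool:
    """Only ALWAYS_GET_SNIPPETS processes requests not yet scheduled."""
    return mode is Mode.ALWAYS_GET_SNIPPETS
```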

By running in the third mode, you can force ora_trace_fetch to constantly process snippet requests. This can be useful if you are requesting large snippets that contain both P & S arrivals from shaking emanating from the other side of the globe. In such a case, you will want data early on so that P arrivals can be viewed, but you also want the snippet to be incrementally improved, so that later-arriving waves can also be viewed and processed as soon as they are available.

For more detailed information on Modes, see the Program Modes section in the Programmer Notes.

In addition to the various modes of operation, the behavior of ora_trace_fetch is altered by the reception of a trigger message. Whenever ora_trace_fetch receives a TYPE_TRIGLIST2K message, it will immediately check the request list again to see if there are any requests to be processed. This behavior modification only has an effect when the program is running in STANDARD_DATA_MODE, because the other two modes already do this during each cycle of the main loop.

DETERMINING RETRIEVAL TIMES


SCHEDULING SNIPPET REQUESTS

All snippet requests are scheduled by ora_trace_fetch for retrieval. This can only be overridden by running ora_trace_fetch in a different retrieval mode.

There is currently one scheduling algorithm provided by ora_trace_fetch. It is described below.

NOTE: Despite the impressions given in the following examples, the scheduling algorithms do not schedule tNextAttempt directly. They schedule tDeltaNextAttempt; that is, they calculate the time between now and when the next attempt should take place, and they pass that value to the EWDB Waveform Request API, which calculates the current time and adds tDeltaNextAttempt to it to produce the actual tNextAttempt. This prevents algorithms from scheduling processing in the past.
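The delta-based handoff can be pictured as below. The function name is invented for the sketch; the clamp to non-negative deltas is an assumption about the API's behavior, the documented guarantee being only that the delta is added to the current time.

```python
import time

def schedule_next_attempt(t_delta_next_attempt: float, now=None) -> float:
    """Sketch of the handoff: the algorithm supplies tDeltaNextAttempt
    (seconds from now); the EWDB Waveform Request API adds it to the
    current time to produce the absolute tNextAttempt.  A non-negative
    delta can therefore never schedule an attempt in the past."""
    if now is None:          # the 'now' parameter exists for testability
        now = time.time()
    return now + max(t_delta_next_attempt, 0.0)  # clamp is an assumption
```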

SCHEDULE EXPONENTIAL

SCHEDULE EXPONENTIAL is the default scheduling mechanism, and as of this writing, the only one that exists. SCHEDULE EXPONENTIAL utilizes an exponentially increasing time scale for attempting processing of snippet requests.

By default, SCHEDULE EXPONENTIAL is set up to attempt retrieval 11 times over a 7 day period. The timing of the retrieval attempts is approximately as follows:
Approximate time of attempts (after generation of request):
0(m)inutes, 10m, 30m, 1(h)our, 2.5h, 5h, 10h, 21h, 2(d)ays, 3.5d, 7d

There is a caveat (at least one). If the request results in a partial success, SCHEDULE EXPONENTIAL will record the partial success and then furlough the request attempt for a few minutes (10), and then try again, without having recorded that the attempt was made.

The Details
When a snippet request is first handled by SCHEDULE EXPONENTIAL, it is assigned a set of scheduling parameters that are used by its scheduling algorithm to calculate an exponentially increasing time series.
The current algorithm has 5 variables.

The algorithm for calculating the next attempt-time is:

  tNextAttempt = tCurrentTime + tAttemptInterval *
                 (dAttemptMultiple ** (iNumAttempts - iNumRemainingAttempts - 1))

The process of calculating the next attempt time is illustrated by the following example.

REQUEST CREATION
A request for ABC, EHZ, XX is created by ora_trace_req. iNumAttempts is set to 0. Because this is the creation of the request, tNextAttempt is set to 0. (Note that the current time is 1000000000.)

RETRIEVAL ATTEMPTS
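The worked numbers for this example were not preserved in this copy of the document. Under assumed default parameters consistent with the approximate table above (tAttemptInterval = 600 s, dAttemptMultiple = 2, 11 total attempts), the attempt times can be reproduced as follows. The iNumAttempts/iNumRemainingAttempts bookkeeping above is ambiguous, so this sketch simply enumerates the exponents 0 through 9 directly.

```python
def attempt_times(t_attempt_interval: float = 600.0,  # 10 minutes (assumed)
                  d_attempt_multiple: float = 2.0,    # (assumed)
                  i_num_attempts: int = 11) -> list:
    """Cumulative attempt times, in seconds after request creation,
    under the SCHEDULE EXPONENTIAL formula."""
    times = [0.0]      # the first attempt is immediate (tNextAttempt = 0)
    t = 0.0
    for k in range(i_num_attempts - 1):
        # delta before attempt k+2: tAttemptInterval * dAttemptMultiple ** k
        t += t_attempt_interval * d_attempt_multiple ** k
        times.append(t)
    return times

# With the request created at 1000000000 (as in the example), attempts
# fall at 1000000000 plus each of these offsets:
# 0, 600 (10m), 1800 (30m), 4200 (~1h), 9000 (2.5h), 18600 (~5h),
# 37800 (~10.5h), 76200 (~21h), 153000 (~1.8d), 306600 (~3.5d),
# 613800 (~7.1d) -- matching the approximate table above.
```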



INSTALLING CONCIERGE

Database

Concierge requires an Earthworm Database version > 6.1. If you are upgrading your Earthworm Database from Version 6.1 or prior, you will need to run some SQL scripts, in order to add the Concierge functionality to your database. You should run the upgrade scripts for version 6.2 or greater. See documentation for upgrading your Earthworm Database. If you are starting with a post v6.1 Earthworm Database, then support for Concierge should be built in.

Executables

Concierge consists of two executables:
ora_trace_req, and ora_trace_fetch. These executables are designed to run as earthworm modules.

CONFIGURING CONCIERGE

There are many ways that you can configure Concierge. Once the database is installed and the executables are available, configuration is a matter of choosing which executables will run where.

ora_trace_req

In general you want one copy of ora_trace_req running each place you have a unique trigger feed. For most installations this means running one copy of ora_trace_req attached to a "trigger ring" on an earthworm where all of your trigger messages eventually end up.
As of this time, if you run two copies of ora_trace_req that see overlapping/duplicate sets of triggers, then you will end up with overlapping/duplicate sets of requests and thus overlapping/duplicate sets of snippets in the database.

ora_trace_fetch

There are many options for how to configure ora_trace_fetch. ora_trace_fetch does not have to run attached to the same earthworm as ora_trace_req. They communicate only through the database. In general ora_trace_fetch only attaches to an earthworm ring so that it can obtain status reporting and restarting functionality through the Earthworm mechanisms. You can run ora_trace_fetch as a stand alone executable by linking it with the transport_dk.c library in the ora_trace_fetch directory in lieu of the standard transport.c library in libsrc.
NOTE: If you configure ora_trace_fetch to run in Mode 1 (STANDARD_DATA_MODE), then you will probably want to hook it up to an Earthworm ring that has a "trigger" feed. In STANDARD_DATA_MODE, ora_trace_fetch relies on trigger messages to wake it up when there is new activity.

Simple
The simplest way to configure ora_trace_fetch is to run one copy. Either run ora_trace_fetch in STANDARD_DATA_MODE and attach it to a ring where it will get Trigger Messages, or run it anywhere in ALWAYS_CHECK_LIST mode. Ensure that it has a large MaxTraces buffer for holding snippet requests, and let it run.

OTHER CONFIGURATIONS
If you are processing a lot of snippet requests, your wave_servers are not all local or homogeneous in response time, requests are of varying priority, you want redundancy, you have CPU power to kill, or you have other special needs, you may decide that the Simple configuration is not for you. At that point you have a plethora of configuration options.

COMPETITIVE vs. COMPLEMENTARY

When running multiple copies of ora_trace_fetch, the copies can be run in one of two manners: Competitive or Complementary. Competitive means that two copies of ora_trace_fetch are seeing the same data and competing to process the same requests. Complementary means that you are subdividing the request in such a manner that the two copies of ora_trace_fetch are seeing separate subsets of the master set of requests. The two never see a mutual request.

Dueling Banjos (COMPETITIVE)
If you've got CPU power to burn, a lot of requests to handle, a desire for redundancy, or you just like a brute-force approach, then setting up competitive copies of ora_trace_fetch is for you. Ensure that all competing copies of ora_trace_fetch have access to the same wave_servers; otherwise one wave_server might have the data you want, but the ora_trace_fetch that picked up the request doesn't know about that wave_server, and you don't get the data. When running competing copies of ora_trace_fetch, they should be configured with small MaxTrace buffers for storing requests. That way the requests spend most of their time in the database, where they can be picked up by the first available executable. The competing ora_trace_fetch's do not need to run on the same system. They do not talk to each other at all; all competition issues are handled by a locking mechanism within the Database.
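The claim-a-request pattern behind that locking mechanism can be sketched with an in-memory SQLite table. The table and column names (snippet_request, locked_by) are hypothetical; the real schema and lock implementation live inside the EW Database.

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    """In-memory stand-in for the request table in the EW Database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE snippet_request "
                 "(id INTEGER PRIMARY KEY, locked_by TEXT)")
    conn.execute("INSERT INTO snippet_request (id) VALUES (1)")
    conn.commit()
    return conn

def claim_request(conn: sqlite3.Connection,
                  req_id: int, fetcher_id: str) -> bool:
    """Try to lock a request; True iff this fetcher won the race.
    The UPDATE succeeds for exactly one caller because the
    'locked_by IS NULL' predicate is checked atomically."""
    cur = conn.execute("UPDATE snippet_request SET locked_by = ? "
                       "WHERE id = ? AND locked_by IS NULL",
                       (fetcher_id, req_id))
    conn.commit()
    return cur.rowcount == 1
```

Exactly one competing fetcher's UPDATE can match the `locked_by IS NULL` predicate, so only one claim succeeds. The sketch also shows how the Frozen Locks issue (see KNOWN ISSUES) arises: if the winner dies before clearing locked_by, the row stays claimed forever.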

Barbershop Quartet (COMPLEMENTARY)
If you've got wave_servers with varying response speeds, or requests of varying priorities, you can group requests and then run complementary copies of ora_trace_fetch. This method involves using RequestGroups. You can assign a request to a request group, either within ora_trace_req or within ora_trace_fetch, and then run multiple complementary copies of ora_trace_fetch that each operate on a different request group. This allows you to separate requests by wave_server or by priority. For example, you might have requests for local data and requests for remote wave_servers with bad network connections. You would like to get the remote data, but you don't want it, or the quality of the network between you and the remote wave_server, to affect the retrieval of your local data. You configure the requests such that the local ones go into one group and the remote ones go into another.
Now for the bad news: This method isn't well tested and is not designed to operate out of the box. The only EXISTING method for setting the RequestGroup of a request is via the ora_trace_req config file. To use this effectively you would have to presort your trigger messages, so that trigger messages for local data went to one copy of ora_trace_req and trigger messages for remote data went to another copy. A better alternative is to write a new scheduling strategy for ora_trace_fetch that examines a request and assigns it to a RequestGroup. (That however would require you to WRITE a strategy to use in place of SCHEDULE EXPONENTIAL). See programmer_notes if you are interested.

East meets West (COMPETITIVE & COMPLEMENTARY)
There is nothing in the competitive/complementary methodologies that prevents you from mixing the two. You might have needs that require you to run complementary fetchers, but at the same time you want redundancy. Fine: just run a set of competitive ora_trace_fetch's in lieu of one for processing a particular request group. It's all good. Knock yourself silly. This is where the many permutations actually come from. By combining COMPETITIVE & COMPLEMENTARY mechanisms you can build yourself an army of fetchers.


KNOWN ISSUES

FROZEN LOCKS
A locking mechanism within the DB handles locking of snippet requests, so that two copies of ora_trace_fetch cannot attempt to process the same request at the same time. For the most part (broad generalization) the locking mechanism works well; however, if a copy of ora_trace_fetch hangs/dies/core_dumps before releasing the locks it holds, then the locks stay permanently locked. Clearing old locks is as simple as executing a SQL script; however, this is officially frowned upon, and a simple program that clears old locks is in the works.
THERE IS NOW A STAND-ALONE EXECUTABLE "unlocker" that fixes this problem. Please see its documentation.

DUPLICATE SNIPPETS
There is currently no mechanism that attempts to prevent duplicate snippet requests or duplicate snippets. If a request gets generated three times and everything goes well, you will end up with three duplicate snippets in the database.

BLAME

ORIGINAL BLAME
DK 022502

MOST RECENT BLAME
Last Updated by DK 2002/05/15