Julius Hrivnac Weblog: New Tag Extract Architecture

Monday, July 26, 2010

The user requests an extraction job via either ELSSI or the CLI. She may already have a JobConfig XML file or a jcid reference to an existing one, or she can construct a new configuration (via ELSSI).
ELSSI or the CLI uploads the JobConfig XML file to Athenaeum Manager and gets the corresponding jcid back.
Athenaeum Manager stores the JobConfig XML file in its database.
ELSSI or the CLI requests execution of a job using the jcid reference.
Athenaeum Manager gets the JobConfig XML file from the database and converts it, using XSLTs, to a Python command and a CollAppend XML ArgList.
Athenaeum Manager asks Athenaeum PRUN Worker to execute the created Python command.
Athenaeum PRUN Worker spawns one or more parallel subprocesses to perform the execution.
When the job finishes, it calls Athenaeum Manager to announce the result.
Athenaeum Manager sends an email to the user with information about the run and links to Web pages with the results. (The user can already look at the job status during the run.)
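
The steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not the Athenaeum code: the class `Manager`, the function `run_worker`, and all method names are hypothetical, the XSLT conversion is replaced by a placeholder command, and the worker runs synchronously here for simplicity.

```python
import subprocess
import sys
import uuid


class Manager:
    """Hypothetical stand-in for Athenaeum Manager (all names are illustrative)."""

    def __init__(self):
        self.job_configs = {}  # jcid -> JobConfig XML text (the "database")
        self.jobs = {}         # pid -> job record
        self.outbox = []       # notifications that would be sent by email

    def upload(self, xml_text):
        """Store a JobConfig XML file and hand back its jcid."""
        jcid = uuid.uuid4().hex
        self.job_configs[jcid] = xml_text
        return jcid

    def build_command(self, jcid):
        """Placeholder for the XSLT step. The real system derives a Python
        command and a CollAppend ArgList from the XML; this sketch only
        checks that the jcid exists and returns a dummy command."""
        self.job_configs[jcid]  # raises KeyError for an unknown jcid
        return [sys.executable, "-c", "print('extracting')"]

    def execute(self, jcid, user):
        """Create a job for a stored configuration and pass it to the worker."""
        pid = uuid.uuid4().hex
        self.jobs[pid] = {"jcid": jcid, "user": user, "status": "running"}
        run_worker(self, pid, self.build_command(jcid))
        return pid

    def job_finished(self, pid, returncode):
        """Callback used by the worker to announce the result."""
        job = self.jobs[pid]
        job["status"] = "done" if returncode == 0 else "failed"
        self.outbox.append((job["user"], pid, job["status"]))


def run_worker(manager, pid, command):
    """Hypothetical worker: run the command in a subprocess, then report back."""
    result = subprocess.run(command, capture_output=True, text=True)
    manager.job_finished(pid, result.returncode)
```

A client would call `upload()` once to obtain a jcid and then `execute(jcid, user)` to start a job. The entry appended to `outbox` stands in for the email sent at the end of the run.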

There are two identifiers involved:

jcid identifies the JobConfiguration XML file, stored in the Athenaeum Manager database
pid identifies the actual extraction job and its results
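
The relation between the two identifiers can be pictured as two small lookup tables. The sketch below assumes that one stored JobConfig may be reused by several jobs, which seems to follow from the possibility of starting a job from an existing jcid. The identifier values and the helper `jobs_for_config` are made up for illustration.

```python
# jcid -> stored JobConfig XML (hypothetical content)
job_configs = {"jc-1": "<JobConfig/>"}

# pid -> jcid of the configuration that the job was started from
jobs = {"p-1": "jc-1", "p-2": "jc-1"}


def jobs_for_config(jobs, jcid):
    """Return the pids of all jobs that were run with the given jcid."""
    return sorted(pid for pid, used in jobs.items() if used == jcid)
```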

Running and finished jobs are visible on the Jobs page. Existing JobConfig files can be inspected on the JobConfig-List page. All Athenaeum management is available from its main (development) page.
