Minutes of WFCAM Science Archive meeting: 5th September 2003
------------------------------------------------------------
------------------------------------------------------------

Present: ETWS, JMS, IAB, NCH, RGM
Apologies: MAR, AL, HMG, PMW

Actions discharged:
-------------------

ACTION: NCH & IAB to test status/log capture using XMLRPC.
(IAB has coded up generic threader modules under python; NCH and IAB
have tested CU16 using this. Error bus communication (task status) and
log information are captured and communicated via XMLRPC as a tuple to
the curation client. This all works well and we intend to follow this
model for implementation of CU tasks on the curation client / load
server architecture).

- all done and discharged (thanks to all concerned!)

Actions partly discharged but continuing:
-----------------------------------------

ACTION: ALL to review risk register and think of any new internal or
external risks that should be documented in the risk register.

- continuing; thanks for progressing these.

Actions carried forward from 22/08/03 meeting:
----------------------------------------------

ACTION: ETWS to update TWiki networking pages with new test results.

ACTION: PMW to liaise with AL to make sure that Compute Support are
made aware, from the highest "official" level, of our requirements,
re. 100 Gbyte/day transfer from CASU, and to impress on them that the
1 Gbit/s FW should be their highest priority.

- CONTINUES

Specific points and new actions:
--------------------------------

Project management: NCH reported that he, PMW and JMS had attended the
VDFS monthly telecon management meeting on Wed; WFAU progress on WSA
was well received. The next monthly meeting will be a face-to-face
(probably at ROE) and then there is a GSC oversight review pencilled in
for the beginning of October. JPE will be at ROE next week and will
attend next week's WFAU weekly meeting.

Hardware: HMG & PMW reported (via NCH) that the first pixel file node
for the WSA is now ordered (4 TB capacity based around SATA IDE 250 GB
disks and 3Ware hardware RAID controllers).

NCH reported experiencing problems with the disk arrays recently
purchased from Eclipse. Unidentified errors, starting with parity check
warnings, progressing through unrecoverable write errors and
culminating in volume failure, are being experienced intermittently.
Eclipse are aware of the problem and are attending on the basis of the
8-hr daily call-out. Currently, one catalogue server is working fine,
but the other has lost one 1.2 TB logical volume again (see SSA below).

LTO-2 backup has been exercised, used and verified (for SSA
intermediate file backup). A backup run of the 1.2 Tbyte intermediate
files, employing hardware compression, resulted in 4 Ultrium-2 tapes
being written in around 6 hours. Verification and a reread of a backup
file were fine.

The 1 Gb/s WSA LAN connectivity has been verified via file transfer
tests between the two Windows catalogue servers. Burst transfer speed
is as high as 800 Mb/s (80% of the specified bandwidth), while more
realistic sustained rates for large files are between 300 and 500 Mb/s;
clearly the W2003 Enterprise Server and Gb/s switch configuration is
working well.

Software: Further tests with XMLRPC (specifically log and status
capture) have been successfully done by IAB & NCH. NCH suggested that
he and IAB should meet offline to discuss further progression of WSA
curation software.

ACTION: NCH & IAB to meet offline to discuss the way forward for
implementation of specific tasks now that development of generic
wrappers and modules is coming to a close.

ETWS reported that the transmogrification of 2MASS files into
intermediate CSVs for ingestion into SQL Server will be complete by the
beginning of next week; ETWS & NCH have tested ingestion (and choice of
default values where IPAC have nulls) and all is well so far.

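As an illustration only, the substitution of default values for nulls
during conversion to CSV might be sketched as below. This is not the
script used for the 2MASS files: the "|" delimiter, the "\N" null
token, the default values and all function names are assumptions made
for the example.

```python
import csv
import io

# Assumptions (not taken from the minutes): the source is "|"-delimited
# text with "\N" marking nulls, and these defaults are placeholders.
NULL_TOKEN = r"\N"
DEFAULTS = {"float": "-9.999995e8", "int": "-99999999", "str": "NONE"}


def to_csv_row(fields, column_types):
    """Replace each null token with the default for its column type."""
    if len(fields) != len(column_types):
        raise ValueError("field count does not match column count")
    return [
        DEFAULTS[ctype] if value == NULL_TOKEN else value
        for value, ctype in zip(fields, column_types)
    ]


def convert(lines, column_types):
    """Convert delimited source lines into CSV text for bulk loading."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        fields = line.rstrip("\n").split("|")
        writer.writerow(to_csv_row(fields, column_types))
    return out.getvalue()
```

Here convert() would be called with the lines of one source file and a
list of column types in file order. Keeping the defaults in one table
means a change of default touches a single place before re-ingestion.
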
ETWS has implemented computation of HTMIDs and unit vectors for these
files, obviating the need for costly computation within SQL Server
itself. ETWS also reported working on the documentation schema parsing
scripts given feedback from MAR and incorporation of this into the
proto-SSA web documentation.

Networking: Nothing new to report this week.

SSA: NCH reported good and bad news... (see also hardware above). The
good news is that intermediate file transfer is complete and the files
backed up and verified. Initial loading has taken place on one server:
the load speeds are superb - 10% of the SSA has been ingested into SQL
Server on amenhotep, and the process took 2h40m (this equates to nearly
half a billion records in 160 minutes, or 10 Mbytes/s ingest, &
includes attachment of primary keys after loading). NCH has optimised
the loading procedure and the correct T-SQL incantations are
encapsulated in a load script in the CVS.

RGM asked about the availability of SDSS data on the new hardware for
exercising cross-neighbour tests.

ACTION: NCH to communicate readiness of 10% SSA for indexing asap to
RGM (pending diagnosis of disk problems on ahmose)

ACTION: RGM to consult with MAR on exporting SkyServer EDR from
grendel01 into a transferable file to set up on one of the load
servers; and to enquire with JHU as to location of the DR1 "sneaker
net" box.

MAR reported production of prototype SSA web documentation available at
URL http://grendel12.roe.ac.uk/~avo/ssa/index.html and asked all to
look and comment.

ACTION: EVERYONE to examine prototype documentation, and to send
comments and suggestions to MAR.

NCH suggested that we need an SQL cookbook tutorial based around the
existing 20 queries; RGM volunteered to progress this.

ACTION: RGM to progress design of a web cookbook based around the 20
Queries.

MAR has also asked for suggestions as to the URL name of the WFAU
Science Archives (eg. the SSS server cosaxp6 is known as
www-wfau.roe.ac.uk to the outside world).
Presently we have had the following suggestions for the web server
(thoth) URL:

a) skycats.roe.ac.uk;
b) dataserver.roe.ac.uk;
c) datacentre.roe.ac.uk.

More suggestions and/or votes for these are most welcome from anybody.
NCH pointed out that skycats may not be a good choice (eg. ESO browser
tool is known as SkyCAT etc.).

Miscellaneous:

ACTION: NCH to type up and circulate these minutes.

DONM: 10am, Friday 12 September, plate library.