From nch@roe.ac.uk Sat Apr 17 15:25:52 2004
Date: Fri, 16 Apr 2004 11:48:54 +0100 (BST)
From: Nigel Hambly
To: WFCAM Science Archive Team -- Eckhard Sutorius, Harvey MacGillivray, Ian Bond, Mike Read, Nigel Hambly, Bob Mann
Cc: CCs for WSA weekly meeting minutes distribution -- Clive Davenhall, Andrew Lawrence, Andy Adamson, Peter Shillan, John Taylor, Jim Emerson, Martin Hill, Mike Irwin, Peredur Williams
Subject: WFAU WSA weekly meeting: 16th April 2004

Minutes of WFCAM Science Archive meeting: 16th April 2004
------------------------------------------------------------
------------------------------------------------------------

Present: NCH, IAB, JDT, PMW

Apologies: MAR, MCH, JPE, HMG, JMS, AL, ETWS, GPS, RGM

DONM: 10am, Friday 23rd April 2004, plate library.

Actions discharged:
-------------------

ACTION: NCH, MAR, ETWS, RGM & IAB to have a source merging brainstorm to check over the CU8 procedure on Tues 6th April PM.
Discharged: see Software.

ACTION: NCH, RGM & ETWS to meet Friday 2nd April PM to have a final look over the full SSA website before launch.
Discharged.

Actions partly discharged but continuing:
-----------------------------------------

ACTION: ALL to review risk register and think of any new internal or external risks that should be documented in the risk register.
Deferred until top level plan is explicitly defined. PMW tabled the VISTA risk register as an example.

ACTION: JPE to design and set up centralised VDFS web pages.
Progressing (there is now a link to it from our WSA Twiki).

- continuing; thanks for progressing these.

Actions carried forward from 26/03/04 meeting:
-----------------------------------------------

None.

Specific points and new actions:
--------------------------------

Project management:
Nothing new this week.

Comments and issues arising from CASU fortnightly minutes:
The team noted the minutes of the meeting of the 2nd April; no comments arose.

Networking:
NCH noted that during a lunchtime power cut three weeks ago, the WSA servers were all kept running by their respective UPSs. However, IAB pointed out that it is possible to link the UPS to the servers (at least those running Windows) so that, if the UPS battery is about to run out, it warns the server to shut down cleanly before power is lost. NCH noted that JNTD had also pointed this out, and that this feature should be investigated and exploited if possible.

NCH also noted that despite the continued operation of the servers during the power outage, the network switch to the SRIF connection / JANET backbone access router was, of course, off since it is not UPS protected. This meant that the servers were not accessible from the outside world. JNTD has pointed out that there is no reason why the SRIF switch (or the entire rack of ROE switches) cannot be UPS protected: the power consumption requirements are small compared to those of the large disk arrays. JNTD has suggested that he contact suppliers for a UPS spec and price; NCH suggested that WFAU can at least purchase a UPS for the SRIF switch to ensure high availability of the WSA during short power outages. NCH will liaise with JNTD over these enhancements to the archive system network availability.

Hardware:
NCH noted that ahmose continues to run fine at U160 speed. Meanwhile, Eclipse are doing their best to source an LSI Logic MegaRAID (TM) hardware RAID adaptor to benchmark against the existing SW RAID configuration. The idea is to check performance and stability at U320, and possibly upgrade both servers if results are promising. Unfortunately, the suppliers (who are having to import the adaptors from the USA) have let Eclipse down, so they have yet to take delivery of a card to test. Eclipse have also encountered some high-speed memory instability on other Tyan motherboards, and will investigate whether this is a problem on our server currently in their labs.
NCH has been promised weekly updates on these developments from now on.

Software:
NCH reported that the source merging brainstorm was most valuable; IAB & NCH are now in the thick of coding up and testing the resulting source merging procedure for database-driven source association. This work has resulted in some enhancements to, and simplification of, the DB schemas, which are being implemented by NCH.

IAB reported: "The software tools for source merging has progressed as follows. The basic software unit is a set of code that takes two input file lists and produces two output lists of pair pointers. The code for list input and sorting by declination together with the python bindings have been written. The pairing algorithm is now being implemented."

SSA:
NCH reported that the SSA is now ready to be launched (4 Tbyte of data incorporating SSS for Dec < +3.0 degrees, SDSS EDR & DR1, and 2MASS & USNOB all-sky). The only outstanding problem concerning the DR1 has been solved (thanks to Norbert Purger in Hungary). It turned out that the DB files copied from the SneakerNet box have the OS file attribute "read-only" set by default. This must be switched off in the General properties pane, otherwise SQL Server is unhappy with the files and won't let the DB system admin account do anything to the DR1 database. The plan is to launch the SSA next week on MAR's return, with a splash to UKIDSS and Starlink News.

PMW reported "stress testing" the SSA by simultaneous submission of 16 trawl-type (i.e. unindexed) queries via the web form. In isolation, a trawl takes 15 mins (270 Gbyte scanned at 300 Mbyte/s); concurrently, the 16 queries took between 28 (best) and 116 (worst) mins. However, the main point to note was that the system could cope with this level of usage without falling over.

Miscellaneous:
Nothing else this week.
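
Note on the source merging software unit (see IAB's report under Software):
The unit is described as taking two input lists, sorting them by declination and producing two output lists of pair pointers. The following is only a minimal pure-Python sketch of that idea, not the WSA code. It assumes that each source is an (RA, Dec) tuple in degrees, that a "pair pointer" is the index of the nearest source in the other list within a pairing radius (or -1 if there is none), and that RA wrap-around needs no special handling because the haversine formula deals with it. The names angular_sep_deg, nearest_pointers and pair_lists are hypothetical.

```python
import bisect
import math


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def nearest_pointers(src, ref, radius_deg):
    """For each source in src, return the index of the nearest source in ref
    within radius_deg, or -1 if there is none.

    ref must already be sorted by declination. Since the angular separation
    is never smaller than the declination difference, only the declination
    strip dec +/- radius_deg has to be searched.
    """
    ref_decs = [dec for _, dec in ref]
    pointers = []
    for ra, dec in src:
        lo = bisect.bisect_left(ref_decs, dec - radius_deg)
        hi = bisect.bisect_right(ref_decs, dec + radius_deg)
        best, best_sep = -1, radius_deg
        for j in range(lo, hi):
            sep = angular_sep_deg(ra, dec, ref[j][0], ref[j][1])
            if sep <= best_sep:
                best, best_sep = j, sep
        pointers.append(best)
    return pointers


def pair_lists(list_a, list_b, radius_deg):
    """Sort both lists by declination and return them together with the
    two pointer lists (A -> B and B -> A). Pointers index the sorted lists."""
    a = sorted(list_a, key=lambda s: s[1])
    b = sorted(list_b, key=lambda s: s[1])
    return a, b, nearest_pointers(a, b, radius_deg), nearest_pointers(b, a, radius_deg)
```

Calling pair_lists(list_a, list_b, 1.0 / 3600.0) pairs the two lists within an assumed 1 arcsec radius. Keeping a pointer list in each direction makes it possible to tell mutual nearest neighbours apart from one-sided matches; whether the real unit does this is not stated in the report.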