From nch@roe.ac.uk Fri Apr 23 13:10:56 2004
Date: Fri, 23 Apr 2004 12:32:29 +0100 (BST)
From: Nigel Hambly
To: WFCAM Science Archive Team -- Eckhard Sutorius, Harvey MacGillivray,
    Ian Bond, Mike Read, Nigel Hambly, Bob Mann
Cc: CCs for WSA weekly meeting minutes distribution -- Clive Davenhall,
    Andrew Lawrence, Andy Adamson, Peter Shillan, John Taylor, Jim Emerson,
    Martin Hill, Mike Irwin, Peredur Williams
Subject: WFAU WSA weekly meeting minutes, 23rd April 2004

Minutes of WFCAM Science Archive meeting: 23rd April 2004
------------------------------------------------------------
------------------------------------------------------------

Present:   NCH, IAB, JDT, PMW, MAR, JMS, ETWS, RGM
Apologies: MCH, JPE, HMG, AL, GPS

DONM: 10am, Friday 30th April 2004, plate library.

Actions discharged:
-------------------

None this week.

Actions partly discharged but continuing:
-----------------------------------------

ACTION: ALL to review risk register and think of any new internal or
external risks that should be documented in the risk register.

Deferred until top level plan is explicitly defined. PMW tabled the VISTA
risk register as an example.

ACTION: JPE to design and set up centralised VDFS web pages.

Progressing. (There is now a link to it from our WSA Twiki.)

 - continuing; thanks for progressing these.

Actions carried forward from 16/04/04 meeting:
-----------------------------------------------

None.

Specific points and new actions:
--------------------------------

Project management:

JMS has analysed the WFAU Q2 plan of work for "earned value" stats and
these are now with PMW who is digesting them. These may be used to report
at the monthly VDMT meetings, depending on how useful everybody feels they
are.

NCH noted that the inaugural VDFS User Committee (VDUC) meeting is to be
held in Cambridge at the IoA on 7th May.
NCH & PMW will be attending; NCH and MJI have been asked to give the usual
pipeline and archive descriptions/updates (the next VDMT telecon is
scheduled for earlier that week: 4th May). The VDUC consists of Tim Naylor
(chair), Phil James, Phil Lucas, Steve Warren, Steve Maddox, Alastair Edge,
Will Sutherland, Nic Walton.

Comments and issues arising from CASU fortnightly minutes:

No new minutes this week.

Networking:

NCH noted that he has asked JNTD for costings for UPS protection of the
SRIF network switch in C1; no response as yet.

Hardware:

NCH received (immediately after the meeting) an update from Eclipse (he has
now asked them to supply such updates before Friday, so that they can be
reported at these meetings). The situation is that no manufacturers are
currently shipping 4-channel U320, PCI-X RAID cards. LSI Logic have found a
problem with their cards, and are investigating; there is as yet no idea
when shipping may resume. In the meantime, Eclipse have obtained a
2-channel card and are benchmarking it against existing "memspeed"
measurements taken from the straight SW RAID U320 systems. NCH will liaise
with Eclipse over this.

A possible way forward (if PCI-X/U320 is deemed too "bleeding edge") is to
step back from the edge and go for 4-channel U160/PCI cards (ie. 64-bit,
66 MHz * 75% sustained data transfer = 500 Mbyte/s bandwidth?), but these
of course have no PCI-X/U320 upgrade path. Eclipse are looking into all
options, including clocking back faster cards (if/when available) as well
as lower-spec, more stable technology.

Software:

IAB reported: "The pairing routines have been implemented as a python
module and checked into the CVS under curation/wsatools/Pairing. They may
be loaded using "import pairing". To pair a 335403 record list with a
246708 record list the performance specs were: 1 sec to read the lists and
1.1 sec for pairing and handshaking."
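
As an illustration of what "pairing and handshaking" between two source
lists can involve, here is a minimal sketch. It is not the code in
curation/wsatools/Pairing, whose interface is not described in these
minutes. All function names below are hypothetical, and "handshaking" is
assumed to mean keeping only mutual nearest neighbours within a pairing
radius. The brute-force search is for clarity only and would not approach
the timings quoted above on lists of that size.

```python
import math


def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    # Guard against rounding pushing sqrt(h) fractionally above 1
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def nearest(src, others, radius_deg):
    """Index of the entry of `others` closest to `src`, or None if no
    entry lies within radius_deg. Positions are (ra, dec) in degrees."""
    best, best_sep = None, radius_deg
    for j, (ra, dec) in enumerate(others):
        sep = separation_deg(src[0], src[1], ra, dec)
        if sep <= best_sep:
            best, best_sep = j, sep
    return best


def handshake_pairs(list_a, list_b, radius_deg):
    """Return (i, j) index pairs where a[i] and b[j] are each other's
    nearest neighbour within radius_deg (assumed meaning of handshake)."""
    a_to_b = [nearest(s, list_b, radius_deg) for s in list_a]
    b_to_a = [nearest(s, list_a, radius_deg) for s in list_b]
    return [(i, j) for i, j in enumerate(a_to_b)
            if j is not None and b_to_a[j] == i]
```

In this sketch, two sources in the first list that both point at the same
source in the second list yield a single pair: only the one that the second
list's source points back to is kept.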
NCH noted that the high-level scripting to wrap up the above, along with a
small number of core utilities and Python SLALIB-style functions, is being
developed to implement CU7 (source merging). This work is progressing well.

NCH & RGM reported extremely useful meetings with 3 representatives from
the Canberra CSIRO Computing Group who have developed a very efficient
source association algorithm that may be useful for much faster
implementation of CU16 (catalogue cross-neighbour joins on tables of
billions of rows) and also potentially on-the-fly catalogue cross matching
in the science archive user interface. The Wizards of Oz have indicated
that they are quite happy for us to have and use their code.
Interestingly, they have tested two implementations: one on flat files in
a Linux dual Xeon system, the other on the same hardware but using Oracle
- access directly using C++ methods distributed with the DBMS - as a
backend data store. The team is taking a copy of the raw native binary SSA
source files, and will be using them in benchmarking tests which will
provide useful data to the WSA team.

NCH had a rant concerning the state of the WSA software, and the level of
discipline evident in the CVS repository code. A number of issues were
discussed:

1) Use of CVS as a central store and development tool: there are still
   chunks of code completely absent from the repository. ALL science
   archive code MUST be checked into the repository.

   ACTION: MAR to check in the SSA web interface Java servlet stuff and
   all associated code in an appropriate structure underneath WFAU/SSA/src

2) ReadMes for configuration/usage/compilation etc. Please follow ETWS's
   example, and put verbose ReadMe files where appropriate to aid in the
   deployment of repository software (eg. building/compilation/installation
   etc). JMS also pointed out that unit testing scripts can be very helpful
   in testing code after alterations have been made ... eg.
   see CU16 test script (but note that this CU needs updating in the light
   of changes elsewhere!)

3) Compartmentalisation problems: NCH noted that there are a number of
   areas where compartmentalisation needs reviewing. Curation client-side /
   DBMS server-side stuff, eg. system-wide constants and generally
   persistent objects / set-ups etc., is repeated and isolated in a number
   of places (eg. curation object classes defined on the curation side, and
   not linked in any way to the same information stored within the DBMS).
   Persistent objects are most naturally stored in the DBMS, so it makes
   sense to tie in curation-client object class definition with data stored
   in the DBMS... but some flexibility in this, where appropriate, is OK.

   ACTION: NCH & IAB to sort out the current compartmentalisation problems
   in the WSA software.

4) Please conform to the verbose header norm for all scripts/code in the
   repository. This helps enormously in tracking changes, and identifying
   bugs etc. when both developing and deploying code. Whenever a piece of
   code is being edited, if it doesn't have the rich header at the top,
   PUT ONE IN.

   ACTION: ALL DEVELOPERS to please conform to the existing WSA norm of a
   rich source code/script header that includes expanded CVS tags and a
   verbose revision history of that particular file.

5) Documentation: ETWS noted that there is a Python tool (similar to
   Doxygen) that can be utilised for generating a documented summary of
   modules/functions across the entire system etc.

   ACTION: ETWS & IAB to examine the utility of the Python code summary
   utility and to inform the team (via a TWiki note) as to how to format
   code description/comments appropriately.

   ACTION: MAR to look at Java code documenting/summarising facilities and
   to implement usage of the same in science archive Java application code
   within the CVS.
NCH noted that although adhering to the above may make a bit of extra work
in the short term, in the longer term it will pay off in producing easily
maintained code. NCH reminded everyone that he had sent around an email
detailing some of these issues before Easter, and emphasised very strongly
that we must now step up a gear in terms of coding discipline.

SSA:

NCH & MAR have made a small number of final tweaks to the SSA web pages;
NCH has sent around a "splash" notice for everyone's perusal. In the
absence of any further comments on either, the full SSA will be switched on
this afternoon, and the splash notice sent to UKIDSS and Starlink News.

Miscellaneous:

Nothing else this week.