From nch@roe.ac.uk Sat Apr 19 10:14:11 2008
Date: Fri, 18 Apr 2008 16:38:29 +0100 (BST)
From: Nigel Hambly
To: WFCAM Science Archive Team -- Eckhard Sutorius, Mike Read,
    Mark Holliman, Nigel Hambly, Nicholas Cross, Rob Blake, Ross Collins
Cc: CCs for WSA weekly meeting minutes distribution -- Andrew Lawrence,
    Andy Adamson, Brian Walshe, Jim Emerson, Malcolm Stewart,
    Lorenzo Rimoldini, Mike Irwin, Peredur Williams, Bob Mann,
    Stephen Warren
Subject: WFAU VDFS Science Archive weekly project meeting minutes, 16/04/08

Minutes of WFAU VDFS Science Archive meeting: 18th April 2008
-------------------------------------------------------------------------
-------------------------------------------------------------------------

Present:   NCH, ETWS, NJC, PMW, RSC, RPB, MAR, MSH, LGR
Apologies: JPE, JMS, AL, BCW, RGM

DONM: 10am, Friday 25th April 2008 in the Vista Hut

Actions discharged this week:
-----------------------------

o) Add in a default row for every detector appearing in every detection
   table (for schema consistency when querying merged sources and
   individual detections) - ACTION: RSC & NCH
   Discharged - hooray!

ACTION: NCH & PMW to meet on Monday 7th at 10am to finalise the Q2 plan.
Discharged

Actions partly discharged but continuing:
-----------------------------------------

ACTION: MSH to liaise with IT Support over set up of NFS file share from
hatshepsut
Continues; awaiting response from ITSG

Actions carried forward from 01/04/08 meeting:
----------------------------------------------

None this week

Specific points and new actions:
--------------------------------

Project management:

The team reviewed the Q2 plan assembled by PMW and NCH, and ratified it
as sensible. NCH noted that an invitation has been received to give a
talk at the RAS meeting on the 9th May in London; AL has kindly
volunteered to do this.

WFCAM & VISTA updates:

RSC reported back from the VMC meeting in Hertfordshire, where the
archive design for the variability datasets was particularly well
received. RSC has put a debrief up on TWiki topic

  http://apache.roe.ac.uk/twiki/bin/view/WFAU/ConferenceMeetings

NCH noted that Vista M1 is now coated and mounted; JPE communicated:

"There are 3 (very similar as they were coordinated) press releases on
M1 today - of which the ESO one has a set of new pictures (click on the
camera icon on top left of that page).

  ESO  http://www.eso.org/public/outreach/press-rel/pr-2008/pr-10-08.html
  STFC http://www.scitech.ac.uk/PMC/PRel/STFC/VISTA.aspx
  QMUL http://www.qmul.ac.uk/news/newsrelease.php?news_id=985
"

Comments and issues arising from CASU fortnightly minutes:

The team noted the minutes of the meeting of 32nd March 2008. There was
some discussion as to likely reactions to a possible approach (we
haven't had one yet) from IAC Tenerife to bulk transfer all UKIDSS
images to the Canaries. Amongst the more printable ones were: a) Why?
and b) What happens when we add new data (not to mention
reprocessing/recalibrating old)? Possibly more constructive responses
might be a) send slave+tapes and you're quite welcome, or b) send
slave+JBOD and you're still welcome (but not quite so much).

Thinking a bit harder about this, and in light of the time of year,
maybe CASU were pulling our legs a wee bit... well done, you got us!

Networking:

NCH noted further email contacts with Andrey Belikov and Konrad Kuijken
concerning interfacing between VST/Astrowise and VISTA/VSA for
crossmatching between KIDS and VIKING. We're making steady progress...

WSA Operations:

RPB reported trying to restore a copy of BestDR5 to hatshepsut:

"Turns out that the Backup Exec SQL Server backups will allow you to
redirect the restore to another server, but only if you have the same
drive layout as the server you backed up from.
If you have a different drive layout, you can redirect to another drive,
but you're only allowed to nominate one drive. If your database is
larger than the drive, you're stuffed.

The SQL Server 2005 management studio (or whatever it's called) will let
you copy a database from one server to another, but it will create the
destination database with the default collation of the server. If this
is different to the collation of the database (as it is with BestDR5),
the copy fails. It may be possible to do this by taking the database
offline, but I haven't investigated this yet.

I also tried creating an empty database on hatshepsut and using "export
data" to move the data from thutmose. Sadly this failed when it ran out
of system resources as the db is rather large (who knew?!).

I've concluded the best strategy for disaster recovery purposes is to do
a SQL Server backup onto disk (/disk21 on takelot) and then back that up
onto tape. Restoring from a SQL Server backup on disk will allow us to
remap the drive layout upon restore."

Hardware:

ETWS noted that he has copied data from khafre's disks to takelot and
started updating the databases on the public servers in advance of
letting ITSG loose on khafre again to see if they can sort out the
64-bit Debian 4.0 problem.

RPB noted:

"The RAID controller serving drives G: and H: on catalogue load server
Ahmose fell over last Friday evening. It seems that the RAID controller
failed for some unknown reason and helpfully trashed the RAID
configuration. As a result, all data was lost. Eclipse Computing
supplied a new RAID controller on Tuesday morning. After trying all
avenues to recover the data, MSH and I have started initialising a new
RAID array (currently about 85% finished). Once this is done, I can
start restoring the WSA db from backup tapes. When finished, this will
be restored up to the state it was in on 29th March."

ACTION: RPB to finish restoring Ahmose and get it back to a usable
condition, hopefully by early next week.

In a fit of paranoia, NCH wondered if this could be the result of
deliberate sabotage. In any case, it was suggested that everything
should be done to lock down and secure the systems.

ACTION: RPB to ensure all unnecessary logins are disabled on archive
servers.

MAR noted that the WSA interface will be pointed to a metadata mirror DB
on the public server in lieu of the ingest DB on the load server. RSC
noted that the procedure to automatically synchronize this is
progressing.

RSC asked about the status of general filestore backups on the Linux
side.

ACTION: RPB to convene a meeting of interested parties to discuss
locations and importance of all data held on WFAU servers, in order to
organise a consistent backup and DR strategy.

Software:

RSC reported:

"Mostly I've been updating all software for schema changes and testing
against new schema. The main challenge is a rewrite to the bulk outgest
code to support schema-driven outgest of joined table data, plus support
to copy between servers to speed up copying of WSA data from load server
to WFCAMPROPRIETY/OPENTIME / test databases on public server.

Re-wrote some older code used by CU13/4 to remove bare-faced exception
catches that will cause silent errors during testing.

Final schema modifications were made to TestWSArecal following MAR's
help in diagnosing the problems. Updated DbSession.dropColumn() so that
it can now automatically drop default constraints.

NJC and I went through the schema for the calibration tables and the new
synoptic tables and fixed all of the foreign key constraint issues, and
I applied the foreign key to PreviousMFDZP, which amazingly reported no
inconsistencies.

Further modifications to the DataFactory.Table class were made to
support the new Programme table design for new schema tables.

Updated SyncDb to select metadata by programme, enabling simple release
of WFCAMOPENTIME database. Documented WFCAMOPENTIME release procedure in
the CurationOverview TWiki page."
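
To illustrate the point in RSC's report about exception catches that
cause silent errors, here is a minimal sketch. It assumes Python, and
every name in it (parse_zero_point, load_silently, load_strictly) is
hypothetical; none of it is taken from the CU13/4 code. It only shows
the general pattern: a bare catch swallows every failure, whereas
catching just the expected error lets genuine bugs surface in testing.

```python
import logging

logger = logging.getLogger("curation_sketch")  # hypothetical logger name


def parse_zero_point(raw):
    """Hypothetical helper: convert a raw string to a float zero-point."""
    return float(raw)


def load_silently(values):
    """Anti-pattern: the bare except hides every failure."""
    results = []
    for raw in values:
        try:
            results.append(parse_zero_point(raw))
        except:  # swallows ValueError, TypeError, even KeyboardInterrupt
            pass
    return results


def load_strictly(values):
    """Catch only the expected error and log it; let the rest propagate."""
    results, rejected = [], []
    for raw in values:
        try:
            results.append(parse_zero_point(raw))
        except ValueError as err:
            logger.warning("Rejected %r: %s", raw, err)
            rejected.append(raw)
    return results, rejected
```

With the input ["1.5", "bad", None], load_silently quietly drops two of
the three values, while load_strictly records "bad" as rejected and
raises a TypeError on None, which is the kind of programming error a
test run should expose.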

ETWS reported:

- Implemented the partitioned schema design in CU4/exnumeric. This also
  includes adding a default row for every detector appearing in every
  detection table.
- Updated schemas for usage with new CU4 design since input data for
  astrometric and photometric calculations is hosted in different
  tables.
- Included the illumination correction in CU4/exnumeric.
- Included joined views in the schema creation of external DBs for
  Astrogrid.

Regarding the illumination correction, there was some discussion as to
how to test it (especially the sign!). Tests are to be done by ingesting
the same data with the corrections switched on and off; then between
independent datasets that overlap; and finally by a full statistical
test in a survey dataset, using the overlaps to check for a reduction in
the robustly estimated RMS.

NJC reported testing CU13/14 in preparation for DR4, and also noted that
there is some remaining implementation/testing to be done on the dither
offset quality bit flagging.

Survey Data Release:

Nothing new this week.

Non-survey Data Release:

RSC and MAR noted that a world-readable release (WFCAMOPENTIME) will be
created next week for semesters 05A and 05B, including PATT, UH and
NOAO-Japan WFCAM data.

Astrogrid deployment:

MSH noted that VOSpace and the ROE AstroGrid community will be deployed
by day's end today. All other WFAU related services are up and running,
though STILTS needs to be upgraded.

Miscellaneous:

Early doors: Old Bell 6pm, as a belated welcome for RPB.

=============================================================
Nigel Hambly                 Tel: +44-131-668-8234
Institute for Astronomy      Fax: +44-131-668-8416
University of Edinburgh      Email: nch@roe.ac.uk
Royal Observatory
Blackford Hill
Edinburgh EH9 3HJ

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
=============================================================