From nch@roe.ac.uk Fri May 11 16:54:36 2012
Date: Fri, 11 May 2012 13:53:24 +0100
From: Nigel Hambly
To: Eckhard Sutorius, Nicholas Cross, Rob Blake, Ross Collins, Mike Read, Clive Davenhall, Mark Holliman
Cc: Stelios Voutsinas, Lorenzo Rimoldini, Jim Emerson, Keith Noddle, Mike Irwin, Norman Gray, Tom Shanks, Andy Lawrence at ROE, Tom Kerr, Bob Mann, Stephen Warren
Subject: WFAU VDFS Science Archives project team meeting minutes, 11 May 2012

Present: MAR, MSH, RGM, ETWS, RSC, NJC, NCH
Apologies: JPE, AL, KTN, STV, RPB, NG
DoNM: 10am Friday May 25th 2012 in the VH (MAR in the chair)

Actions discharged this time:
-----------------------------

ACTION: RPB to roll out latest curation client / DB server share configuration for routine ops use. Discharged

Actions partly discharged but continuing:
-----------------------------------------

None this week

Actions carried forward from 27/04/12 meeting
---------------------------------------------

ACTION: RPB to set up WSA mirror/sync for flat file DBs using MS SQL replication. Continues.

ACTION: ETWS, with RPB, to develop an automated helper script to keep release DB files links updated. Continues; this one still has low priority.

ACTION: RPB to complete the LTO-5 tape backups of VVV and SSA DBs and SSS flat file datasets. Continues; RPB has installed the new interface card...

Specific points and new actions:
--------------------------------

Project management:

Nothing of note this time.

WFCAM, VISTA and VST updates:

Nothing to report this time

Comments and issues arising from CASU minutes:

The team noted the minutes of the meeting of 30th April. Regarding the network transfer speeds, see below. ETWS noted the VST reprocessing badged with the same v0.9 label will require a bit of care at this end in retransfer/reingest.
MAR noted the catalogue photometric calibration issues for VST data and that we assume the catalogue filenames will indicate the catalogue type so that only the standard corrected cats will be ingested at the WFAU end. Regarding the back-propagated av conf column in all of WFCAM data, WFAU will need to copy this up at some point. Finally, it was noted that the missing file issues are under investigation.

Networking:

ETWS noted that it is not known whether the network transfer speed has improved as no new bulk transfers have been initiated recently; however the copy of the recent WISE all-sky data release from IPAC may give us some additional clues.

ACTION: ETWS to copy over the WISE all-sky DR flat files from IPAC

WSA/VSA/OSA/GES Operations:

ETWS reported:

-- Finished updating the ingest code for the new file shares.
-- Updated the browser parser code to parse the latest version of the GES database.
-- Started making code modifications to ingest the VST ATLAS catalogues.
-- Ingested the latest WFCAM catalogue data from Jan and Feb 2012.
-- Started transferring the WISE all-sky release. The size is 225 GB of bzip2-ed Catalog parts, which will be unzipped to 805 GB of data. It will take ~5.5 days to copy the data.

NCH noted that some progress has been made with the Gaia-ESO Survey archive schemas and that a new ingest application needs to be designed to cope with the richly structured MEF format of the file products, albeit using the existing schema-driven infrastructure already in place for VDFS.

Hardware and Systems:

Nothing of note this time.

Survey Data Release:

NJC and NCH noted that the VVV public data release build stalled at the index creation stage earlier this week. ETWS killed the build, and NCH tidied up server side and set the stalled index build going with the same command but logged in locally on the DB server.
It completed overnight in 15h28m22s, so the best theory is that SQL Server is simply not coping correctly with multiple heavy data modifications (i.e. 10 billion row scale) in the context of a single (implicit transaction) connection. NCH recommended that CU19 be modified to drop connections and reconnect after issuing each data modification command...

ACTION: RSC to modify CU19 to drop and renew DB connections after each command issued in the release DB build.

ACTION: ETWS to restart CU19 for the VVV in append mode following RSC's scalability mod

... and then we should monitor the situation over the weekend.

MAR has been in communication with the VIKING PI over the next batch of deliveries to ESO-SAF, and will schedule a telecon for sometime next week to try to clarify at least some of the outstanding issues concerning that survey.

UKIDSS DR10 preparations are progressing, with eyeball QC promised by the end of May for LAS and DXS (NCH claims the prize for getting the GCS done first). We're awaiting confirmation from the GPS PI as to whether DR9 will be skipped and the GPS releases are to be brought back into sync with the others for DR10.

RGM asked about plans for UKIDSS final release and UHS, and it was suggested that we try to schedule some time at the upcoming Science with UKIDSS IV meeting to discuss these with the various stakeholders.

ACTION: NCH to contact AL/SJW about possible conflabs concerning UKIDSSDR and UHS curation.

Software:

RGM asked whether we should be considering some radical solutions for the VVV scalability issues that we seem to be constantly up against. Some discussions ensued, and it was agreed that one obvious approach to try is switching to a column-oriented RDBMS.
It just so happens (see last minutes) that a MonetDB developer is spending some time at WFAU over the summer (UK Border Agency permitting) and the suggestion is to try some scalability benchmarks on ramses9 with the VVV to see if general curation activities are much faster, at the same time investigating any potential snafus concerning the richness (or lack of) in the SQL API with respect to that provided by SQL Server.

Non-survey Data Release:

NJC and ETWS noted that a recent enquiry from a Korean PI of a WFCAM non-survey programme is being addressed as fast as possible, but one or two software issues are causing a few headaches. NJC is on the case, and this one will be discharged asap to keep our customers happy.

Astrogrid deployment & Data Analysis services:

MSH reported back from his visit to CDS Strasbourg, noting that Vizier have deployed a super-fast crossmatch service prototype. It was also noted that static copies of UKIDSS public release source tables are included in the service, including LAS DR5 (LAS/DXS/GCS DR8 and GPS DR6 have also been provided and copied across the channel).

Miscellaneous:

Nothing else this time.

=============================================================
Nigel Hambly                        Tel: +44-131-668-8234
Institute for Astronomy             Fax: +44-131-668-8416
School of Physics and Astronomy
University of Edinburgh             Email: nch@roe.ac.uk
Royal Observatory
Blackford Hill
Edinburgh EH9 3HJ

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
=============================================================