From nch@roe.ac.uk Sun Jul 22 00:05:01 2007
Date: Fri, 20 Jul 2007 14:53:42 +0100 (BST)
From: Nigel Hambly
To: WFCAM Science Archive Team -- Eckhard Sutorius, Johann Bryant, Mike Read, Nigel Hambly, Nicholas Cross, Bob Mann, Ross Collins
Cc: CCs for WSA weekly meeting minutes distribution -- Andrew Lawrence, Andy Adamson, Brian Walshe, John Taylor, Jim Emerson, Malcolm Stewart, Lorenzo Rimoldini, Mike Irwin, Mark Holliman, Peredur Williams, Stephen Warren
Subject: VDFS Science Archive weekly project meeting minutes, 20th July 2007

Minutes of WFAU VDFS Science Archive meeting: 20th July 2007
-------------------------------------------------------------------------
-------------------------------------------------------------------------

Present: NCH, RSC, PMW, ETWS, MSH, LGR, RGM, MAR, AL
Apologies: JDT, BCW, JPE, JMS, NJC, JB

DONM: 10am, Friday 27th July 2007 in the Plate Library

Actions discharged this week:
-----------------------------

ACTION: MSH to put a TWiki note up concerning benchmarking the disk subsystems on the new catalogue server hatshepsut.
Discharged: http://apache.roe.ac.uk/twiki/bin/view/WFAU/SasTesting

ACTION: NCH to discuss contents and timing of DR3 with UKIDSS CSS SJW asap.
Discharged; conflab scheduled for Monday next week.

Actions partly discharged but continuing:
-----------------------------------------

The following from last time partly done but continue:

o) Add in a default row for every detector appearing in every detection table (for schema consistency when querying merged sources and individual detections) - ACTION: RSC & NCH
   Continues; will get sorted (honestly!) as part of the general ingest DB schema revamp to facilitate rapid generalised photometric/astrometric recalibration.

Actions carried forward from 13/07/07 meeting:
----------------------------------------------

The following from last time continue:

ACTION: NCH to discuss and review AstroWISE interfacing with MAR and JDT.
ACTION: NCH to start some hard-nosed negotiations with the UKIDSS PI and CSS concerning the contents and timing of DR3.
- CONTINUES; see Survey Release below

ACTION: AL to check on the exact wording of the survey release policy with reference to world release of proprietary survey data.
- CONTINUES

Specific points and new actions:
--------------------------------

Project management:
NCH noted that this afternoon's VDMT is postponed until next Friday at 3pm.

WFCAM & VISTA updates:
Nothing new this week.

Comments and issues arising from CASU fortnightly minutes:
No new minutes as of 20/07/07 am.

Networking:
JB reported: "CU1 transfers continue steadily, there having been a bunch of data released by CASU (the rest of 07A); we have now transferred about half of this data with the rest transferring as we speak. Have been in touch with IT Support and Peter Bunclark about new ideas to test the UKLight connection."

WSA Operations:
JB reported: "The phasing in of SVN has almost been completed. CU1 (approx. two weeks of 07A to finish), CU2 (one week behind CU1) and CU3 (one week behind CU1) have all been running this week with very few problems (a couple of files have been identified as potentially broken, investigations continue). Backups went as normal and more disk tidying has been happening prior to larger data moves to the new NAS systems."

NCH noted that the GPS source table is offline this weekend for resetting of its seaming flags; reseaming takes place from next Wednesday.

Hardware:
JB reported: "Amenhotep needed a reboot due to running out of resources, same old problem. Tests on Hatshepsut by Mark continue and SQL 2005 has now also been installed (and some ingest test files have been identified) for testing SQL performance."
NCH and MAR noted that volume H: on public server amenhotep is giving rather a lot of disk errors; MSH has swapped in a replacement disk for the worst offender, but it may be that the RAID card is on the blink since several disks on the same controller are reporting errors.

NCH also noted that the aircon in C1 is frosting up again (and it's not particularly hot outside...)

Software:
RSC reported SW infrastructure mods in the light of recent architectural discussions with the development team.

ETWS reported: "Started testing the possible use of Pyro (Python remote objects) as a tool to simplify the parallelised CU 1 to 4 runs."

MAR reported: "Worked on WSA "pretty picture" gallery using Python scripts to auto-generate thumbnails and page layout etc. from a CVS checkout. Looked into providing getImage functionality as a remote server (HTTP GETs) so that it can be called from Gaia etc. This was prompted by a request from JAC to have something available that they can call from the OT, but this has always been on the todo list. Looks like it can be implemented (have working demo) but security might be an issue and we need to iterate on requirements. We're getting reasonably regular, mainly nonSurvey, requests for "wholesale" download of flat files. Investigated a couple of ways in which this could be implemented within the archive listing access, e.g. an option to save results as a wget shell script."

A discussion ensued concerning ease of flat-file catalogue downloads, with regard to making it too easy for survey users to snarl up the web server and network by downloading data in bulk first and thinking later. The consensus was that everything should be done to help non-survey users get at their data easily, but that a bit more thought should probably go into any interface mods for easier flat-file access to bulk survey data.
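
The wget-script option MAR mentions could look roughly like the sketch below. It is illustrative only and is not the WSA implementation: the function name make_wget_script, the output directory name and the example.org URLs are all hypothetical. The --limit-rate cap is one assumed way of easing the concern about bulk downloads snarling up the web server and network.

```python
import shlex


def make_wget_script(urls, out_dir="wsa_download", rate="500k"):
    """Return the text of a POSIX shell script that fetches each URL with wget.

    All names and defaults here are hypothetical; only the wget options
    (--continue, --limit-rate, --directory-prefix) are real.
    """
    lines = [
        "#!/bin/sh",
        "# Auto-generated download script (illustrative sketch only).",
        f"mkdir -p {shlex.quote(out_dir)}",
    ]
    for url in urls:
        # --continue resumes partially fetched files;
        # --limit-rate caps the bandwidth used by each transfer.
        lines.append(
            "wget --continue"
            f" --limit-rate={shlex.quote(rate)}"
            f" --directory-prefix={shlex.quote(out_dir)}"
            f" {shlex.quote(url)}"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file URLs standing in for an archive listing result.
    print(make_wget_script([
        "http://example.org/files/frame_0001.fit",
        "http://example.org/files/frame_0002.fit",
    ]))
```

Quoting each URL with shlex.quote keeps query strings containing characters such as & or ? from being interpreted by the shell when the user runs the saved script.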
NCH noted discussions this week with the GPS survey head, who has an MSc student beginning to work on a cluster-finding algorithm in the GPS source catalogue. This seems like an ideal analysis application for LGR to get his teeth into, so NCH has copied him into the email correspondence.

Survey Data Release:
NCH noted that following a conflab with SJW on Monday next week, we should be in a better position to sketch out a plan of initial work towards DR3 (e.g. QC1 and prioritising survey database preparation curation tasks).

Non-survey Data Release:
Nothing to report this week.

Astrogrid deployment:
ETWS noted that he and MAR have written an SQL schema for 6df for parsing and creation of the metadata doc for 6df access through DSA.

MSH and ETWS raised the issue of hosting the SDSS DR6, from the point of view of UKIDSS cross-matches in subsequent data releases, as well as a general AG DSA resource. NCH noted that the extra "legacy" survey coverage in DR6 over DR5 was a small (~400 sq deg) area, so from that point of view the gain over DR5 (already deployed) is only incremental. However, the availability of the deeper SEGUE survey over an equatorial strip of the NGC would be very useful, and obviously it is in the interest of all if we can host SDSS-DR6. ETWS has emailed Ani Thakar at JHU to find out more info on the likely availability of catalogue database products for DR6 (e.g. availability of BestDR6/SegueDR6 databases split over many DB files etc.)

RGM additionally asked about the possibility of GALEX catalogue database integration into the WSA, noting the following:

"GALEX does both imaging and spectroscopy. The imaging is in two bands, FUV (~1500 A) and NUV (~2250 A), and is organised into a wedding cake of surveys, plus targeted observations from AOs.
The surveys of interest to us are the All-sky Imaging Survey (AIS), which aims to cover ~80-90% of the sky, avoiding the Galactic plane, to m_AB=20.5, and the Medium Imaging Survey (MIS), which covers 10000 sq deg of SDSS to m_AB=23 (think these are the NUV limits, but not sure). The data are made public through MAST at STScI. There have been three data releases so far: GR1, GR2 and GR3. GR2 seems to be a re-run of the GR1 areas, but GR3 supplements GR2 in area. The AIS coverage in GR2/GR3 is about 13,500 sq deg. The MAST website says that GR2 and GR3 are each about 1.5TB, but I think that is images+catalogues+spectra, so I assume that the catalogues are only a few hundred GB at the moment, rising to ~1 TB, if we assume that 1/3 of the data is in so far. The GALEX database is a clone of SDSS SkyServer, which would simply ingest here, but there's nothing on any of the websites I've found about making the data available in bulk, so I guess we'd have to contact MAST."

ACTION: RGM to look through requirements docs to see if any users have mentioned GALEX.

Finally, AL noted that he could never get the "save to MySpace" button on the WSA interface to work. MAR and MSH noted that there are some issues regarding the status of the workbench when doing this. AL also noted topcat web-start launching issues. NCH suggested he put details into an email to wsa-support so that these issues can be queued up and resolved as quickly as possible.

Miscellaneous:
JB reported completing the EU Consultation on e-Science Digital Repositories that RGM sent around:
http://ec.europa.eu/yourvoice/ipm/forms/dispatch?form=eSciDR
NCH asked everybody to complete this.

ACTION: ALL to complete the EU questionnaire on e-Science digital archives.