Endurance Data - inclusion in database

Let's say ENDURANCE has data (in CSVs) and image data resources.

--Is the SONDE data file the most representative of the data? I.e., is bonney-8.csv the file to serve through the database?
-----About the fields/columns of "bonney-8.csv": we need definitions and units. This is the biggest blocker to publishing the data.

I got this from Google Docs:
0 - Station tag: this field is the station identifier for the current row (e.g. F6, E5, B12).
1 - UTM X: Sonde drop location in UTM coordinates
2 - UTM Y: Sonde drop location in UTM coordinates
3 - Depth: Current depth reading
4 - Pressure: Pressure reading (used to compute Depth field)
5 - Conductivity (Ask Maciek for additional info)
6 - Temperature (degree C)
7 - CDOM: (S/m)
8 - Chl-a:(μg/l)
9 - REDOX: (Ask Maciek for additional info)
10 - PAR: (μE/(cm2·sec))
11 - PH: (pH)
12 - Turbidity: (NTU)
13 - Date: Date of the sonde reading (dd-MMM-yyyy)
14 - Time: Time of the sonde reading (hh:mm:ss)
15 - scan ID: unique reading ID
16 - Salinity: (Ask Maciek for additional info)
17 - Density:(Ask Maciek for additional info)
18 - Dtemp/dz: Temperature derivative wrt depth
19 - Dcond/dz: Conductivity derivative wrt depth
20 - DDensity/dz: Density derivative wrt depth
21 - Mission year: either 2008 or 2009
22 - Drop tag: unique sonde drop tag. Useful since multiple sonde drops exist for the same station id. This tag is unique for each drop. Generated using station id + drop date.
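
A minimal sketch of how the field list above could be used to read bonney-8.csv. It assumes the file is comma-delimited, has no header row, and uses English month abbreviations in the date field; none of that is confirmed yet. The column identifiers and the function name are made up for illustration and do not come from the dataset.

```python
import csv
from datetime import datetime

# Hypothetical identifiers, one per field in the list above (positions 0-22).
SONDE_COLUMNS = [
    "station_tag", "utm_x", "utm_y", "depth", "pressure", "conductivity",
    "temperature", "cdom", "chl_a", "redox", "par", "ph", "turbidity",
    "date", "time", "scan_id", "salinity", "density",
    "dtemp_dz", "dcond_dz", "ddensity_dz", "mission_year", "drop_tag",
]


def read_sonde_rows(lines):
    """Yield one dict per data row, adding a combined 'timestamp' value.

    `lines` is any iterable of text lines, e.g. an open file.
    Rows that do not have exactly 23 fields are skipped (an assumption).
    """
    for row in csv.reader(lines):
        if len(row) != len(SONDE_COLUMNS):
            continue
        record = dict(zip(SONDE_COLUMNS, (value.strip() for value in row)))
        # Date is dd-MMM-yyyy and time is hh:mm:ss, per fields 13 and 14.
        record["timestamp"] = datetime.strptime(
            record["date"] + " " + record["time"], "%d-%b-%Y %H:%M:%S"
        )
        yield record
```

Usage would look like `with open("bonney-8.csv", newline="") as f: rows = list(read_sonde_rows(f))`. All values other than the timestamp stay as text until the units are settled.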

I can get a lot of metadata from the www.evl.uic.edu/endurance/ site, but it will need to be reviewed by you.

I think I can use Sandro's GitHub project page for the "dttools" to generate metadata.

I did some "discovery" on the folders and annotated many questions in the Google Doc. If you think addressing all those questions is too much and you want to simplify what we offer (only SONDE and images), OK. Otherwise, PLEASE help me understand what each folder and file is by adding notes to the Google Doc. It would be best if you tell me exactly what we want to offer and provide pointers to complete the metadata.

===email again...

Update --(+2 yrs)

Endurance --- Email , CC PIs. Diane Mike Peter and Magic.
===========

Update:
This is an email exchange I had with Maciek, April 2013.

--------------------------------------------
OK, I hear you about your dissertation work - I've been there and I totally understand.

Let me ask you this -- if I prepare a minimum set of questions (capped at 3) that will let me get a minimum viable product out there, and that will require about one hour of your time (give or take), would that be an option before May? We could schedule a Google Hangout to see whether it is a hit or a miss on those two critical questions, or even do it by email.

I am unsure whether I can produce that starter minimum product with questions, but if I can, please let me know whether or not you are game.

Cheers, Inigo

On 4/4/2013 10:08 AM, Maciej Obryk wrote:
> Hi Inigo,
>
> Thanks for re-energizing this!
>
> Yes, there are a few things that I was supposed to address. I meant to do it a long time ago but, as usual, life happens (mostly my PhD, haha). Realistically, I won't be able to start working on this until the beginning of May. I have a very urgent project that I'm working on now and it is consuming my time. And in two weeks, I'm going to Europe for a couple of weeks for my grandma's 90th birthday.
>
> I will mark in my calendar to start working on this at the end of the first week of May. Expect some results by mid-May.
> How does that sound?
>
> Thanks,
> -Maciek
>
> On 4/1/13 6:03 PM, Inigo San Gil wrote:
>>
>>
>> Hi Alessandro, Maciek
>>
>> Now that I have a server with space to host most of the Endurance data, I'd like to run a few questions by you.
>>
>> In the spreadsheet we worked on
>>
>> Document I've shared UNM data transfer minutes
>> Message from meuvida@gmail.com:
>>
>> Hi All,
>>
>> Here are the minutes from our productive meeting today. The minutes are focused on the ENDURANCE data transfer only and do not include LTER data discussion.
>>
>> -Maciek
>>
>>
>>
>> There is mention of file descriptions that I cannot find. I think you may have had plans to work on that; I cannot recall.
>> On the other hand, I started diving into the folders to find something I can work with, but I run into issues fast, as
>> most files may not have headers or, if there are headers, they do not fully correspond to the data the files hold.
>>
>> Can we re-energize this a bit?
>> thanks!
>> inigo

======================================================================
======================================================================

=======================================================
Endurance data was handed over; now we need to distill the data we want to showcase and make available for querying in the database.

For that, I just received further distilled data and explanations from Sandro. I need to create tables, etc.

Dataset Description
Sonde data
file: bonney-8.csv
Description
This file contains readings from the ENDURANCE science payload deployments, plus time and location information
Fields
Fields are listed in order of appearance on data rows, left to right

0 - Station tag: this field is the station identifier for the current row (e.g. F6, E5, B12).
1 - UTM X: Sonde drop location in UTM coordinates
2 - UTM Y: Sonde drop location in UTM coordinates
3 - Depth: Current depth reading
4 - Pressure: Pressure reading (used to compute Depth field)
5 - Conductivity (Ask Maciek for additional info)
6 - Temperature
7 - CDOM: (Ask Maciek for additional info)
8 - Chl-a:(Ask Maciek for additional info)
9 - REDOX: (Ask Maciek for additional info)
10 - PAR: (Ask Maciek for additional info)
11 - PH: (Ask Maciek for additional info)
12 - Turbidity: (Ask Maciek for additional info)
13 - Date: Date of the sonde reading (dd-MMM-yyyy)
14 - Time: Time of the sonde reading (hh:mm:ss)
15 - scan ID: unique reading ID
16 - Salinity: (Ask Maciek for additional info)
17 - Density:(Ask Maciek for additional info)
18 - Dtemp/dz: Temperature derivative wrt depth
19 - Dcond/dz: Conductivity derivative wrt depth
20 - DDensity/dz: Density derivative wrt depth
21 - Mission year: either 2008 or 2009
22 - Drop tag: unique sonde drop tag. Useful since multiple sonde drops exist for the same station id. This tag is unique for each drop. Generated using station id + drop date.
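
One possible table layout for these 23 fields, sketched with Python's built-in sqlite3 module so it can be tried without a server. The table name, column names, and column types are all assumptions: the production database may be a different engine, and the types should be revisited once the definitions and units still marked "Ask Maciek" are known. No uniqueness constraint is placed on the scan ID, since it is not yet clear whether it is unique across both mission years.

```python
import sqlite3

# Hypothetical table and column names; types are guesses, not confirmed.
CREATE_SONDE_TABLE = """
CREATE TABLE IF NOT EXISTS sonde_reading (
    station_tag  TEXT NOT NULL,
    utm_x        REAL,
    utm_y        REAL,
    depth        REAL,
    pressure     REAL,
    conductivity REAL,
    temperature  REAL,
    cdom         REAL,
    chl_a        REAL,
    redox        REAL,
    par          REAL,
    ph           REAL,
    turbidity    REAL,
    reading_date TEXT,
    reading_time TEXT,
    scan_id      INTEGER,
    salinity     REAL,
    density      REAL,
    dtemp_dz     REAL,
    dcond_dz     REAL,
    ddensity_dz  REAL,
    mission_year INTEGER,
    drop_tag     TEXT
)
"""


def create_sonde_table(connection):
    """Create the table (if missing) plus an index for station lookups."""
    connection.execute(CREATE_SONDE_TABLE)
    connection.execute(
        "CREATE INDEX IF NOT EXISTS idx_sonde_station "
        "ON sonde_reading (station_tag, mission_year)"
    )
    connection.commit()


# In-memory database, for experimentation only.
connection = sqlite3.connect(":memory:")
create_sonde_table(connection)
```

The columns are kept in the same order as the fields in the file, so a row read from bonney-8.csv can be inserted positionally.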

Image Data
file: StationImages.txt
Description
This file contains reference images for sonde drops (generated using 2009 data only)
The table is built as follows:

From the ENDURANCE telemetry we identify sonde drop times and positions using information stored in the AUV log messages (“Start profiling”)
The AUV positions are converted to UTM coordinates using melthole offset and navigation correction transforms provided for the 2009 mission.
The station ID is found by cross-referencing the AUV position with a station UTM coordinate table.
The images are found by cross-referencing the timestamp in the image file names with the ENDURANCE telemetry timestamps for each sonde drop.
For upward-looking camera data, we only collect images within a 10 second interval from the start profiling message.
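
The 10 second rule for upward-looking images could be sketched as below. This is only an illustration: it assumes the image timestamps have already been parsed out of the file names (that format is not described here), and it assumes the interval applies on both sides of the "Start profiling" time, which may not match what was actually done. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def images_near_drop(drop_time, images, window_seconds=10):
    """Return names of images taken within `window_seconds` of `drop_time`.

    `images` is an iterable of (file_name, timestamp) pairs.
    Assumption: the window is applied both before and after the drop time.
    """
    window = timedelta(seconds=window_seconds)
    return [name for name, taken in images if abs(taken - drop_time) <= window]
```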

Fields
Fields are listed in order of appearance on data rows, left to right

0 - Station tag: unique station identifier
1 - UTM X: UTM coordinate of station
2 - UTM Y: UTM coordinate of station
3 - Image type: either upward or sonde depending on the type of image listed in this row.
4 - Image path: path to an image file in the endurance 2009 dataset (all paths are relative to the dataset root)

NOTE: for a single (station, utmx, utmy), multiple rows exist, each one listing a single image.
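
Since each row lists a single image, a reader of StationImages.txt will usually want to regroup the rows per station. A minimal sketch follows, assuming the file is comma-delimited with exactly the five fields above and no header row (the delimiter is not stated, so it is a parameter). The function name is made up for illustration.

```python
import csv
from collections import defaultdict


def group_station_images(lines, delimiter=","):
    """Map (station, utm_x, utm_y) to {"upward": [...], "sonde": [...]}.

    `lines` is any iterable of text lines, e.g. an open file.
    Coordinates are kept as text so nothing is lost in conversion.
    """
    grouped = defaultdict(lambda: {"upward": [], "sonde": []})
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) != 5:
            continue  # skip blank or malformed rows (assumption)
        station, utm_x, utm_y, image_type, path = (v.strip() for v in row)
        # setdefault keeps any unexpected image type instead of dropping it.
        grouped[(station, utm_x, utm_y)].setdefault(image_type, []).append(path)
    return dict(grouped)
```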

UNM minutes (ENDURANCE data transfer) - 6/6/2012

Add ‘ENDURANCE’ link to the ‘Data’ tab on the MCM-LTER website. General format:
Data -> Endurance -> 1) query system for sonde data, 2) bathymetry, 3) Looking Glass visualization software, 4) upward and downward looking images

Details:

1) Query should resemble the ‘Custom Query’ used for the other LTER data, both (a) for consistency of the website and (b) because there is no need to reinvent the wheel if we can use already existing tools. Perhaps we can call this subheading ‘Sonde data’. The query site will contain drop-down menus for:

Date Range (All, 2008/2009 season, 2009/2010 season)
Select Data Type (All, Conductivity, Temperature, and all the other variables based on the heading of the sonde data .csv file)
Select Station (All, F6, F7…G2, G3…, etc)
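
The three drop-down menus could map onto a parameterized query along these lines. This is a sketch under assumptions: the table and column names are hypothetical, the list of selectable variables is taken from the sonde field list, and each season is assumed to correspond to one value of the "Mission year" field (2008 or 2009). Passing None stands for the "All" choice.

```python
# Hypothetical column names; they would need to match the real schema.
VARIABLE_COLUMNS = {
    "conductivity", "temperature", "cdom", "chl_a", "redox", "par", "ph",
    "turbidity", "salinity", "density",
}


def build_sonde_query(mission_year=None, variable=None, station=None):
    """Return (sql, params) for the selected filters; None means "All"."""
    if variable is None:
        columns = "*"
    elif variable in VARIABLE_COLUMNS:
        # Only whitelisted names are ever interpolated into the SQL text.
        columns = f"station_tag, depth, reading_date, reading_time, {variable}"
    else:
        raise ValueError(f"unknown variable: {variable!r}")

    conditions, params = [], []
    if mission_year is not None:
        conditions.append("mission_year = ?")
        params.append(mission_year)
    if station is not None:
        conditions.append("station_tag = ?")
        params.append(station)

    sql = f"SELECT {columns} FROM sonde_reading"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params
```

User-supplied values travel as bound parameters rather than being pasted into the SQL string, which matters once the form is public.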

I think this site should also contain an image of West Bonney with a mission grid so users would know the location of each station. I might have that image; if not, I’m pretty sure that Peter has it.

2) Bathymetry will be a downloadable XYZ .csv file and perhaps a hi-res image with 2-meter contour interval

3) Looking Glass visualization software – downloadable application for PC only

4) Images – use images from ice-picking locations only. All images will be georeferenced and tagged on a Google Map. Clicking on a station will bring up thumbnails of all images from that station (one image for the upward looking camera and all images from the sonde (downward looking camera)). Click on a thumbnail to download the image.

Additional tasks:

Alessandro – generate metadata for the Looking Glass and rename downward looking images with UTM coordinates for incorporation into the map. A lot of information about Looking Glass is already here: http://www.evl.uic.edu/endurance/lglass.html
Maciek – generate metadata for all ENDURANCE data (in EML format)
Maybe we should also incorporate a link to the ENDURANCE website for more info if users are interested (http://www.evl.uic.edu/endurance/endurance.html)
Inigo - can you provide us with a rough time estimate for incorporation of the data into the website?

About the data -- It would be good to know which files I need to work with, out of all the files in the folders below. That is, what we need to expose.

While I am transferring data, I am looking at the contents. Here is what I see; feel free to complete or correct these observations.

We have two folders, corresponding to the 2008 and 2009 seasons.

Each folder-season has a lot of subfolders. Within the root directory, there are interesting files, such as logs and others. It would be good to know what those are.

For example, for 2009, I see these under the root folder
-----------------------------------------------------------------------

bonney_2009_day_log.txt --- a plain text log of dated events and circumstances.

Bonney2009_Mission_Stats.xls --- have not looked at it yet.

profiling_summary.xls --- have not opened it yet.

bonneyNAV.csv - Station,Northing (m),Easting (m) ,Elevation(m),Latitude,Longitude,Elevation

Notes on this last file: the elevation column seems to be repeated. We will need an explanation of the “station codes”. Lat and Long are reported in degrees, minutes and seconds.
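
If the latitude and longitude need to go into the database or onto a map as decimal degrees, the conversion is simple arithmetic. A minimal sketch, assuming the degrees, minutes, seconds and hemisphere letter have already been split out of the text (the exact text layout in bonneyNAV.csv still needs checking). The function name is hypothetical.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees.

    South and West are returned as negative values.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unexpected hemisphere: {hemisphere!r}")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value
```

For example, 77 degrees 30 minutes 0 seconds South becomes -77.5 (illustrative numbers, not taken from the file).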

Under this main 2009 folder, there are Dive0* folders (Dive03, Dive04, etc). These may contain files and folders like these (from Dive19)

Files:
------
Dive19_24Nov09.tm.log.0 -- a log
Dive19_day_log.txt -- a log
Dive19_showSensors_stop.png -- some image ?
Dive19_24Nov09.tm.log.1 -- a log
Dive19_showSensors_start.png -- some image ?

Folders
--------
deltat -- contains file(s) that are binary
images -- the images taken by the forward, downward (sonde) and upward cameras.
plans -- .pln files, seem like scripts to be executed by endurance for collection
seabird -- SeaBird file(s), seem like temp and conductivity in HEX format. Unsure about what is
derived_data -- ?
media -- photos, unsure what these are about. Likely will not be used?
processes - lots of subdirectories for configurations, tests, logs and the like.
run - seems to be a bunch of Python scripts and shell scripts, perhaps to run the endeavour.

extracted_data -- These may be the files we need to post. The headers list the variable labels. To add metadata, we need to define them. Here they are:
batteryId - a numeric ID for the battery
error - code for an error. (put here list of possible codes and definitions)
warn - a numeric code for warning (put here list of possible codes and definitions)
voltage - the voltage recorded (units?) represents what?
current - the current (units?). represents what?
SoC - not sure. Conductivity? (In a battery context, SoC usually stands for state of charge.)
minTemp - minimum Temperature (units?)
minTempSensor - minimum Temp, how does it differ from previous?
maxTemp - max temp (units?)
maxTempSensor - difference with previous?
minCellVoltage - I suppose the min temp is derived from this?
minCell - ?
maxCellVoltage - ?
maxCell - ?
amphour - ?
timestampv_B_B -- Unix timestamp?
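
One quick way to test the Unix timestamp guess is to convert a few values and see whether they land inside the mission dates (Dive19, for instance, is dated 24Nov09 in its file names). The sketch below assumes the values are seconds since 1970-01-01 UTC; if they turn out to be milliseconds or local time, the check will fail, which is itself informative. The function name and the date range passed in are the caller's choices, not something taken from the data.

```python
from datetime import datetime, timezone


def unix_seconds_in_range(value, start, end):
    """Interpret `value` as Unix seconds (UTC).

    Returns (moment, ok), where ok is True if start <= moment <= end.
    `start` and `end` must be timezone-aware datetimes.
    """
    moment = datetime.fromtimestamp(float(value), tz=timezone.utc)
    return moment, start <= moment <= end
```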

Status: 

Priority: 

Normal