Diatoms Data

====
it'd be good to have some EML about that, but seriously...

===update May 10, 2014===============

The site should remain as is, but somehow, this data needs to be represented at the
mcmlter site and linked to.

Ideas are still below, but we probably need to move to some practical resolution.

==================================
Preserve the unity of the site - the relational nature of the data.

Think of the Microbial Biomass. Now, each record spawns a COUNTS for species. That's a template like

http://huey.colorado.edu/diatoms/samples/taxa_count_data.php?acc_NUM=900...

That count data somehow needs to be referenced from there, and it needs to be downloadable. HOW?

Scraping it dynamically could be a way to go?

Once we have a model, let's do metadata.

2014.06.09
Looking at different ways to access the data. Most likely going to use a CSV format for presence/absence analysis.
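
A rough sketch of what the presence/absence step could look like, assuming one row per sample and one column of counts per taxon. The column names (sample, taxon_a, taxon_b) and the helper name to_presence_absence are made up for illustration; the real layout of the count data is not settled yet.

```python
import csv
import io


def to_presence_absence(rows, id_column):
    """Return copies of the count rows with every non-ID value mapped to 0 or 1."""
    result = []
    for row in rows:
        converted = {}
        for key, value in row.items():
            if key == id_column:
                converted[key] = value
            else:
                # Assumption: blank or non-numeric cells count as absent.
                try:
                    converted[key] = 1 if float(value) > 0 else 0
                except (TypeError, ValueError):
                    converted[key] = 0
        result.append(converted)
    return result


# Hypothetical input: two samples, two taxa.
sample = io.StringIO("sample,taxon_a,taxon_b\nS1,12,0\nS2,,3\n")
print(to_presence_absence(csv.DictReader(sample), "sample"))
```

Treating blanks as absent is only one possible choice; if a blank means "not counted" it may be better to keep it blank.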

2014.06.10
Wrote a Python script to concatenate all the CSVs into one CSV.
Started to scrape diatom data from the webpage. SLOW GOING!
No way to directly download the data -> copy and paste into a spreadsheet -> transpose the data -> save as CSV -> repeat, many times over.
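
For reference, a minimal sketch of the kind of concatenation script mentioned in this entry. This is not the actual script; it assumes every CSV sits in one directory and shares the same header row, and the function name concat_csvs is hypothetical.

```python
import csv
import glob
import os


def concat_csvs(input_dir, output_path):
    """Append the data rows of every *.csv in input_dir to one output file.

    Assumes all files share the same header row; it is written only once.
    Returns the number of data rows written.
    """
    paths = sorted(glob.glob(os.path.join(input_dir, "*.csv")))
    header_written = False
    rows_written = 0
    with open(output_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in paths:
            # Skip the output file itself if it lives in the same directory.
            if os.path.abspath(path) == os.path.abspath(output_path):
                continue
            with open(path, newline="") as src:
                reader = csv.reader(src)
                header = next(reader, None)
                if header is None:
                    continue  # empty file, nothing to copy
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Files are processed in sorted filename order, so the combined CSV comes out the same way on every run.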

2014.06.16
Wrote Python code to scrape diatom data from the webpage. Currently trying to sort out removing HTML tags and cleaning the data.
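
One possible way to handle the tag removal, sketched with only the standard library. It assumes the count data sits in a plain HTML table (tr/td/th, no nested tables), which has not been checked against the real page; the names TableTextParser and table_rows are hypothetical, and this is not the scraper referred to above.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, row by row, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()


def table_rows(html):
    """Return the table cells in html as a list of rows of cleaned strings."""
    parser = TableTextParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

The rows returned by table_rows can be passed straight to csv.writer, so inline markup such as italics around a taxon name never reaches the CSV.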

Status: 

Priority: Normal