Model: complete koch dataset

potential questions about the summary spreadsheet, but for now, draft posted.

need input-zip file, rep. figure. some other details.

====
have emails, word docs, etc.

Hi Josh,

So very kind of you to write back with such precise info. I think I'm good
with what you give us here -- if I bump into something, I will write you again.

Here is a bit of where we are and why.

As you know, MCMLTER aims to give your data and models in general more
exposure. We have a mandate from our funding agency (NSF) to make the data
public, discoverable and standardized to some extent. It is our responsibility to
preserve the data that back studies for the long term. One of the good outcomes
is that by making your data publicly available we will likely increase the citations
of the associated publication (there is a study published by Piwowar that backs this).

We have practical constraints on how we make data and metadata public; the structure
of the metadata and data cannot deviate from other data structures published
on the mcmlter.org website. Also, we need to conform to the metadata
standards used at LTER (EML). Finally, I would like to show as much
info as you shared with Mike and me, just to make your data as re-usable
as possible.

For now, we have the bare minimum at
http://mcmlter.org/queries/modeling/model_home.jsp
(it is a work in progress, there are some dead links, etc)

For the mid-term evolution, I would like to integrate (loaded word) the work
with some good specialized model repository. A paper that just came out
http://www.sciencedirect.com/science/article/pii/S1364815213002703
shows a promising initiative. LTER has not paid much attention to this yet.
We will probably try to get your input on how any of these efforts may
help you.

Cheers, Inigo

On 1/15/14 3:27 PM, Koch, Joshua wrote:
> Hello Inigo and Mike,
>
> Thanks for the email. I don't think that I caught the entire discussion thread, but I see that you would at least like definitions and units for the column headings and maybe some text on how the melt, subsurface flow, and surface water routing models were connected? Please let me know if there are other needs.
>
> Definitions and Units (please let me know if any of this doesn't make sense...I think I'm looking at the file I sent Mike, but I'm not positive.)
> 'Time' (hrs) is the simulation time
> 'Q1', 'Q2', 'Q3', 'Q7', and 'Q8' are stream discharge in m3/hr for sites where stream data was collected (synoptic sampling locations A, B, C, D, and E). 'Q8' is also the stream gage location.
> 'As' is the amount of water lost to storage in the anabranches (in m3), defined as the difference between 'Q2' and 'Q7' multiplied by the 1 hr timestep. (negative numbers indicate recharge from the aquifer to the stream)
> 'Swradin' is the incoming solar radiation (in W/m2)
> 'Difference' is equal to 'As' (the difference between water upstream and downstream of the braided reach).
> 'Observed' is the observed discharge in m3/hr
>
> The next two are related to calculating the root mean squared error:
> 'Error' is the squared difference between the observed and simulated (Q8) gage discharge (m6/hr2)
> 'RMSE' is the root mean squared error, which is the square root of the mean 'Error' value - this should just be one number (in m3/hr).
> 'kwr,g,qdiv storage' is the cumulative stored water volume for each hour in m3
> 'storage change ' is the percent change in storage volume between each subsequent time step.
>
>
> Now that you have the paper, I imagine these other questions are answered, but briefly:
> The relationship between incoming solar radiation and Canada Stream (an example of a stream with limited storage) was used to calculate the water discharge from the glacier/snowfield (i.e., the input at the top of the modeling domain).
> SFR2 is publicly available for anyone to use; documentation can be found online at: https://store.usgs.gov/yimages/PDF/206917.pdf SFR2 routes water between the stream and aquifer based on Darcian flow through a saturated medium. The user doesn't need to do anything to link the two, except tell MODFLOW to read the SFR2 input file.
>
>
> I hope this helps, please let me know if I can be of further assistance. Mike, please say hi to F6 for me! I've been talking with Karen C. about novel ways to locate and study subsurface preferential flowpaths....interested in putting together a proposal?
>
> ~Josh
>
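
Note on the derived columns: the definitions in Josh's reply above can be checked with a short script. This is a minimal sketch, not the spreadsheet's actual formulas. The function name, the argument names, the sign convention for 'As' (Q2 minus Q7, so that negative values mean recharge to the stream), and the choice to express 'storage change' relative to the previous time step are all assumptions.

```python
import math

def derived_columns(q2, q7, q8, observed, storage, dt_hr=1.0):
    """Recompute 'As', 'Error', 'RMSE' and 'storage change' from hourly series.

    Hypothetical helper: inputs are equal-length lists in the units Josh gives
    (discharge in m3/hr, storage in m3).
    """
    # 'As' (m3): difference between Q2 and Q7 times the 1 hr timestep.
    # Assumed sign: Q2 - Q7, so negative values indicate recharge to the stream.
    as_m3 = [(up - down) * dt_hr for up, down in zip(q2, q7)]

    # 'Error' (m6/hr2): squared difference between observed and simulated (Q8).
    error = [(obs - sim) ** 2 for obs, sim in zip(observed, q8)]

    # 'RMSE' (m3/hr): square root of the mean 'Error' value, a single number.
    rmse = math.sqrt(sum(error) / len(error))

    # 'storage change' (%): change between subsequent time steps.
    # Assumed to be relative to the previous step; undefined for the first row.
    change = [None]
    for prev, cur in zip(storage, storage[1:]):
        change.append(100.0 * (cur - prev) / prev if prev else None)

    return as_m3, error, rmse, change
```

Comparing the returned values against the 'As', 'Error', 'RMSE' and 'storage change' columns of the summary spreadsheet would confirm (or correct) the assumed conventions before they go into the metadata.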

Hi Mike, Josh

Mike, thanks for the clarifications and suggestions on how to split the spreadsheet -
I totally agree that it is not the intent here to educate a total newbie -- it is always hard
to strike the right balance: we don't want to publicly offer a data set that is too obfuscated
to be reused beyond the users of MODFLOW and SFR2, but on the other hand, it is
not our job to explain all details to a person who is a complete outsider (such as a
mathematician, or social scientist). I do not know where the halfway point is, that is for sure.

In practice it helps me ask you guys for the information LTER requires. However, as
discussed in other emails, there are no specific guidelines for modeling data documentation.
Still, the model output is bona fide data, and can be described just like we do for data gathered
from a field experiment.

While we fine tune your needs and what would benefit you (and not burden you), I think
what we have now is great -- well, JOSH -- that and knowing what exactly are the column
headers of the summary file :).

A note about how MODFLOW and SFR2 connect would not hurt either.

THANKS for writing from F6 & sending the paper!!

Inigo

On 1/10/2014 2:21 PM, Michael Gooseff wrote:
> Hi Josh,
>
> I hope the new year finds you well and happy.
>
> We are going through the process of uploading the modeling studies to our MCM LTER database and developing web pages to highlight modeling studies. Inigo San Gil is our database and IM wizard on this project and he is really the one who is keeping us in line and helping us translate our science to the database and beyond. Below he has some questions about the Huey Creek study and how some of the data is represented. Would you mind providing some insight on these issues? I just now sent him the paper (from F6, by the way...). Many thanks for your time on this. I don't think it will take long, we just need to download a little more of your knowledge!
>
> Best,
> Mike
>
>
>
> ---------- Forwarded message ----------
> From: Inigo San Gil
> Date: Fri, Jan 10, 2014 at 9:35 AM
> Subject: Re: Fwd: model metadata attached
> To: mgooseff@engr.colostate.edu
>
>
>
>
> Hi Mike,
>
> How are you?
>
> I'm here perusing the documentation provided by Josh in his study of Huey's subsurface flow. I can see what motivated the study, and it is nice you can balance the water budget with this branching and return-on-floodwaters recession; it does make sense.
>
> The word doc Josh attaches answers many of the questions you sent Josh, and a lot of the info we need to make this a dataset that can be saved for the long term. There is the relative limitation of the non open-source status of the SFR2 software. However, the SFR2 documentation includes the theory behind the model (an adaptation of SFR1 to unsaturated soils) and is well balanced and informative. But it says nothing about the implementation of the model. I mean, one sees how anyone would connect SFR1 to MODFLOW, but at the end of the day, we need to document the actual parameters of the data inputs and especially the outputs so they can be reused.
>
> In summary, my real issue is about the data input and outputs -- I need more info to meet the LTER data standards.
>
> I located all input and output files referenced in the "Koch_SFR.doc" document in a folder named "input files".
>
> File sizes are not a burden for us; we can handle these data files, including the omitted head and budget data, if need be. But that is the key: it may be sufficient to document the final output files well, perhaps even only the summary file. I am unsure because I am completely unfamiliar with the process to get to the final file (where you reconcile the water budget?). If we include all, we need to document the steps to get from one file to the next. There is a more important results summary file named "ResultsSummaryCalc.xls" which produces the [representative] graph. The graph is actually embedded in it.
>
> This is my next and last issue. While it is very nice to have the source data and graphs tied dynamically in the spreadsheet, this presents a bit of a challenge for us -- we usually separate raw data from synthesis products. This way, we can put data in a relational database, and use different subsets of data and synthesis software to produce results. I am tempted to separate the graphs from the summary data; we need to work with a traditional CSV (comma-delimited file). But this requires buy-in from Josh, and some guidance. What guidance? For example, in the header below, I see many columns whose meaning I can guess, but not all, and not the details.
> Time (hrs) Q1 Q2 Q3 Q7 Q8 (gage) As
> Time (hrs) SWRADIN Q8
> Difference Observed error RMSE kwr,g,qdiv storage storage change (%)
>
>
> We need more precise details in order to enable better and easier reuse of these data. Sadly, I cannot get the paper from Wiley (http://mcmdev.lternet.edu/node/1901), where I would probably learn where Qi (i=1 to 8) is. I am assuming Q is discharge, which may be measured in cubic meters per hour (per abstract and word doc). I get that the Q8 is the discharge at the gauge :). I am unsure what the "As" column is, and SWRADIN may be the short wave radiation, but I am not sure what the units are or where it is measured (at the gauge or a nearby met station? is it simulated? and why the radiation, anyway?). What is the "Difference" column: the difference between the expected flowrate and the flowrates accounted for when you add the returned flow from branches and/or submerged water?
>
> We do not need all the info in detail, but we need enough so an educated person can actually interpret the data and reproduce the results or repurpose the data. It may suffice to explain the variables and give some pointers to the process and software. We are good with MODFLOW, but not sure about SFR2 (or its predecessor SFR1, which was deemed unusable for unsaturated soils).
>
> I also wonder how the freeze-thaw cycles and diel flows are factored in - it seems like the Antarctic streams are more complex than your hot and arid desert streams in terms of water flows, but this is just my own curiosity.
>
> Sorry for the lengthy email, feel free to forward to Anchorage (Josh). Excellent study!
>
> Inigo
>
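
Note on splitting the spreadsheet: one way to separate the summary data from the embedded graph, as discussed in the message above, is to export a plain CSV of the data columns plus a second CSV that documents each column. The sketch below is only an illustration under assumptions. The function names and the (column, unit, definition) layout are hypothetical, the column list is a subset of the spreadsheet header, and the units and definitions are copied from Josh's reply rather than from the spreadsheet itself.

```python
import csv
import io

# (column, unit, definition) per Josh's reply; the tuple layout is an assumption.
ATTRIBUTES = [
    ("Time", "hr", "simulation time"),
    ("Q1", "m3/hr", "stream discharge at a synoptic sampling site"),
    ("Q2", "m3/hr", "stream discharge at a synoptic sampling site"),
    ("Q3", "m3/hr", "stream discharge at a synoptic sampling site"),
    ("Q7", "m3/hr", "stream discharge at a synoptic sampling site"),
    ("Q8", "m3/hr", "stream discharge at a synoptic sampling site; also the stream gage"),
    ("As", "m3", "water lost to storage in the anabranches"),
    ("Swradin", "W/m2", "incoming solar radiation"),
    ("Observed", "m3/hr", "observed discharge"),
]

def write_data_csv(rows, out):
    """Write data rows (dicts keyed by column name) as comma-delimited text."""
    names = [name for name, _, _ in ATTRIBUTES]
    # Keys not listed in ATTRIBUTES are dropped; missing values are left blank.
    writer = csv.DictWriter(out, fieldnames=names, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)

def write_attribute_csv(out):
    """Write the column documentation as a separate table."""
    writer = csv.writer(out)
    writer.writerow(["column", "unit", "definition"])
    writer.writerows(ATTRIBUTES)

# Usage with an in-memory buffer; a real export would open files instead.
data_buf = io.StringIO()
write_data_csv([{"Time": 1, "Q8": 12.5}], data_buf)
attr_buf = io.StringIO()
write_attribute_csv(attr_buf)
```

Keeping the units in a separate attribute table, rather than in the header row, leaves the data file easy to load into a relational database, and the attribute rows map naturally onto per-column metadata entries.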

Status: 

Priority: 

Normal