DEIMS: The Data Explorer - workflow to copy data from orable into mysql mcm_data

==Jun 2015====

There is a script at gar:MCM-DATA that makes a .loadinfile file based on headers of the file. This is the fastest method to import in mysql. there is also an .insert file, with individual SQL insert statements, this is created from an export "Use Toad to export tables as SQL w/ CREATE table and INSERT statements". But this is slow for tables that are medium to big size.

"LoadINfile" by default appends data, so it works for new seasons.

Unsure about constrains -- in mysql should be easy to define uniqueness.

To finish this story, it would be great to structure it and add it to the MCM IM guide book.

==Jan 14 (2015) =====

The first two passes at the oracle-to-mysql are completed. A handful of things may remain, namely:
* make sure what was ported was well ported. (we'll hear soon)
* views and materialized views instead of "raw" tables. some datasets source on views which may not have been ported.
* a few tables were left out for further examination, see in gar files "no_structure", "not_needed", "empty" and "problems".

once these things are all checked, we can close this epic task.
a perl script was saved into the deims-mcm-custom github repo. see issues in queue for possible refinements of such script.

==Jan 8==
Chosen workflow.

Oracle : Use Toad to export tables as SQL w/ CREATE table and INSERT statements

Perl : Use a small Perl script to convert SQL files into MySQL compatible files.

Use "source" in a mysql shell to import data.

This conversion is OK for batches of data containing typically up to 5,000 data-rows.
The method results on an insert op that takes on average 0.05s/row. What about
long data-files, i,e; anything over 20,000 lines? MCM has about 200 tables with
such volumes (see list below). For those, use LOAD DATA LOCAL INFILE, which does
not do all lock-and-load ops per insert the method above does. You can slurp up to
500,000 lines of data in about 10 seconds or less.

To make good use of the LOAD methods in Ubuntu, you need to call the mysql shell
with the --local-infile flag.

Also, the dates will have to be converted using STR_TO_DATE and
the string '%m/%d/%Y %H:%i:%s', which works for most MCM dates. In the mysql
inserts, this would look like
STR_TO_DATE('12/01/1990 05:30:00', '%m/%d/%Y %H:%i:%s')
but in the LOAD, this is done through a variable.

SET DATE_TIME=STR_TO_DATE(@var, '%m/%d/%Y %H:%i:%s')

But a whole statement may be like this
LOAD DATA LOCAL INFILE './VIAM_RH.csv'
INTO TABLE VIAM_RH
FIELDS TERMINATED BY ','
IGNORE 27 LINES
(
DATASET_CODE, METLOCID,@var, RH,RH_COMMENTS
)
SET DATE_TIME= STR_TO_DATE(@var,'%m/%d/%Y %H:%i:%s')
;

Status: Most "CREATE" table work (75% or more). Most "INSERT" statements work.

Going alphabetically, progress noted in files.

Duplicate Tables:
alca_insert.mysql = alca_new_insert.mysql
comglsndpt_old_insert.mysql
comsnwdns_old_insert.mysql
glstkloc_copy_insert.mysql
limno_species_abundance2_insert.mysql
onyx_lwright_hydro_rev_insert.mysql
onyx_vnda_hydro_rev_insert.mysql

Not Needed Tables:
ancillary_projects
bibliography_insert.mysql
bibliography_theses_insert.mysql
bib_temp_insert.mysql
chmbr_test_locs_insert.mysql
chmbr_test_nutr_insert.mysql
dataset_toc_insert.mysql
dr$bib_snap_search$i_insert.mysql
eml_usage_insert.mysql
general_table_toc_insert.mysql
gis_toc_insert.mysql
glacier_core_toc_insert.mysql
glacier_short_toc_insert.mysql
limno_ion_doran_special_insert.mysql
limno_location_codes_insert.mysql
limno_location_names_insert.mysql
limno_run_coding_insert.mysql
limno_run_coding_unwanted_insert.mysql
limno_short_toc_insert.mysql
lost_seal_insert
master_toc_search_insert.mysql
mcm_soil_sites_insert.mysql
met_core_toc_insert.mysql
metlocs_insert.mysql
met_station_toc_insert.mysql
news_blog_insert.mysql
news_insert.mysql
outreach_events_insert.mysql
photos_insert
relict_locs_insert
sde_logfile_data
sde_logfiles
slter_data_insert
soils_core_toc_insert
soils_short_toc_insert.mysql
stream_comments_codes_insert.mysql
stream_core_toc_insert.mysql
stream_gauge_locations_core_insert.mysql
stream_runs_insert.mysql
stream_scanned_field_notes_insert.mysql
stream_short_toc_insert.mysql
stream_views_toc_insert.mysql
strm_dscrptns_insert.mysql
strm_gage_locs_insert.mysql
strmrun_insert.mysql
strmtrns_insert.mysql
toad_plan_table_insert.mysql
toc_list_insert.mysql
tracer_locs_insert.mysql

Failed to create Tables:
air_trans_wo_insert
also_insert.mysql
benm_rh_insert
etah_insert
etso_insert
featrutm_insert.mysql
glacier_chem_dp02_insert.mysql
glacier_istp_insert.mysql
gsca_insert.mysql
howglstkht_insert.mysql
limno_ice_thickness_insert.mysql
limno_par_integrated_insert.mysql
limno_piez_depth_insert.mysql
mrphtype_insert.mysql
prso_insert.mysql
relcbims_insert.mysql
relcnutr_insert.mysql
relicinv_insert
rsca_insert
soils_bee_insert.mysql
soils_biota_distribution_insert.mysql
soils_bundt_pan_insert.mysql
soils_elevation_trn_insert.mysql
soils_ltm_algae_insert.mysql
soils_poly_anhydro_insert.mysql
soils_polygon_insert.mysql
soils_se_co2_insert.mysql
soils_snowfence_insert.mysql
soils_upperponds_insert.mysql
spso_insert.mysql
stso_insert.mysql
suesavglstk_insert.mysql
tayglstkht_insert.mysql
trcrnutr_insert.mysql

DONE Tables:
adams_hydro_insert.mysql
afdmrlc_insert.mysql
AIKEN_HYDRO
alca_insert.mysql
algaerlc_insert.mysql
alwo_insert.mysql
ancillary_dowling_lakes_insert.mysql
ANCILLARY_SNYDER_HALOCHEM
ANCILLARY_SNYDER_IODINESED
ancillary_snyder_salts_insert.mysql
andrsn_hydro_insert.mysql
atso
atwo
barteast_hydro
bartflume_hydro
bartwest_hydro
benm_airt
benm_radn
beph
beso
bewo
canavglstk_insert.mysql
canada_hydro.mysql
canglsndpt_insert.mysql
canglstkht_insert.mysql
canmschng_insert.mysql
cansnwdns_insert.mysql
clark_hydro_insert.mysql
comavglstk_insert.mysql
comglsndpt_insert.mysql
comglstkht_insert.mysql
common_hydro_insert.mysql
commschng_insert.mysql
comsnwdns_insert.mysql
comweq_insert.mysql
crescent_hydro_insert.mysql
dian_insert.mysql
dica_insert.mysql
diso_insert.mysql
diwo_insert.mysql
dpah_insert.mysql
dpca_insert.mysql
dpso_insert.mysql
duan_insert.mysql
duso_insert.mysql
duwo_insert.mysql
etca_insert.mysql
etch_insert.mysql
etwo_insert.mysql
field_meter_dschrg_insert.mysql
garwood_hydro_insert.mysql
glacier_chem_9397
glacier_chem_dp01
glstkloc
gsso
gswo
howavglstk_insert.mysql
howglsndpt_insert.mysql
howmschng_insert.mysql
howsnwdns_insert.mysql
howweq_insert.mysql
hughavglstk_insert.mysql
hughglsndpt_insert.mysql
hughglstkht_insert.mysql
hughmschng_insert.mysql
hughweq_insert.mysql
invrtbrt_insert.mysql
limno_bact_prod_insert.mysql
limno_bath_hypso_func_insert.mysql
limno_bath_polynomials_insert.mysql
limno_bath_values_digitized_insert.mysql
limno_bath_values_insert.mysql
limno_chl_a_hoare_benthic_insert.mysql
limno_ciliate_grazing_insert.mysql
limno_do_temp_insert.mysql
limno_flagellate_grazing_insert.mysql
limno_lake_descriptions_insert.mysql
limno_lake_levels_insert.mysql
limno_measure_location_insert.mysql
limno_ph_insert.mysql
limno_phyto_density_winter_insert.mysql
limno_phyto_presence_sum_win_insert.mysql
limno_pore_water_special_insert.mysql
limno_ppr_insert.mysql
limno_respiration_ets_insert.mysql
limno_species_abundance_insert.mysql
limno_stable_isotope_insert.mysql
limno_winkler_insert.mysql
limno_zoo_1994_insert.mysql
ltca_insert
ltso_insert
ltwo_insert.mysql
pgah_insert.mysql
pgca_insert.mysql
pgso_insert.mysql
pgwo_insert.mysql
relcchem_insert.mysql
relcprpr_insert
rsan_insert
rsso_insert
rswo_insert
sdwo_insert
sean_insert
seec_insert
seso_insert
sewo_insert
sfca_insert
sfec_insert
sfmb_insert
sfph_insert
sfso_insert
sfwo_insert
soils_se_chem_insert
span_insert.mysql
spca_insert.mysql
spec_insert.mysql
spwo_insert.mysql
stca_insert
stream_availability
stream_chem_insert
stream_dic_insert.mysql
stream_doc_raw_insert.mysql
stream_tn_insert.mysql
strmafdm_insert
strmbims_insert.mysql
strmgeochem_insert.mysql
strmistp_insert.mysql
strmnutr_insert.mysql
suesglsndpt_insert.mysql
suesglstkht_insert.mysql
suesmschng_insert.mysql
suesweq_insert.mysql
swan_insert.mysql
swec_insert.mysql
swso_insert.mysql
swwo_insert.mysql
tayavglstk_insert.mysql
tayglsndpt_insert.mysql
taymschng_insert.mysql
taysnwdns_insert.mysql
tayweq_insert.mysql
trcr_avg_dschrg_insert.mysql
trcrchem_insert.mysql
trcrdschrg_insert.mysql
trnsrelp_insert.mysql
upper_vguerard_hydro_insert.mysql
victoria_hydro_insert.mysql
victoria_lkout_hydro_insert.mysql

"Too Long" (may need load data infile)
benm_soilt_insert.mysql
benm_wind_insert.mysql
beph_insert.mysql
beso_insert.mysql
bewo_insert.mysql
boym_airt_insert.mysql
boym_fline_insert.mysql
boym_ppt_insert.mysql
boym_radn_insert.mysql
boym_rh_insert.mysql
boym_snowht_insert.mysql
boym_soilt_insert.mysql
boym_wind_insert.mysql
boym_wvapd_insert.mysql
boys_airpar_insert.mysql
boys_airt_insert.mysql
boys_soilpar_insert.mysql
boys_soilsurfchange_insert.mysql
brhm_airt_insert.mysql
brhm_fline_insert.mysql
brhm_radn_insert.mysql
brhm_rh_insert.mysql
brhm_soilt_insert.mysql
brhm_surfchange_insert.mysql
brhm_wind_insert.mysql
brhm_wvapd_insert.mysql
caam_airt3asp_insert.mysql
caam_airt_insert.mysql
caam_fline_insert.mysql
caam_icet_insert.mysql
caam_pc_insert.mysql
caam_pressta_insert.mysql
caam_radn_insert.mysql
caam_rh_insert.mysql
caam_surftemp_insert.mysql
caam_wind_insert.mysql
cohm_airt_insert.mysql
cohm_fline_insert.mysql
cohm_icesurfchange_insert.mysql
cohm_icet_insert.mysql
cohm_radn_insert.mysql
cohm_rh_insert.mysql
cohm_surftemp_insert.mysql
cohm_wind_insert.mysql
cohm_wvapd_insert.mysql
common_hydro_insert
delta_hydro_insert
dpwo_insert
exem_airt_insert.mysql
exem_fline_insert.mysql
exem_ppt_insert.mysql
exem_radn_insert.mysql
exem_rh_insert.mysql
exem_soilt_insert.mysql
exem_wind_insert.mysql
exsm_pc_insert.mysql
flmm_airt_insert.mysql
flmm_pressta_insert.mysql
flmm_rh_insert.mysql
flmm_wind_insert.mysql
frlm_airt_insert.mysql
frlm_all_monthly_insert.mysql
frlm_blkr_insert.mysql
frlm_fline_insert.mysql
frlm_radn_insert.mysql
frlm_rh2_insert.mysql
frlm_rh_insert.mysql
frlm_soilt_insert.mysql
frlm_surf_insert.mysql
frlm_wind_insert.mysql
frlm_wvapd_insert.mysql
frls_airt_insert.mysql
frls_par_insert.mysql
frls_snowht_insert.mysql
frls_soilt_insert.mysql
gadm_airt_insert.mysql
gadm_pressta_insert.mysql
gadm_radn_insert.mysql
gadm_rh_insert.mysql
gadm_soilt_insert.mysql
gadm_wind_insert.mysql
gafm_airt_insert.mysql
gafm_radn_insert.mysql
gafm_rh_insert.mysql
gafm_soilt_insert.mysql
gafm_surfchange_insert.mysql
gafm_surftemp_insert.mysql
gafm_wind_insert.mysql
green_hydro
harnish_f11_hydro_insert.mysql
harnish_f7_hydro_insert.mysql
hodm_airt_insert.mysql
hodm_fline_insert.mysql
hodm_icesurfchange_insert.mysql
hodm_icet_insert.mysql
hodm_radn_insert.mysql
hodm_rh_insert.mysql
hodm_wind_insert.mysql
hodm_wvapd_insert.mysql
hoem_airt_insert.mysql
hoem_fline_insert.mysql
hoem_ppt_insert.mysql
hoem_pressta_insert.mysql
hoem_radn_insert.mysql
hoem_rh_insert.mysql
hoem_snowfall_mm_insert.mysql
hoem_soilt_insert.mysql
hoem_wind_insert.mysql
house_hydro_insert.mysql
huey_hydro_insert.mysql
lawson_hydro_copy_insert.mysql
lawson_hydro_insert.mysql
limno_ancillary_adcp_insert.mysql
limno_blue_box_insert.mysql
limno_chemistry_raw_insert.mysql
limno_chl_a_depth_insert.mysql
limno_ctd_insert.mysql
limno_dic_insert.mysql
limno_doc_raw_insert.mysql
limno_nutrients_insert.mysql
limno_par_instant_insert.mysql
limno_par_profile_insert.mysql
limno_phyto_density_summer_insert.mysql
limno_poc_pon_raw_insert.mysql
lostseal_hydro_insert.mysql
ltm_logger_temp_insert.mysql
ltwo_insert.mysql
lyons_hydro_insert.mysql
mism_airt
mism_pressta
mism_radn
mism_rh
mism_soilt
mism_wind
onyx_lwright_hydro_insert.mysql
onyx_vnda_hydro_insert.mysql
pgwo_insert.mysql
priscu_hydro
santafe_hydro
soils_ltm_soilt_hourly_insert.mysql
soils_ltm_soilt_insert.mysql
spwo_insert.mysql
stream_chemistry_raw_insert.mysql
stream_field_measurements_insert.mysql
stream_nutrients_raw_insert.mysql
strmalgi_insert.mysql
strmmean_insert.mysql
strmrun_insert.mysql
tarm_airt_insert.mysql
tarm_depth_insert.mysql
tarm_fline_insert.mysql
tarm_icesurf_insert.mysql
tarm_icet_insert.mysql
tarm_radn_insert.mysql
tarm_rh_insert.mysql
tarm_wind_insert.mysql
tarm_wvapd_insert.mysql
uhdm_airt_insert.mysql
uhdm_daily_wind_insert.mysql
uhdm_icet_insert.mysql
uhdm_monthly_wind_insert.mysql
uhdm_radn_insert.mysql
uhdm_rh_insert.mysql
uhdm_snow_insert.mysql
uhdm_wind_insert.mysql
vaam_airt_insert.mysql
vaam_fline_insert.mysql
vaam_onyxt_insert.mysql
vaam_radn_insert.mysql
vaam_rh_insert.mysql
vaam_soilt_insert.mysql
vaam_surfchange_insert.mysql
vaam_wind_insert.mysql
vaam_wvapd_insert.mysql
vguerard_hydro_insert.mysql
viam_airt_insert.mysql
viam_fline_insert.mysql
viam_radn_insert.mysql
viam_rh_insert.mysql
viam_soilt_insert.mysql
viam_surfchange_insert.mysql
viam_wind_insert.mysql

==Nov 17===
many options, here are three viable options
--refactor the oracle dump
--rework csv exports, combo with mysqlimport
--consume APIs

Status: 

Priority: 

Normal