For questions regarding your ETD Data submission, please contact:
Sonia Santana Arroyo (IR Coordinator) sonsanta@fiu.edu
Brandie Thomas (ETD Coordinator) bthomas@fiu.edu
Good documentation and metadata, along with archiving in a preferred file format, can help ensure continued long-term access to and reusability of your data. When submitting data, students should submit the documentation, metadata, and data as a compressed file in the supplemental field on their ETD submission form in Digital Commons. Below is a general outline for preparing your data. Each tab provides additional details to consider as you prepare your data for submission, including:
Documentation and Metadata
Good documentation of data can help ensure that data can be understood and interpreted by any user. Documentation should start at the beginning of a project and continue throughout the research.
When submitting data, students are required to include appropriate documentation and metadata in a readme.txt file. The documentation should include:
Metadata is a subset of core data documentation. Though metadata standards vary across disciplines, all metadata provides standardized, structured information explaining the purpose, origin, time references, geographic locations, creator, access conditions, and terms of use of data.
Along with the documentation of your project and data, you must include a metadata.txt file that includes:
Use the Readme_[Last Name].txt to compile your documentation and metadata for submission.
Format
Choosing an appropriate format for the data is also an important aspect of long-term access and usability. Below are basic guidelines for preparing data for submission:
Below is a table that outlines acceptable data formats. Data not in these formats will only be accepted if deemed appropriate upon review.
Type of data | Acceptable formats for sharing, reuse and preservation | Other acceptable formats for data preservation |
---|---|---|
Quantitative tabular data with extensive metadata: a dataset with variable labels, code labels, and defined missing values, in addition to the matrix of data | SPSS portable format (.por); delimited text and command ('setup') file (SPSS, Stata, SAS, etc.) containing metadata information; some structured text or mark-up file containing metadata information, e.g. DDI XML file | proprietary formats of statistical packages, e.g. SPSS (.sav), Stata (.dta) |
Quantitative tabular data with minimal metadata: a matrix of data with or without column headings or variable names, but no other metadata or labelling | comma-separated values (CSV) file (.csv); tab-delimited file (.tab), including delimited text of given character set with SQL data definition statements where appropriate | delimited text of given character set, where only characters not present in the data should be used as delimiters (.txt); widely-used formats, e.g. MS Excel (.xls/.xlsx), MS Access (.mdb/.accdb), dBase (.dbf) and OpenDocument Spreadsheet (.ods) |
Geospatial data: vector and raster data | ESRI Shapefile (essential: .shp, .shx, .dbf; optional: .prj, .sbx, .sbn); geo-referenced TIFF (.tif, .tfw); CAD data (.dwg); tabular GIS attribute data | ESRI Geodatabase format (.mdb); MapInfo Interchange Format (.mif) for vector data; Keyhole Mark-up Language (KML) (.kml); Adobe Illustrator (.ai), CAD data (.dxf or .svg); binary formats of GIS and CAD packages |
Qualitative data: textual | eXtensible Mark-up Language (XML) text according to an appropriate Document Type Definition (DTD) or schema (.xml); Rich Text Format (.rtf); plain text data, ASCII (.txt) | Hypertext Mark-up Language (HTML) (.html); widely-used proprietary formats, e.g. MS Word (.doc/.docx); some proprietary/software-specific formats, e.g. NUD*IST, NVivo and ATLAS.ti |
Digital image data | TIFF version 6 uncompressed (.tif) | JPEG (.jpeg, .jpg) but only if created in this format; TIFF (other versions) (.tif, .tiff); Adobe Portable Document Format (PDF/A, PDF) (.pdf); standard applicable RAW image format (.raw); Photoshop files (.psd) |
Digital audio data | Free Lossless Audio Codec (FLAC) (.flac) | MPEG-1 Audio Layer 3 (.mp3) but only if created in this format; Audio Interchange File Format (AIFF) (.aif); Waveform Audio Format (WAV) (.wav) |
Digital video data | MPEG-4 (.mp4); motion JPEG 2000 (.mj2) | |
Documentation and scripts | Rich Text Format (.rtf) | plain text (.txt); some widely-used proprietary formats, e.g. MS Word (.doc/.docx) or MS Excel (.xls/.xlsx); XML marked-up text (.xml) according to an appropriate DTD or schema, e.g. XHTML 1.0 |
Source: UK Data Archive, http://www.data-archive.ac.uk/create-manage/format/formats-table
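If your data lives in a folder with many files, a short script can help you spot files that are not in one of the preferred formats before you submit. The following is only a minimal sketch: the function name `non_preferred_files` and the `PREFERRED` set are hypothetical, and the set is an illustrative subset of the extensions in the table above, not an official list used by the reviewers.

```python
from pathlib import Path

# Illustrative subset of extensions from the "sharing, reuse and preservation"
# column of the table above. This is an assumption for the example, not an
# official or complete list.
PREFERRED = {
    ".por", ".csv", ".tab",                              # quantitative tabular data
    ".shp", ".shx", ".dbf", ".prj", ".sbx", ".sbn",      # ESRI Shapefile parts
    ".tif", ".tfw", ".dwg",                              # geo-referenced TIFF, CAD
    ".xml", ".rtf", ".txt",                              # textual data, documentation
    ".flac", ".mp4", ".mj2",                             # audio and video
}


def non_preferred_files(folder):
    """Return sorted relative paths of files whose extension is not in PREFERRED."""
    root = Path(folder)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() not in PREFERRED
    )
```

A file flagged by this check is not necessarily unacceptable; it may fall under the "other acceptable formats" column or be accepted upon review.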
File Organization
1. Use a Naming Convention
Using a naming convention for your files will assist you and other researchers in using your data. If there are established conventions for your research group and/or discipline you should use them. If there aren't any file naming conventions already being used by your research group/discipline, you can create your own. These are some best practices when creating a naming convention:
2. Group your files into meaningful datasets.
There are three widely used file structures for organizing your data:
Researchers working with human subjects must take additional steps to adhere to local and federal regulations in order to ensure the privacy of the individuals participating in their research projects. This applies not only to medical and health research but also to research projects that include polls, surveys, and focus groups.
The FIU Institutional Review Board (IRB) is a committee established under federal regulations for the protection of human subjects in research (45 CFR 46). Its purpose is to help protect the rights and welfare of human participants in research. FIU faculty, staff, and students are required to obtain IRB approval prior to conducting research with human subjects. This applies to both on-campus and off-campus research, regardless of funding.
Students submitting data obtained from human subjects must follow all privacy and ethical standards set forth by the University, State and Federal governing bodies. When submitting your ETD and Data, you will be required to attest that you have complied with all privacy regulations related to human subjects in your study.
Access/Permission Checklist
All graduate students submitting a thesis or dissertation through the graduate school are eligible to submit and archive their finalized data sets alongside their ETD. All data sets will be made openly accessible through the Library's dPanther system and Digital Commons.
Students submitting their data should review the section "Preparing your data for submission," which outlines documentation and metadata, format, and copyright information.
All data should be submitted as a compressed file in the supplemental field of the ETD submission in Digital Commons. The compressed file should include:
Review the criteria below to determine if your data is appropriate and ready for submission.
Is your data right for submission?
Data Documentation:
Sharing and Permissions:
You can submit your data alongside your ETD in Digital Commons.
1. Follow the submission instructions for your ETD.
2. Before pressing the "Submit" button, check the box under "Additional Files".
3. Upload your compressed file. The compressed file should include the Readme.txt along with your data. The compressed file's name will be the title that appears. Be sure that the "show" box is unchecked. Select "Continue" to complete your submission.
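Any tool that produces a compressed file will work for the upload step. As one possible approach, the sketch below uses Python's standard `zipfile` module to bundle a readme and a data folder into a single archive. The function name `build_submission_zip`, the `data/` prefix inside the archive, and the file names in the usage note are assumptions made for illustration, not requirements of Digital Commons.

```python
import zipfile
from pathlib import Path


def build_submission_zip(data_dir, readme_path, zip_path):
    """Bundle a readme and every file under data_dir into one zip archive."""
    data_dir = Path(data_dir)
    readme_path = Path(readme_path)
    zip_path = Path(zip_path)
    if not readme_path.is_file():
        raise FileNotFoundError(f"Readme not found: {readme_path}")

    # Avoid adding the readme twice or the archive to itself if either
    # happens to sit inside data_dir.
    skip = {readme_path.resolve(), zip_path.resolve()}

    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Put the readme at the top level so it is the first thing a user sees.
        zf.write(readme_path, arcname=readme_path.name)
        for path in sorted(data_dir.rglob("*")):
            if path.is_file() and path.resolve() not in skip:
                # Keep the folder structure, under an assumed "data/" prefix.
                arcname = (Path("data") / path.relative_to(data_dir)).as_posix()
                zf.write(path, arcname=arcname)
    return zip_path
```

For example, `build_submission_zip("my_data", "Readme_Doe.txt", "Doe_dissertation_data.zip")` would create an archive with the readme at the top level and the contents of `my_data` under `data/`. Because the compressed file's name becomes the title that appears, choose the archive name accordingly.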