Quick Start
About GEOME
TEAM Projects
A team is a specialized research group with settings relevant to members of that group. If you are not part of a team, proceed to the next section, "Non-Team Projects". Teams enable users to create projects that share a common set of rules, attributes, and controlled vocabulary terms. Join a team workspace only if you have been invited to do so and you agree to use ALL of the attributes, rules, and controlled vocabularies for that team. The team administrator controls all configuration options. To create a project within a team workspace, select "Join team workspace" during the project creation process and then select the appropriate team. The public team workspaces available in GEOME appear in the dropdown entitled "Existing Team Workspace".
NON-TEAM Projects
If a project is not created as part of a team, the project owner may configure the available attributes, rules, and controlled vocabularies during project creation, or later through the “project configuration” option in the workbench. After pressing "Next" (see screenshot above), you can choose to create either a single sheet project or a multiple sheet project. A single sheet project is conceptually simpler because all event, sample, and tissue metadata is entered on a single sheet. The advantage of a multiple sheet project is that it reduces duplicate data entry for "parent" entities: you enter a parent row only once, and its uniqueKey can be referenced from other worksheets. The disadvantage is that it requires more work on the user's part, because identifier keys must be associated between sheets. After you choose your project configuration (single or multiple sheet), press the "Next" button to choose which modules to add. Select the modules applicable to your project. Please note that every project has one owner, who may invite additional members. Each member can in turn create expeditions within the project. Read on to understand how expeditions work.
Expeditions
Projects are composed of one or more expeditions. An expedition corresponds to a single spreadsheet containing all related events, samples, and tissues. All data entered into GEOME must be entered as part of an expedition. Any member of a project may create an expedition when they first upload a spreadsheet. The expedition owner retains the right to update or alter expedition data and to set the expedition to public or private viewing. The project owner can also alter the expedition metadata of any user within the project. Expedition identifiers can be set as unique either within the expedition itself or across the project. Finally, each expedition provides a globally unique and resolvable prefix (the expedition root identifier) for each entity. When a local identifier, which is enforced as unique either within an expedition or within a project, is appended to the expedition root identifier, it serves as a resolvable and globally unique representation of each instance of a collecting event, sample, or tissue. These identifiers are provisioned automatically and are referred to within the system as BCIDs (biocode commons identifiers).
Modifying Projects: creating rules and adding attributes
After you create your project in the project wizard, you may further customize it using the "Project Configuration" tool. First, make sure the project you wish to modify is the currently active project, either by using the project navigator at the top of the pane or by browsing to the project with the Workbench "View Projects" tool. Second, click "Project Configuration" under "Admin" in the left-hand panel of the workbench. Here, you can click "Attributes" to add and remove available attributes for each entity, or "Rules" to add and remove rules for each entity. You may also customize the behavior of each entity (e.g., Event, Sample, Tissue) by clicking the edit icon next to the entity. Note that at this time you may not add or remove entities from the Project Configuration interface. Doing so requires editing the JSON configuration file directly, which requires advanced/developer knowledge.
GEOME requires the following fields to be entered for ALL projects:
- materialSampleID
- yearCollected
- country
- locality
Notes:
- Locality is required, but decimalLatitude/decimalLongitude are not, because the coordinates of a collected sample may not be, or cannot be, stated.
- Each team workspace or project may choose to enforce additional rules which extend the GEOME network rules.
- GEOME provides the following controlled vocabularies, which all projects must adopt: country, container, horizontalDatum, lengthUnits, lifeStage, markers, phylum, preservative, relaxant, sex, taxonRank, basisOfIdentification, and weightUnits. FASTQ metadata controlled vocabularies include: libraryLayout, librarySelection, librarySource, libraryStrategy. Some of the controlled vocabularies accept the value 'Unknown' in cases where none of the listed options are suitable.
- The technical specification for the GEOME network is available in JavaScript Object Notation (JSON) format at https://api.geome-db.org/v1/network/1/config
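The required-field rule above can be illustrated with a minimal Python sketch. It only checks that the four fields listed in the text are present and non-blank in a row; it is not GEOME's own validator, and the sample rows and the helper names `missing_required_fields` and `fetch_network_config` are hypothetical. The fetch helper uses the configuration URL quoted above but makes no assumption about the layout of the returned JSON, and it is not called here because it needs network access.

```python
import json
from urllib.request import urlopen

# Fields the text lists as required for ALL projects.
REQUIRED_FIELDS = ("materialSampleID", "yearCollected", "country", "locality")

# URL quoted in the notes above; the structure of the response is not assumed.
NETWORK_CONFIG_URL = "https://api.geome-db.org/v1/network/1/config"


def missing_required_fields(record):
    """Return the required field names that are absent or blank in one row."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


def fetch_network_config(url=NETWORK_CONFIG_URL, timeout=30):
    """Download and parse the network configuration (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical rows, invented for illustration only.
complete_row = {
    "materialSampleID": "S1",
    "yearCollected": 2016,
    "country": "French Polynesia",
    "locality": "Moorea",
}
incomplete_row = {
    "materialSampleID": "S2",
    "yearCollected": "",
    "country": "French Polynesia",
}

print(missing_required_fields(complete_row))    # []
print(missing_required_fields(incomplete_row))  # ['yearCollected', 'locality']
```

A check like this can be run on a spreadsheet before upload to catch empty required cells early. Team or project rules and the controlled vocabularies are still enforced by GEOME itself.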
The GEOME R package is used to retrieve GEOME data for analysis. Please visit our GitHub page to run the current code. The package is no longer available on CRAN and can only be installed using the GitHub installer.
All data uploaded to GEOME can be manipulated in a laboratory information management system (LIMS) using a specially built LIMS plugin that operates within the Geneious environment. The purpose of the LIMS tool is to help manage lab and sequence analysis workflows. Use the LIMS plugin wiki to learn how to use the LIMS.
- Field Database Connection: GEOME FIMS
- Host: https://api.geome-db.org/
- Username/password: Use your GEOME username and password to access your data in the LIMS system.
GEOME offers the following enterprise-level services under contract:
- Creating and customizing team environments
- Technical assistance with GEOME as well as the Biocode LIMS plugin
- Loading photos and media
- Training
- Customized installations
Contact us to discuss options.
GEOME has its roots in the Moorea Biocode project database, developed from 2006 to 2011 to support data collection for the Moore Foundation funded Moorea Biocode Project: an all taxa biotic inventory of a single tropical island involving 6 teams, 50 researchers, and thousands of collecting events. The Moorea Biocode field information management system, also known as "FIMS1", was developed to ingest spreadsheet data from researchers working on the project and employed data validation on data ingest. The tools were written in Perl and Java. FIMS1 has been running from 2006 through 2018 and developed tools such as the Plate Matcher (to easily map tissues to 96 well plates), the bioValidator (for loading and validating spreadsheets), and a web interface for managing data.
From 2012 to 2015, the National Science Foundation funded the BiSciCol (Biological Sciences Collections) project. While BiSciCol focused on identifiers, ontologies, and semantic technologies for linking biodiversity data, the primary use case was to work on a solution for linking events, samples, and tissues across many different systems using linked data technology with the Moorea Biocode Project data integration with member institution databases. BiSciCol brought us the following products: a clearer understanding of the role of persistent identifiers, the BiSciCol triplifier, the development of the Biological Collections Ontology (BCO), and the development of FIMS2, also known as the BiSciCol FIMS. The BiSciCol FIMS adopted ARK identifiers (through California Digital Library's EZID system) for all samples, events, and tissues. In addition, all data inserted into the BiSciCol FIMS was backed by a Fuseki triplestore, with metadata configurations specified in an XML configuration file that was stored and managed separately by each project.
From 2014 to 2017, BiSciCol FIMS hosted the Diversity of the IndoPacific Network (DIPNet) to develop a coherent metadata repository for assembling metadata from across the IndoPacific region. DIPNet added the ability to upload FASTA and FASTQ metadata to the FIMS2 site and also developed the name "GEOME" (eventually becoming the brand name for FIMS3). During this time frame, another FIMS system was developed for the Smithsonian National Museum of Natural History ("NMNH FIMS") with the goal of creating a centralized field data ingestion system for all NMNH field data. The NMNH FIMS has since been discontinued, with SI users using the BiSciCol FIMS (Barcode of Wildlife Project, ARMS portal, and Global Genome Initiative) and FIMS1 installations (Invertebrate Zoology and LAB applications).
Beginning in 2016, John Deck and RJ Ewing started work on a redesigned FIMS system, using a Postgres backend to control project management features and JSON metadata objects to store data and configurations. This FIMS system (FIMS3) assumed the name GEOME. It incorporates features from FIMS1 (including photo uploading; specimen, tissue, and event pages; and plate-matching tools) along with features of the BiSciCol FIMS (persistent id generation, flexible project configuration with attributes stored in a non-relational system guided by a configuration file). In addition, FIMS3 brings in the concept of a network, which governs multiple configuration templates, and the ability to easily create projects through a user interface. The goal of GEOME is to integrate all previous FIMS1 and FIMS2 projects, with these projects removed from their current hosting environment by spring of 2019.
- To advance genetic diversity research worldwide
- To aggregate sample and genetic data in raw formats in a searchable database such that original datasets can be utilized for further investigation.
- To promote and advocate open and collaborative science as a best practice for conducting biodiversity research
- To promote and provide capacity-building within developing countries for monitoring, study and protection of their biodiversity resources.
Recalling that access to and utilization of genetic resources and data taken should be consistent with the provisions of the Convention on Biological Diversity (CBD) taking into account their specifications by the Bonn Guidelines on Access to Genetic Resources and the Fair and Equitable Sharing of Benefits arising from their Utilization, and, where appropriate, the Nagoya Protocol on Access to Genetic Resources and the Fair and Equitable Sharing of Benefits arising from their Utilization (NP),
Recalling that according to these provisions non-monetary and/or monetary benefits from the utilization of the genetic resources shall be shared with the Country of Origin if the same so requires and as it is set out in mutually agreed terms,
Acknowledging that research and development on genetic resources can be for the public domain (non-commercial) or for commercial purposes and,
Recalling that according to these provisions, non-commercial research purposes may contribute to the conservation and sustainable use of biodiversity
By uploading data to or downloading data from the GEOME database, GEOME contributors and Recipients of GEOME data agree as follows:
ARTICLE 1 – UPLOADING OF LEGAL AND ACCURATE GENETIC DATA AND ASSOCIATED METADATA
By uploading Genetic Data and associated metadata to the GEOME database, Contributors certify the following:
1.1 The Genetic Resources from which Genetic Data and Associated Metadata were derived were collected under appropriate and legal access permits or their equivalent by each Country of Origin, and
1.2 All co-authors on the data’s primary publication have consented to share the data in the GEOME database.
ARTICLE 2 – USAGE OF DOWNLOADED GENETIC DATA AND ASSOCIATED METADATA
2.1 The Recipient shall be entitled to the Use of the Genetic Data and Associated Metadata for the Public Domain.
2.2 Should the Recipient intend to utilize the Genetic Data and Associated Metadata for Commercial Purposes they will seek consent of the Country of Origin, subject to the mutually agreed terms between the GEOME contributor and Country of Origin.
2.3 When used for Commercial Purposes or in the Public Domain, data should be recognized as follows: “Data were made available through GEOME (https://geome-db.org/)”.
2.4 Every attempt should be made to cite the original studies having contributed data to the new analyses. Attribution would include any publications first describing these data as well as reference to electronic depositions (such as DOI associated with dataDryad). This information may often be found in the “associatedReferences” field of the database.
2.5 The Recipient may transfer downloaded Genetic Data and Associated Metadata to Subsequent Recipients provided that Subsequent Recipients agree in writing to use the data under the terms of this agreement.
2.6 GEOME does not undertake to monitor the rights of any Country of Origin for their Genetic Resources downstream from our database portal.
The GEOME Steering Committee is responsible for guiding the development of GEOME software, its implementation, and setting strategic direction. The GEOME steering committee consists of Neil Davies (UC Berkeley), John Deck (UC Berkeley and Biocode, LLC), RJ Ewing (Biocode, LLC), Rob Toonen (University of Hawaii), Cynthia Riginos (University of Queensland), Eric Crandall (CSU Monterey), Libby Liggins (Massey University), Chris Meyer (Smithsonian National Museum of Natural History), and Michelle Gaither (University of Central Florida).
"The Genomic Observatories Metadatabase (GEOME): A new repository for field and sampling event metadata associated with genetic samples", John Deck , Michelle R. Gaither, Rodney Ewing, Christopher E. Bird, Neil Davies, Christopher Meyer, Cynthia Riginos, Robert J. Toonen, Eric D. Crandall Published: August 3, 2017 https://doi.org/10.1371/journal.pbio.2002925
Development of GEOME has been supported by the Gordon and Betty Moore Foundation, the National Science Foundation (DEB-0956426 and DEB-1457848), the Berkeley Natural History Museums, and contracts from the Smithsonian National Museum of Natural History.