6 datasets

6.1 alaska_lake_data

The Alaska Lake Data was collected as part of a water quality monitoring initiative across various lakes in protected national parks, aimed at assessing the chemical composition and environmental conditions of these unique ecosystems. Researchers took water samples from several lakes, measuring key environmental parameters like water temperature and pH, along with analyzing the abundance of different chemical elements, such as carbon, nitrogen, and phosphorus. By comparing the concentrations of both bound and free elements, the study aimed to understand the health of these aquatic environments and the impact of natural and anthropogenic factors on the water chemistry. The dataset will be used to inform conservation strategies for maintaining the ecological balance in these sensitive regions.

  • lake (Categorical): The name of the lake from which the water sample was collected. This refers to the sample.
  • park (Categorical): The park or national park code where the lake is located. This is part of the sample identification.
  • water_temp (Continuous): The water temperature (in degrees Celsius) of the sample at the time of collection. This is an analyte describing an environmental condition of the sample.
  • pH (Continuous): The pH value of the water, representing its acidity or alkalinity. This is an analyte providing an environmental characteristic of the sample.
  • element (Categorical): The chemical element being measured in the water (e.g., C for carbon, N for nitrogen). This is an analyte.
  • mg_per_L (Continuous): The concentration (in milligrams per liter) of the corresponding analyte from the element column, indicating the abundance of each analyte in the water sample.
  • element_type (Categorical): Describes whether the element is in a “bound” or “free” state, providing context for the form of the analyte.
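
Because each row pairs one element with one concentration, the table is in a long format: a lake appears once per element measured. The sketch below shows one way to collect those rows into a per-lake summary. The rows, lake names, park codes, and values are invented for illustration, the helper `pivot_elements` is hypothetical, and the code assumes at most one measurement per lake and element.

```python
from collections import defaultdict

# Hypothetical rows mirroring the columns described above. All values are
# invented for illustration and are not taken from the dataset.
rows = [
    {"lake": "Lake_A", "park": "PARK1", "water_temp": 6.5, "pH": 7.7,
     "element": "C", "mg_per_L": 3.4, "element_type": "bound"},
    {"lake": "Lake_A", "park": "PARK1", "water_temp": 6.5, "pH": 7.7,
     "element": "N", "mg_per_L": 0.03, "element_type": "bound"},
    {"lake": "Lake_B", "park": "PARK2", "water_temp": 11.2, "pH": 6.9,
     "element": "C", "mg_per_L": 5.1, "element_type": "bound"},
]


def pivot_elements(rows):
    """Collect one {element: mg_per_L} mapping per lake.

    Assumes each lake/element pair occurs once; a repeated pair
    would overwrite the earlier value.
    """
    wide = defaultdict(dict)
    for row in rows:
        wide[row["lake"]][row["element"]] = row["mg_per_L"]
    return dict(wide)


wide = pivot_elements(rows)
print(wide["Lake_A"])
```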

6.2 algae_data

This dataset was generated as part of a study investigating the biochemical composition of different algae strains under varying harvesting conditions. The goal of the research was to examine how different algae strains and harvesting regimes affect the abundance of various chemical species, particularly fatty acids and amino acids, which have potential applications in biofuel production and nutritional supplements. Replicates were performed to ensure consistency, and a wide range of chemical species was measured to provide insights into the algae’s metabolic profile and its response to environmental or harvesting changes.

  • replicate (Categorical): The replicate number of the sample for the experiment, indicating which iteration of the algae sample was analyzed.
  • algae_strain (Categorical): The specific strain of algae used in the experiment (e.g., “Tsv1”). This refers to the strain from which each sample was collected.
  • harvesting_regime (Categorical): The method or condition under which the algae sample was harvested (e.g., “Heavy” regime).
  • chemical_species (Categorical): The type of chemical species or analyte measured in the algae sample, including various fatty acids (FAs) and amino acids (AAs).
  • abundance (Continuous): The measured abundance of the chemical species or analyte in the algae sample, expressed in a continuous quantitative form (e.g., mg/L or similar units).
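
Since replicates share the same strain, harvesting regime, and chemical species, a common first step is to average abundance across them. The following is a minimal sketch of that idea, not the study's own procedure. The rows and the species names `FA_example` and `AA_example` are placeholders invented for illustration, and `mean_abundance` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows mirroring the columns described above. The species
# names and abundance values are placeholders, not values from the dataset.
rows = [
    {"replicate": "1", "algae_strain": "Tsv1", "harvesting_regime": "Heavy",
     "chemical_species": "FA_example", "abundance": 10.0},
    {"replicate": "2", "algae_strain": "Tsv1", "harvesting_regime": "Heavy",
     "chemical_species": "FA_example", "abundance": 12.0},
    {"replicate": "1", "algae_strain": "Tsv1", "harvesting_regime": "Heavy",
     "chemical_species": "AA_example", "abundance": 4.0},
]


def mean_abundance(rows):
    """Average abundance over replicates for each strain/regime/species."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["algae_strain"], row["harvesting_regime"],
               row["chemical_species"])
        groups[key].append(row["abundance"])
    return {key: mean(values) for key, values in groups.items()}


means = mean_abundance(rows)
print(means[("Tsv1", "Heavy", "FA_example")])
```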

6.3 beer_components

This dataset captures the volatile compounds released from different ingredients like barley and corn, likely as part of a study on food aroma profiles. Researchers measured the abundance of specific analytes (such as 2-Methylpropanal) and classified them by chemical group (e.g., Aldehydes). The goal is to assess how different ingredients contribute to the overall aroma by linking each analyte to sensory descriptors, which include odor characteristics such as “Green,” “Pungent,” and “Malty.” The dataset could be useful in food science research.

  • ingredient (Categorical): The ingredient from which the analytes were measured (e.g., “barley,” “corn”). This refers to the ingredient from which the sample was collected.
  • replicate (Categorical): The replicate number of the sample for the experiment, indicating the repetition of the measurement for consistency.
  • analyte (Categorical): The specific volatile compound or chemical measured in the ingredient (e.g., “2-Methylpropanal”).
  • analyte_class (Categorical): The chemical classification of the analyte (e.g., “Aldehyde”).
  • abundance (Continuous): The concentration of the analyte measured in the sample, likely in a quantitative unit such as mg/L.
  • analyte_odor (Categorical): A sensory descriptor for the odors associated with the analyte, listed as a combination of descriptors (e.g., “Green; Pungent; Burnt; Malty; Toasted”).
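
The analyte_odor column packs several descriptors into one string, so linking analytes to individual odors requires splitting it first. The sketch below assumes the descriptors are separated by semicolons, as in the example above. The helper `split_odors` is hypothetical, and the second odor string is invented for illustration.

```python
from collections import Counter


def split_odors(analyte_odor):
    """Split a combined odor string into a list of trimmed descriptors."""
    return [part.strip() for part in analyte_odor.split(";") if part.strip()]


# The first string is the example quoted in the column description; the
# second is invented for illustration.
odor_strings = [
    "Green; Pungent; Burnt; Malty; Toasted",
    "Malty; Sweet",
]

descriptors = split_odors(odor_strings[0])

# Count how often each descriptor occurs across all odor strings.
counts = Counter(d for s in odor_strings for d in split_odors(s))
print(descriptors)
print(counts["Malty"])
```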

6.4 hawaii_aquifers

This dataset contains water quality measurements from wells within an aquifer system, collected as part of a study on groundwater composition. Researchers measured the abundance of dissolved elements and compounds, such as silica (SiO2) and chloride (Cl), in water from different wells. The dataset could be used to assess the chemical profile of groundwater and monitor changes in water quality over time. Note that geospatial data (latitude and longitude) are missing for certain samples, and that some samples come from the same well and aquifer but have different latitude and longitude coordinates.

  • aquifer_code (Categorical): The code assigned to identify the aquifer system (e.g., “aquifer_1”) that the sample came from.
  • well_name (Categorical): The name of the well from which the water sample was collected (e.g., “Alewa_Heights_Spring”).
  • longitude (Continuous): The longitude of the well location from which the sample was taken.
  • latitude (Continuous): The latitude of the well location from which the sample was taken.
  • analyte (Categorical): The specific dissolved compound or element measured in the water sample (e.g., “SiO2,” “Cl”).
  • abundance (Continuous): The concentration of the analyte in the water sample, expressed in a quantitative unit (mg/L).
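
The two coordinate issues noted in the description (missing coordinates, and one well listed with more than one coordinate pair) can be checked before any mapping or spatial analysis. The sketch below shows one way to do so. The rows, well names, and values are invented for illustration, both helper functions are hypothetical, and the code assumes missing coordinates are represented as `None`, which the dataset description does not specify.

```python
from collections import defaultdict

# Hypothetical rows mirroring the columns described above. Well names and
# values are invented; None stands in for a missing coordinate (an assumption).
rows = [
    {"aquifer_code": "aquifer_1", "well_name": "Well_X",
     "longitude": None, "latitude": None, "analyte": "SiO2", "abundance": 1.0},
    {"aquifer_code": "aquifer_1", "well_name": "Well_Y",
     "longitude": -157.9, "latitude": 21.3, "analyte": "Cl", "abundance": 2.0},
    {"aquifer_code": "aquifer_1", "well_name": "Well_Y",
     "longitude": -157.8, "latitude": 21.4, "analyte": "SiO2", "abundance": 3.0},
]


def missing_coordinates(rows):
    """Return the rows that lack a longitude or a latitude."""
    return [r for r in rows
            if r["longitude"] is None or r["latitude"] is None]


def wells_with_conflicting_coordinates(rows):
    """Return (aquifer_code, well_name) pairs listed with more than one
    distinct (longitude, latitude) pair."""
    coords = defaultdict(set)
    for r in rows:
        if r["longitude"] is None or r["latitude"] is None:
            continue  # skip rows without usable coordinates
        key = (r["aquifer_code"], r["well_name"])
        coords[key].add((r["longitude"], r["latitude"]))
    return sorted(key for key, pairs in coords.items() if len(pairs) > 1)


missing = missing_coordinates(rows)
conflicts = wells_with_conflicting_coordinates(rows)
print(len(missing), conflicts)
```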

6.5 hops_components

This dataset contains detailed information on various hop varieties used in brewing, including their country of origin, brewing usage (aroma or bittering), and the chemical composition of their essential oils and acids. The dataset serves as a resource for brewers to select hop varieties based on their aroma profiles and chemical content, such as alpha acids and essential oils like humulene and myrcene, which influence the bitterness, flavor, and aroma of beer. The goal of this dataset is to help optimize the hop selection process in brewing for specific flavor profiles and brewing techniques like dry hopping or bittering.

  • hop_variety (Categorical): The specific variety of hop used (e.g., “Cascade,” “Chinook”).
  • hop_origin (Categorical): The country of origin for the hop variety (e.g., “USA,” “England”).
  • hop_brewing_usage (Categorical): The primary use of the hop in brewing: aroma, bittering, or a technique such as dry hopping.
  • hop_aroma (Categorical): The sensory description of the hop’s aroma profile, which can include terms like “floral,” “citrus,” or “spicy.”
  • total_oil (Continuous): The total essential oil content of the hop, typically measured in milliliters per 100 grams.
  • alpha_acids (Continuous): The percentage of alpha acids in the hop, which contribute to the bitterness of the beer.
  • beta_acids (Continuous): The percentage of beta acids in the hop, which also contribute to the bitterness but degrade more slowly over time.
  • humulene (Continuous): A compound contributing to the hop’s woody, earthy aroma.
  • myrcene (Continuous): A compound that contributes to the hop’s citrus and floral aroma.
  • humulone (Continuous): An alpha acid found in hop resin, related to bitterness.
  • caryophyllene (Continuous): A compound contributing to spicy, peppery aromas.
  • farnesene (Continuous): A compound contributing to green, woody, or fruity aromas.
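
Selecting hops by chemical content and aroma, as described above, amounts to filtering rows on a numeric column and a text column. The following is a minimal sketch under stated assumptions: the variety names and all values are invented for illustration, `select_hops` is a hypothetical helper, and aroma matching is a simple case-insensitive substring test.

```python
# Hypothetical rows using a subset of the columns described above. Variety
# names and values are invented and do not describe real hops.
hops = [
    {"hop_variety": "Variety_A", "hop_aroma": "floral, spicy",
     "alpha_acids": 5.0},
    {"hop_variety": "Variety_B", "hop_aroma": "citrus, floral",
     "alpha_acids": 12.0},
    {"hop_variety": "Variety_C", "hop_aroma": "citrus",
     "alpha_acids": 6.5},
]


def select_hops(rows, min_alpha_acids, aroma_term):
    """Return varieties with at least min_alpha_acids percent alpha acids
    whose aroma description contains aroma_term (case-insensitive)."""
    term = aroma_term.lower()
    return [r["hop_variety"] for r in rows
            if r["alpha_acids"] >= min_alpha_acids
            and term in r["hop_aroma"].lower()]


selected = select_hops(hops, 8.0, "citrus")
print(selected)
```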