Academic Hub Datasets


Deschutes Brewery Dataset: documentation

Quickstart Notebook: preview - download

Deschutes Brewery, the 10th largest craft brewer in the US, kindly shares its process data through our hub. In summary, this dataset has:
  • 39 fermenter vessels with status information and up to 3 temperature control zones
  • 13 bright tanks
  • Data from January 2017 up to May 2020

Exercise and Solution Notebooks for Brewery Dataset

  • Exercise 1 - Apparent Degree of Fermentation learning objectives:
    Students will analyze the Apparent Degree of Fermentation (ADF), a critical process parameter that tells brewers how far a batch has fermented over time. Brewers use this parameter to decide when to shift from the fermentation phase to the free rise phase. Through this exercise, students will learn how to build a predictive linear model and a predictive piecewise linear model of ADF.
  • Exercise 2 - Beer Cooling Prediction learning objectives:
    A consistent cooling temperature profile for every batch is directly related to the quality of the beer produced. During the cooling phase of the brewing process, the temperature of the solution in the fermenter drops from 70°F to 30°F. In the Beer Cooling Prediction learning module, students will visualize and analyze the data and build a predictive model of the cooling temperature.
  • Exercise 3 - Principal Component Analysis learning objectives:
    Principal Component Analysis (PCA) is a statistical technique that reduces the dimensionality of large datasets to a few principal components that represent the data. In this learning module, students use PCA to identify the anomalous production batch and the factors contributing to its deviation.
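
As a rough illustration of the modelling in Exercise 1, the sketch below computes ADF from specific gravity using the standard definition ADF = 100 × (OG − SG) / (OG − 1), then fits one straight line and a two-segment piecewise line by least squares. It is not taken from the exercise notebooks: the function names, the sample readings, and the choice of an exhaustive breakpoint search with independent (not necessarily continuous) segments are all assumptions made for illustration.

```python
def adf_percent(original_gravity, current_gravity):
    """Apparent Degree of Fermentation in percent, from specific gravities."""
    if original_gravity <= 1.0:
        raise ValueError("original gravity must exceed 1.0")
    return 100.0 * (original_gravity - current_gravity) / (original_gravity - 1.0)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sse(xs, ys, slope, intercept):
    """Sum of squared residuals of a fitted line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def fit_two_segments(xs, ys):
    """Try every split with at least two points per side; keep the lowest total error.

    Returns (total_error, first_x_of_second_segment, left_fit, right_fit).
    Needs at least four points.
    """
    best = None
    for k in range(2, len(xs) - 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        error = sse(xs[:k], ys[:k], *left) + sse(xs[k:], ys[k:], *right)
        if best is None or error < best[0]:
            best = (error, xs[k], left, right)
    return best


# Hypothetical readings for one batch (not from the Deschutes dataset).
hours = [0, 12, 24, 36, 48, 60, 72]
gravity = [1.050, 1.040, 1.030, 1.020, 1.012, 1.011, 1.010]
adf = [adf_percent(gravity[0], g) for g in gravity]

single_fit = fit_line(hours, adf)
error, break_hour, left_fit, right_fit = fit_two_segments(hours, adf)
print("single line (slope, intercept):", single_fit)
print("segment boundary at hour:", break_hour)
```

With readings shaped like these, a single line underestimates the early rise and overestimates the plateau, which is why a piecewise model is a natural second step.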

AGL Wind Farms Dataset: documentation

Quickstart Notebook: preview - download

AGL, the largest fully integrated energy and telecommunication company in Australia, with generation assets totalling over 11 GW of capacity (approximately 20% of the total generation capacity of Australia's National Electricity Market), kindly shared a substantial subset of its wind turbine data:
  • 5 clusters of 10 wind turbines each (50 turbines total)
  • 13 different sensors per turbine, the main categories being:
    • Temperature: outside, 3 drivetrains and nacelle (℃)
    • Power: instant power sent to the grid, updated every 2 seconds (kW)
    • Speed: rotor (in RPM) and wind (m/s)
    • Angular: pitch, relative wind direction and yaw (degrees)
  • 2 years of data covering the 2018-2019 period
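
Since power is reported in kW every 2 seconds, the energy delivered over a window can be approximated by treating each sample as constant over its 2-second period. The sketch below shows that arithmetic only. The function name and sample values are hypothetical, and it assumes the samples are contiguous with no gaps.

```python
SAMPLE_PERIOD_S = 2.0  # power readings are updated every 2 seconds


def energy_kwh(power_kw_samples, sample_period_s=SAMPLE_PERIOD_S):
    """Approximate energy in kWh from evenly spaced power samples in kW."""
    # kW * seconds gives kJ; dividing by 3600 converts to kWh.
    return sum(power_kw_samples) * sample_period_s / 3600.0


# Hypothetical: one hour of readings (1800 samples) at a steady 1800 kW.
one_hour = [1800.0] * 1800
print(energy_kwh(one_hour), "kWh")
```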

UC Davis Campus Energy Dataset: documentation

Quickstart Notebook: preview - download

UC Davis has one of the largest university smart campuses monitored with OSIsoft software. One main goal of leveraging this real-time data is to empower students to drive green behavior. In summary, this dataset includes:
  • Data from 168 buildings
  • All buildings have electricity data (demand, usage, annual cost) and optionally chilled water and steam
  • Building metadata: primary usage, gross square feet, latitude/longitude, utility rate
  • Data from January 2017 up to January 2020

Del Mar College Pilot Plant Dataset: documentation

Quickstart Notebook: preview - download

Del Mar College is a workforce development college located in Corpus Christi, Texas, which has developed a curriculum focused on real-world, hands-on experience using a glycol pilot plant located on campus. In summary, this dataset includes:
  • Two Hands-On Training units (HOT-1 and HOT-3), including a distillation column
  • 76 data streams in total, including 11 streams with digital states for Running Mode, Running Status and Alarms
  • Data from October 13, 2020 up to now

Hub Python Library Quickstart Notebook: preview - download

This Jupyter Python notebook introduces the Academic Hub Python Library, an asset-centric and generic module to access all published datasets in the same manner. It demonstrates the sequence of steps to:
  • Log in and authenticate with OCS
  • See available datasets and their assets with descriptions
  • Get the data views associated with an asset
  • Request time-aligned and interpolated data for an asset and its metadata
For a pure Python (version >= 3.6) script that runs without Jupyter, click this link.
Before running the script, execute: pip install ocs_academic_hub
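
To make "time-aligned and interpolated data" concrete, here is a minimal standard-library sketch of the idea: several streams sampled at different times are resampled onto one shared time grid by linear interpolation. This does not use or represent the ocs_academic_hub API. The function names, stream names, and sample values are hypothetical, and linear interpolation is an assumption about the method.

```python
from bisect import bisect_left


def interpolate_at(times, values, t):
    """Linearly interpolate a stream at time t; times must be sorted ascending."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t is outside the sampled range")
    i = bisect_left(times, t)
    if times[i] == t:
        return values[i]  # exact sample, no interpolation needed
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def align(streams, start, end, step):
    """Resample every stream onto the grid start, start + step, ..., up to end."""
    rows = []
    for t in range(start, end + 1, step):
        row = {"time": t}
        for name, (times, values) in streams.items():
            row[name] = interpolate_at(times, values, t)
        rows.append(row)
    return rows


# Hypothetical streams with different sampling times (seconds, value).
streams = {
    "temperature": ([0, 10, 20], [70.0, 60.0, 50.0]),
    "pressure": ([0, 5, 15, 20], [1.0, 2.0, 4.0, 5.0]),
}

for row in align(streams, start=0, end=20, step=5):
    print(row)
```

Each output row carries one value per stream at the same timestamp, which is the shape that makes side-by-side analysis of an asset's sensors straightforward.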