Author Taverna, Nicole, author.

Title Data curation for machine learning applied to geothermal power plant operational data for GOOML: Geothermal Operational Optimization with Machine Learning : preprint / Nicole Taverna [and thirteen others].

Publication Info. Golden, CO : National Renewable Energy Laboratory, 2022.

Description 1 online resource (10 pages) : color illustration.
text txt rdacontent
computer c rdamedia
online resource cr rdacarrier
Series NREL/CP ; 6A20-81649
Conference paper (National Renewable Energy Laboratory (U.S.)) ; 6A20-81649.
Note "January 2022."
"Presented at the 47th Stanford Geothermal Workshop Stanford, California February 7-9, 2022"--Cover.
Bibliography Includes bibliographical references.
Funding DE-AC36-08GO28308
Note Description based on online resource; title from PDF title page (NREL, viewed Dec. 19, 2022).
Summary Geothermal Operational Optimization with Machine Learning (GOOML) is a transferable and extensible component-based geothermal asset modeling framework that considers complex steamfield relationships and identifies optimization prospects using a data-driven approach to physics-guided, data-centric machine learning. This framework has been used to develop digital twins that provide steamfield operators with operational environments to analyze and understand historical and forecasted power production, explore new steamfield configuration possibilities, and seek optimal asset management in real world applications. To create, test, and apply the GOOML framework, diverse time-series datasets spanning multiple years were sourced from various geothermal power plant components within several complex real-world geothermal operations. These operations are based in the United States and New Zealand and include a variety of technologies, end-uses and configurations, collectively covering nearly all relevant operating conditions for modern geothermal fields. Datasets were acquired from multiple sources to ensure that machine learning experiments generalized properly to various operating conditions. It was found that the data varied in quality, format, and completeness. To ensure consistency between the various datasets, a standardized data curation process was developed to reliably streamline data preparation. 
This paper will discuss best practices as learned from the GOOML data curation process which takes the following steps: 1) acquisition of large quantities of data from power plant operators, 2) digestion of data to gain an initial understanding of what is included, 3) data transformation, which includes converting the data into a standardized machine-readable format so that they can be visualized, quality checked, and cleaned, 4) quality assurance and quality control, involving identification of significant data gaps and apparent anomalies through mapping of data features to real world componentry via the GOOML historical model, followed by discussion with modelers and power plant operators to identify additional data needs and to resolve issues, 5) use in machine learning algorithms, and 6) repetition of steps one through five until all data needs are met and data are deemed suitable for producing trustworthy modeling results which may be disseminated, ideally along with the curated dataset. This iterative process is focused on improving the quality of the data rather than tuning machine learning model parameters and supports a shift towards data-centric AI as a means to improving real-world applicability of geothermal machine learning projects.
Subject Data curation.
Machine learning.
Geothermal power plants.
Édition de contenu.
Apprentissage automatique.
Centrales géothermiques.
geothermal power plants.
Data curation
Geothermal power plants
Machine learning
Indexed Term access
accessibility
collaboration
curation
data
data curation
data pipeline
data-centric AI
discoverability
dissemination
DOE
GDR
geothermal
machine learning
open
OpenEI
pipeline
power plant operations
standards
storage
transfer
translation
usability
Added Author National Renewable Energy Laboratory (U.S.), issuing body.
Standard No. 1843837 OSTI ID
0000-0001-7776-2028
0000-0002-6833-3210
0000-0002-9135-655X
Gpo Item No. 0430-P-04 (online)
Sudoc No. E 9.17:NREL/CP-6A20-81649

 
    