Real-Time Big Data Platform for Distributed Energy Load Forecasting with Computing Approaches

Zainab, Ameema

dc.contributor.advisor	Ghrayeb, Ali
dc.contributor.advisor	Abu-Rub, Haithem
dc.creator	Zainab, Ameema
dc.date.accessioned	2022-07-27T16:39:29Z
dc.date.available	2023-12-01T09:21:47Z
dc.date.created	2021-12
dc.date.issued	2021-11-24
dc.date.submitted	December 2021
dc.identifier.uri	https://hdl.handle.net/1969.1/196325
dc.description.abstract	The proliferation of smart meters in power grids has produced an explosion of large energy datasets. Processing such big data is challenging and usually takes longer than a short-term load forecast allows. In the era of big data, where information is one of the key factors in making decisions, this study draws attention to the need for data management in smart grids. For a utility to plan resources accurately and balance electricity supply and demand, accurate and timely forecasting is required. Machine learning algorithms have been applied intensively to load forecasting to obtain better accuracy than traditional statistical methods. However, with the huge increase in data size, sophisticated algorithms are needed, and these require big data platforms with adequate computational resources. Effective use of the available computational resources can be attained by maximizing the utilization of the computational nodes of a big data platform. Parallel computing is needed for optimal resource utilization when dealing with smart grid big data. This research addresses these concerns by deploying parallel computing capabilities to minimize execution time while maintaining highly accurate load forecasting models. This work uses multi-node and multi-core processing, mapping simultaneous jobs to available processors, to minimize the overall execution time of the forecasting models while ensuring acceptable accuracy. The results demonstrate the efficacy of the proposed approach through real-time adoption of machine learning (ML) models, reduced execution time, and enhanced scalability. This research shows that tree-based models outperformed the other models, achieving a tradeoff between model accuracy and execution time.
The proposed approach is validated on real big data provided by Iberdrola, a Spanish utility company. The data is acquired from one hundred thousand different data sources in the electrical distribution system and amounts to approximately 2.2 billion records. To enhance the analysis further, a master-slave parallel computing paradigm for load forecasting is deployed and experimentally verified. The work proposes a concurrent job scheduling algorithm in a multi-energy data source environment using Apache Spark. An efficient resource utilization strategy is developed for optimizing multiple Spark jobs to reduce job completion time. A clustering method is implemented to group the electrical distribution nodes into clusters, which reduces the number of required forecasting models and, in turn, the computational time.
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.subject	Big Data
dc.subject	Load Forecasting
dc.subject	Machine Learning
dc.title	Real-Time Big Data Platform for Distributed Energy Load Forecasting with Computing Approaches
dc.type	Thesis
thesis.degree.department	Electrical and Computer Engineering
thesis.degree.discipline	Electrical Engineering
thesis.degree.grantor	Texas A&M University
thesis.degree.name	Doctor of Philosophy
thesis.degree.level	Doctoral
dc.contributor.committeeMember	Bouhali, Othmane
dc.contributor.committeeMember	Masad, Eyad
dc.contributor.committeeMember	Serpedin, Erchin
dc.contributor.committeeMember	Xie, Le
dc.type.material	text
dc.date.updated	2022-07-27T16:39:30Z
local.embargo.terms	2023-12-01
local.etdauthor.orcid	0000-0002-3754-4162

Files in this item

Name: ZAINAB-DISSERTATION-2021.pdf
Size: 2.155Mb
Format: PDF


This item appears in the following Collection(s)

Electronic Theses, Dissertations, and Records of Study (2002– )
Texas A&M University Theses, Dissertations, and Records of Study (2002– )
