Data Reconstruction for Groundwater Wells Proximal to Lakes: A Quantitative Assessment for Hydrological Data Imputation

dc.authorid: 0000-0002-1693-5877
dc.authorid: 0000-0002-4767-6660
dc.authorid: 0000-0003-0559-5261
dc.contributor.author: Can, Murat
dc.contributor.author: Vaheddoost, Babak
dc.contributor.author: Safari, Mir Jafar Sadegh
dc.date.accessioned: 2026-02-08T15:16:03Z
dc.date.available: 2026-02-08T15:16:03Z
dc.date.issued: 2025
dc.department: Bursa Teknik Üniversitesi
dc.description.abstract: The reconstruction of missing groundwater level data is of great importance in hydrogeological and environmental studies. This study provides a comprehensive, sequential approach for reconstructing groundwater level data near Lake Uluabat in Bursa, Turkey, and addresses missing data reconstruction for both past and future events using the Gradient Boosting Regression (GBR) model. The reconstruction process is evaluated through model calibration metrics and through changes in the statistical properties of the observed and reconstructed time series. To this end, the groundwater time series from two observation wells and the lake water levels for the period January 2004 to September 2019 are used. The lake water level, the four seasons (encoded with three dummy variables), and time are used as inputs for predicting groundwater levels in the observation wells. The GBR model is calibrated on a training dataset selected according to the data gaps in the time series, while test-past and test-future datasets are used for model validation. The GBR models are then used to reconstruct the missing data in both the pre- and post-training datasets, and model performance is evaluated via the Nash-Sutcliffe efficiency (NSE), Root Mean Square Percentage Error (RMSPE), and Performance Index (PI). The statistical properties of the time series, including the probability distribution, maxima, minima, quartiles (Q1-Q3), standard error (SE), coefficient of variation (CV), entropy (H), and error propagation, are also measured. It is concluded that GBR provides a good basis for missing data reconstruction (the best performance was as high as NSE: 0.99, RMSPE: 0.36, and PI: 1.002). In particular, the standard error and the entropy of the system in one case rose by 53% and 35%, respectively, which was found to be tolerable and negligible.
dc.identifier.doi: 10.3390/w17050718
dc.identifier.issn: 2073-4441
dc.identifier.issue: 5
dc.identifier.scopus: 2-s2.0-86000664696
dc.identifier.scopusquality: Q1
dc.identifier.uri: https://doi.org/10.3390/w17050718
dc.identifier.uri: https://hdl.handle.net/20.500.12885/6097
dc.identifier.volume: 17
dc.identifier.wos: WOS:001442245700001
dc.identifier.wosquality: Q2
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: MDPI
dc.relation.ispartof: Water
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.snmz: WOS_KA_20260207
dc.subject: distribution changes
dc.subject: entropy
dc.subject: gradient boosting regression
dc.subject: groundwater level
dc.subject: Lake Uluabat
dc.title: Data Reconstruction for Groundwater Wells Proximal to Lakes: A Quantitative Assessment for Hydrological Data Imputation
dc.type: Article
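
The abstract describes the model inputs (lake water level, four seasons encoded with three dummy variables, and time) and two of the evaluation metrics (NSE and RMSPE). The following is a minimal sketch of how such inputs and metrics might be computed, not the authors' implementation. The month-to-season mapping, the choice of winter as the baseline season, and all function names are assumptions made for illustration; the metric formulas are the commonly used definitions and may differ in detail from those in the article. The Performance Index (PI) is omitted because the record does not define it.

```python
import math

# Assumed meteorological seasons; the record does not state the season boundaries used.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def season_dummies(month):
    """Encode four seasons with three 0/1 dummies (winter is the assumed baseline)."""
    season = SEASON_BY_MONTH[month]
    return [int(season == name) for name in ("spring", "summer", "autumn")]


def build_features(lake_level, month, time_index):
    """Hypothetical feature row: lake level, three seasonal dummies, time index."""
    return [lake_level, *season_dummies(month), time_index]


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations from the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / spread


def rmspe(observed, simulated):
    """Root mean square percentage error, in percent (observed values must be nonzero)."""
    rel_sq = [((o - s) / o) ** 2 for o, s in zip(observed, simulated)]
    return 100.0 * math.sqrt(sum(rel_sq) / len(rel_sq))
```

Rows produced by `build_features` could then be passed to any gradient boosting regressor (for example, scikit-learn's `GradientBoostingRegressor`); the record does not say which implementation the study used.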
