The Bakken shale formation is one of the largest oil-producing formations in the United States, and its operations produce significant volumes of formation water. This produced water (PW) is a byproduct of oil and gas extraction and can be hypersaline, with up to ten times the salinity of seawater. Produced water quality varies widely; no two water sources have the same physicochemical properties. The high concentrations of anions and cations in PW can precipitate as scale in the reservoir during recovery processes because of changes in pressure and pH. Because inorganic scale composition changes continuously with salt concentration, predicting scale formation remains a major challenge in the crude oil industry. Machine learning (ML) techniques offer a cost-effective option and have demonstrated the capability to predict scale formation by analyzing the intricate relationships between water chemistry, pressure, and pH. This predictive power can help anticipate scale-related issues and inform the selection of optimal treatment technologies, making ML a valuable tool for improving the efficiency and sustainability of produced water management in the oil industry.
In this study, we aim to aid the selection of produced water treatment technology by identifying which solids are likely to form from produced water at different ion concentrations and pH values. We trained Random Forest (RF), Multiple Linear Regression (MLR), and Extreme Gradient Boosting (XGBoost) models to predict the saturation index of scales, using a database of 2313 PW quality data points from different locations in the Bakken Shale area. The database includes pH, TDS, ICP values for different inorganic ions, and the saturation index of potential scales, which together allow scale formation in produced water samples to be investigated. The significance of this study lies in deploying the model on a static website, with no server-side computation required. This approach shifts the computational load onto the website visitor's device, eliminating the need for installation or for licenses for other computational software.
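The saturation index that serves as the prediction target is conventionally defined as SI = log10(IAP / Ksp), where IAP is the ion activity product and Ksp is the solubility product of the mineral; a positive value indicates supersaturation, so scale may form. The abstract does not state how the SI values in the database were computed, so the following is only a minimal sketch of that definition. The function names, the numeric inputs, and the Ksp value are hypothetical and chosen purely for illustration. Ion activities are assumed to be given; in hypersaline brines they can differ substantially from concentrations.

```python
import math


def saturation_index(ion_activity_product: float, ksp: float) -> float:
    """Return SI = log10(IAP / Ksp) for one mineral.

    Both arguments must be positive. Activities are assumed to be
    already computed (an assumption of this sketch).
    """
    if ion_activity_product <= 0 or ksp <= 0:
        raise ValueError("IAP and Ksp must be positive")
    return math.log10(ion_activity_product / ksp)


def scale_tendency(si: float, tolerance: float = 0.0) -> str:
    """Classify a saturation index into a qualitative scaling tendency."""
    if si > tolerance:
        return "supersaturated"   # precipitation is thermodynamically favored
    if si < -tolerance:
        return "undersaturated"   # the mineral would tend to dissolve
    return "equilibrium"


# Hypothetical inputs, not taken from the study's database.
cation_activity = 2.0e-4
anion_activity = 5.0e-5
ksp_example = 1.0e-9  # illustrative solubility product, not a real mineral

iap = cation_activity * anion_activity
si = saturation_index(iap, ksp_example)
print(f"SI = {si:.2f} ({scale_tendency(si)})")
```

A regression model such as those named above would take pH, TDS, and ion concentrations as inputs and return an SI value per mineral, which could then be passed to a classifier like `scale_tendency` to flag likely scales.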