Researchers from Korea University have developed a machine learning framework that predicts solar cell efficiency based on wafer quality, enabling early screening of wafers and optimized production paths. Using more than 100,000 industrial data points, the approach combines predictive modeling, process optimization and explainable AI to support photovoltaic manufacturing.
Researchers from Korea University have developed a machine-learning model that reportedly predicts cell efficiency from wafer quality.
“We developed this industrial data-driven machine learning framework that uses more than 100,000 solar cell data points collected directly from a real mass production line,” Seungtae Lee, the lead author of the study, told pv magazine. “The goal is to enable data-driven decision-making and intelligent automation in photovoltaic production.”
“While interest in applying artificial intelligence (AI) to manufacturing has grown rapidly, practical implementations in photovoltaic manufacturing remain limited. By directly utilizing large-scale industrial data, our work demonstrates how machine learning can support autonomous decision-making for smart factories, while retaining human interpretability and operator engagement, in line with the human-centric vision of Industry 5.0.”
The proposed approach rests on three components: predicting final solar cell efficiency solely from wafer inspection quality data using machine learning models, which enables early screening of wafers before fabrication; identifying wafer-specific optimal equipment routes, also called “golden paths”, through optimization algorithms to improve production yield and efficiency, especially for low-performance samples; and improving interpretability through feature importance and SHapley Additive exPlanations (SHAP) analyses, allowing engineers to understand the relationship between process variables and performance results.
The framework enables accurate screening of wafers before further processing. Process path optimization is performed using the Tree-structured Parzen Estimator (TPE), a Bayesian optimization algorithm that efficiently tunes machine-learning hyperparameters and automatically identifies optimal model settings without extensive testing.
The framework also uses the Extremely Randomized Trees (ET) model, an ensemble algorithm for regression and classification, as the objective function.
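The route search described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the step names, tool counts, and the synthetic efficiency formula are all assumptions, and plain exhaustive enumeration stands in for TPE because the toy route space holds only 12 combinations. With many steps and tools, a TPE sampler would propose routes instead of the `itertools.product` loop, with the same surrogate prediction as its objective.

```python
import itertools

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Hypothetical process steps and the number of interchangeable tools at each step.
STEPS = {"wet_bench": 3, "diffusion": 2, "pecvd": 2}


def encode(quality, route):
    """Concatenate wafer-quality features with a one-hot encoding of the route."""
    onehots = [np.eye(n_tools)[tool] for n_tools, tool in zip(STEPS.values(), route)]
    return np.concatenate([quality, *onehots])


def synthetic_line(n=3000, seed=0):
    """Generate made-up production records; the formula below is an assumption."""
    rng = np.random.default_rng(seed)
    wet_gain = np.array([0.0, 0.2, 0.6])  # assumed efficiency gain per wet-bench tool
    X, y = [], []
    for _ in range(n):
        quality = rng.uniform(0.0, 0.3, size=2)  # two defect fractions
        route = tuple(int(rng.integers(0, k)) for k in STEPS.values())
        eff = (19.0 - 5.0 * quality[0] - 2.0 * quality[1]
               + wet_gain[route[0]] + 0.05 * route[1] + rng.normal(0.0, 0.02))
        X.append(encode(quality, route))
        y.append(eff)
    return np.array(X), np.array(y)


def golden_path(model, quality):
    """Return the route with the highest predicted efficiency for one wafer."""
    routes = list(itertools.product(*(range(k) for k in STEPS.values())))
    candidates = np.array([encode(quality, r) for r in routes])
    preds = model.predict(candidates)
    best = int(np.argmax(preds))
    return routes[best], float(preds[best])


X, y = synthetic_line()
# The Extremely Randomized Trees surrogate plays the role of the objective function.
surrogate = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(X, y)
route, predicted = golden_path(surrogate, quality=np.array([0.25, 0.15]))
```

Here `route` is a tuple of tool indices, one per step, chosen for a single low-quality wafer. The surrogate only ranks routes it has seen evidence for, so in practice the search would be restricted to equipment combinations that actually occur in the production data.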
The study used a dataset of more than 100,000 samples from a PERC solar cell production line that used multicrystalline silicon wafers. Aggressive outlier removal was applied via k-means clustering, an unsupervised algorithm that groups data points into clusters based on similarity, combined with efficiency-based filtering to improve data quality.
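One plausible reading of this cleaning step is sketched below. The article does not say how the clusters were used, so the rule here (drop samples that fall into very small clusters, then drop samples outside an efficiency window) is an assumption, as are the function name `keep_mask`, the cluster count, and the thresholds.

```python
import numpy as np
from sklearn.cluster import KMeans


def keep_mask(X, efficiency, n_clusters=4, min_cluster_frac=0.05,
              eff_range=(15.0, 22.0), seed=0):
    """Return a boolean mask marking the samples to keep."""
    X = np.asarray(X, dtype=float)
    efficiency = np.asarray(efficiency, dtype=float)
    # Standardize so that no single feature dominates the distance metric.
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(Xs)
    # Assumed rule: clusters holding under min_cluster_frac of the data are outliers.
    counts = np.bincount(labels, minlength=n_clusters)
    in_large_cluster = counts[labels] >= min_cluster_frac * len(X)
    # Efficiency-based filter: keep only physically plausible measurements.
    in_range = (efficiency >= eff_range[0]) & (efficiency <= eff_range[1])
    return in_large_cluster & in_range


# Synthetic demo: 500 ordinary wafers plus 5 far-away measurement outliers.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (500, 2)), rng.normal(30.0, 0.1, (5, 2))])
eff = np.concatenate([rng.normal(18.5, 0.3, 500), np.full(5, 18.5)])
eff[0] = 5.0  # one implausibly low efficiency reading
mask = keep_mask(X, eff, n_clusters=2)
X_clean, eff_clean = X[mask], eff[mask]
```

Combining the two filters catches both samples that look unusual in feature space and samples whose measured efficiency is implausible even though their features look normal.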
The researchers claim that the ET model can achieve high predictive accuracy and provides robustness against noise and high training speed, making it suitable for industrial environments. Defect-related features, including defect area fraction, grain defect area fraction, and dark area fraction, were found to be critical for predicting efficiency.
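A minimal sketch of this kind of model, using scikit-learn's `ExtraTreesRegressor` on synthetic data, might look as follows. The first three feature names echo the defect features mentioned above, while `thickness_um`, the synthetic efficiency formula, and the screening threshold are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import train_test_split

# Hypothetical column names; thickness_um is an invented, uninformative feature.
FEATURES = ["defect_area_fraction", "grain_defect_area_fraction",
            "dark_area_fraction", "thickness_um"]


def make_synthetic_wafers(n=2000, seed=0):
    """Generate made-up wafer inspection data with a known efficiency formula."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([
        rng.uniform(0.0, 0.3, n),
        rng.uniform(0.0, 0.2, n),
        rng.uniform(0.0, 0.1, n),
        rng.normal(180.0, 5.0, n),
    ])
    y = (19.0 - 6.0 * X[:, 0] - 3.0 * X[:, 1] - 2.0 * X[:, 2]
         + rng.normal(0.0, 0.05, n))
    return X, y


X, y = make_synthetic_wafers()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = ExtraTreesRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # coefficient of determination on held-out wafers

# Impurity-based feature importances, normalized to sum to 1.
importances = dict(zip(FEATURES, model.feature_importances_))

# Early screening: flag wafers whose predicted efficiency clears an assumed threshold.
MIN_EFFICIENCY = 18.0
accept = model.predict(X_test) >= MIN_EFFICIENCY
```

Because the synthetic efficiency depends most strongly on `defect_area_fraction`, that feature should dominate `importances`. On real production data the ranking would of course have to be read from the fitted model rather than assumed.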
Additionally, SHAP analysis provided guiding insights, identifying thresholds at which features begin to reduce efficiency. In the optimized “golden paths” for process equipment, the wet bench was the process step that contributed most to efficiency gains, especially for low-performing wafers.
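To make the threshold idea concrete, the sketch below computes exact Shapley values for a toy efficiency model against a single reference wafer. It does not use the SHAP library and is not the authors' analysis: `toy_efficiency`, its 0.10 threshold, and the feature values are hypothetical.

```python
from itertools import combinations
from math import factorial


def toy_efficiency(w):
    """Hypothetical model: defects only hurt once their area fraction exceeds 0.10."""
    penalty = 8.0 * max(0.0, w["defect_area_fraction"] - 0.10)
    return 19.0 - penalty - 2.0 * w["dark_area_fraction"]


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, relative to a single baseline sample."""
    names = list(x)
    n = len(names)
    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Features in the coalition take the wafer's values, the rest the baseline's.
                without = {k: (x[k] if k in subset else baseline[k]) for k in names}
                with_i = dict(without)
                with_i[i] = x[i]
                total += weight * (f(with_i) - f(without))
        phi[i] = total
    return phi


baseline = {"defect_area_fraction": 0.05, "dark_area_fraction": 0.00}
below = {"defect_area_fraction": 0.08, "dark_area_fraction": 0.05}
above = {"defect_area_fraction": 0.20, "dark_area_fraction": 0.05}

phi_below = shapley_values(toy_efficiency, below, baseline)
phi_above = shapley_values(toy_efficiency, above, baseline)
```

For the wafer below the threshold, the defect feature receives no attribution. For the wafer above it, the attribution turns negative. In both cases the attributions sum to the difference between the wafer's predicted efficiency and the baseline's. Plotting such attributions against feature values is how a threshold of this kind would show up in a SHAP dependence plot.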
“While the methodology was validated using multicrystalline solar cell production data, it can be adapted to other photovoltaic technologies using the same underlying framework,” said Lee. “In the case of monocrystalline silicon solar cells, the approach is similarly applicable; however, the lack of grain boundaries compared to multicrystalline wafers limits the number of directly measurable quality-related characteristics.”
“A similar methodological framework can be applied to perovskite solar cells,” he concluded.
The new methodology was introduced in “Industrial data-driven machine learning framework for wafer quality-based decision making towards smart solar cell manufacturing”, published in Energy and AI.
In August, the same research group presented a machine learning model for predicting sheet resistance in phosphorus oxychloride (POCl3) doping processes used in solar cell production. It was found to optimize process conditions faster and more efficiently than the conventional, costly trial-and-error methods used in the PV industry.
“We found that the model’s learned representations and predictions are consistent with established physical and theoretical insights, providing confidence in the model’s reliability and interpretability in real production environments,” Lee told pv magazine at the time. “We believe this methodology can be extended from solar cell production to a wide range of industrial processes.”
