Predicting future commodity prices with a consistent and reasonable level of accuracy is one of the major challenges for industry analysts and market players. Complex models based on machine learning algorithms can provide a fresh perspective. This Insight demonstrates how machine learning can be applied to CRU data to extract useful insights.
We implemented an ensemble decision tree regression algorithm to assess the monthly 1050 common alloy conversion fee. The model was tested over the last seven months with real-life data. The results are promising and could potentially be scaled to quarterly and yearly forecasts. However, there are also challenges and limitations, which we cover later.
In January 2023, we introduced a section tracking the development of soft economic data in Europe as part of the monthly Aluminium Products Monitor report. The idea was to align several leading economic indicators with the assessed aluminium product conversion fees. After testing, we focused on inflation, consumer confidence, the Economic Sentiment Index (ESI), and Purchasing Managers' Indexes (PMI). The month-on-month changes in sentiment over the year also help us gauge the state of demand for rolled and extrusion products.
Additionally, during our discussions on the state of demand and conversion fee assessment, several price points are usually mentioned, such as the LME aluminium price, Rotterdam ingot duty-paid premiums, and stainless steel prices in Europe. Through experience, we can confirm that all these price points and soft economic data are part of the conversion fee assessment puzzle.
However, the question remains – what is the relationship between these prices, soft economic data and conversion fees? And how can we use these mathematical relationships to evaluate the level of conversion fees in a particular month?
We compared the monthly conversion fees for the 1050 alloy, starting from September 2002, with six features – European stainless steel sheet prices, LME 3-month aluminium prices, Rotterdam ingot duty-paid premiums, the USD/EUR exchange rate, the ESI and inflation. In total, our dataset consists of 266 observations. The results of the correlation analysis are presented in the heatmap below.
Historically, the 1050 conversion fee has correlated strongly with stainless steel prices (0.7) and LME aluminium prices (0.63). By contrast, its correlations with ingot premiums (0.45) and inflation (0.56) are low to medium, and the USD/EUR exchange rate has a small negative correlation with the 1050 fee. Interestingly, stainless steel prices and LME 3-month aluminium prices are themselves strongly correlated (0.8), which suggests potential multicollinearity between these two features.
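The correlation analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the heatmap: the series are synthetic placeholders rather than CRU data, the column names are hypothetical, and the 0.7 cut-off for flagging possible multicollinearity is an assumption chosen for the example.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Synthetic placeholder series (NOT CRU data): two price drivers that share
# a common factor, one unrelated series, and a target built from the drivers.
rng = random.Random(0)
common = [rng.gauss(0, 1) for _ in range(266)]
data = {
    "stainless": [c + rng.gauss(0, 0.5) for c in common],
    "lme_3m": [c + rng.gauss(0, 0.5) for c in common],
    "fx": [rng.gauss(0, 1) for _ in range(266)],
}
data["fee_1050"] = [
    0.6 * s + 0.4 * l + rng.gauss(0, 0.7)
    for s, l in zip(data["stainless"], data["lme_3m"])
]

corr = correlation_matrix(data)
for name, row in corr.items():
    print(name, {k: round(v, 2) for k, v in row.items()})

# Flag feature pairs whose absolute correlation reaches an assumed threshold.
THRESHOLD = 0.7
features = [n for n in data if n != "fee_1050"]
collinear = [
    (a, b)
    for i, a in enumerate(features)
    for b in features[i + 1:]
    if abs(corr[a][b]) >= THRESHOLD
]
print("possibly collinear pairs:", collinear)
```

Flagged pairs are candidates for closer inspection rather than automatic removal, since tree-based models tolerate correlated inputs better than linear regression does.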
To further explore the relationships, we built a regression model to predict the 1050 conversion fee based on the six features mentioned above. We experimented with a complex model based on the ensemble decision tree algorithm and compared its statistical outputs with those of a conventional multiple linear regression model.
Machine learning algorithms offer many advantages, such as improved accuracy and the ability to capture non-linear relationships between variables. However, our dataset of only 266 observations is insufficient for training a robust model. Overfitting is a major concern: it occurs when a model learns the training data too well, including its outliers, and therefore generalises poorly to new, unseen data. The data limitation is unavoidable, as the euro was introduced in 1999, with notes and coins following in 2002, so earlier data belong to a different regime and are not comparable.
To address this limitation, we augmented the available dataset and introduced “fat tail” noise. As shown in the summary below, the distributions of the monthly 1050 conversion fee, stainless steel prices and LME 3-month aluminium prices exhibit “fat tails”, meaning that extreme values occur more often than a normal distribution would imply.
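One way to augment a small dataset with fat-tailed noise is sketched below. The article does not specify the noise distribution or the augmentation procedure, so everything here is an assumption for illustration: the noise is drawn from a Student's t distribution, each observation is copied a fixed number of times, and the function names, the 10% noise scale and the 3 degrees of freedom are hypothetical choices.

```python
import math
import random

def student_t(rng, df):
    """Draw one Student-t variate as Z / sqrt(chi2 / df); small df gives heavier tails."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) = Gamma(shape=df/2, scale=2)
    return z / math.sqrt(chi2 / df)

def augment(rows, n_copies, noise_scale, df, rng):
    """Return the original rows followed by n_copies jittered copies of each row.

    Each value is perturbed by Student-t noise scaled to a fraction
    (noise_scale) of that column's sample standard deviation.
    """
    n_cols = len(rows[0])
    stds = []
    for j in range(n_cols):
        col = [r[j] for r in rows]
        mean = sum(col) / len(col)
        stds.append(math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1)))
    out = [list(r) for r in rows]
    for r in rows:
        for _ in range(n_copies):
            out.append([v + noise_scale * s * student_t(rng, df) for v, s in zip(r, stds)])
    return out

def excess_kurtosis(values):
    """Sample excess kurtosis; values above 0 indicate tails heavier than normal."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0

rng = random.Random(42)
# Placeholder observations (NOT CRU data): [conversion fee, LME price]
original = [[rng.gauss(400, 50), rng.gauss(2200, 300)] for _ in range(266)]
augmented = augment(original, n_copies=4, noise_scale=0.1, df=3, rng=rng)
print(len(original), "original rows ->", len(augmented), "rows after augmentation")
```

The `excess_kurtosis` helper gives a quick check of whether a series has fat tails: a normal distribution has an excess kurtosis of zero, and larger values point to heavier tails.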
The other challenge for machine learning models is explaining the logic behind their predictions. The ensemble decision tree algorithm we applied is no exception. However, we can calculate feature importance, which measures the contribution of each feature to the model’s predictions. This helps identify which features have the most significant impact on the target variable, which in our case is the 1050 conversion fee.
For our model, the feature importance output is presented below. We can see that both LME 3-month and stainless steel prices significantly impact the prediction of the 1050 conversion fee. This aligns with our expectations and the correlation analysis presented above.
However, it is interesting to note that the importance of the LME 3-month price is greater than that of stainless steel, whose importance is instead close to that of the ESI. This is despite the weak (0.33) correlation between the 1050 conversion fee and the ESI. Hence, the model might have learned some non-linear relationships between features that a linear correlation coefficient does not capture.
The bottom three features in the chart are inflation, ingot duty-paid premiums and the USD/EUR exchange rate. They do not contribute significantly to predicting the 1050 conversion fee and could potentially be excluded. However, we prefer to keep these features for context and explainability, as our model is not computationally heavy.
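The sketch below illustrates how a feature can matter to a model while showing almost no linear correlation with the target. It uses permutation importance, a model-agnostic technique that is not necessarily the importance measure behind the chart discussed above (tree ensembles often report impurity-based importance instead). The data are synthetic, and the `model` function is a hypothetical stand-in for a fitted model.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def mse(model, X, y):
    """Mean squared error of a callable model over rows X and targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats, rng):
    """Average increase in MSE when a single feature column is shuffled."""
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Synthetic features (NOT CRU data), each uniform on [-1, 1]:
#   column 0 drives the target linearly,
#   column 1 drives it through a symmetric, U-shaped (non-linear) effect,
#   column 2 is irrelevant.
rng = random.Random(1)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(1000)]
y = [2 * r[0] + 3 * r[1] ** 2 + rng.gauss(0, 0.1) for r in X]

def model(row):
    """Hypothetical stand-in for a fitted model that has learned the true relationship."""
    return 2 * row[0] + 3 * row[1] ** 2

importances = permutation_importance(model, X, y, n_repeats=5, rng=rng)
corrs = [pearson([r[j] for r in X], y) for j in range(3)]
for name, imp, c in zip(["linear", "u_shaped", "irrelevant"], importances, corrs):
    print(f"{name:>10}: importance={imp:.3f}  correlation={c:+.2f}")
```

In this toy setting the U-shaped feature has a correlation close to zero but a clearly positive importance, which is the same kind of pattern as a weakly correlated indicator ranking highly in a non-linear model.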
Machine learning models are designed to generalise and work on unseen input. By augmenting the data to create a larger set, we were able to split it into training and testing subsets. This allows us to fine-tune the model to find a balance between bias and variance, thereby improving accuracy.
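The trade-off between bias and variance can be seen by comparing errors on the training and testing subsets while varying a single complexity setting. The sketch below is an illustration only: it uses a k-nearest-neighbours regressor on synthetic one-dimensional data as a simple stand-in for the ensemble model, and the split fraction and candidate values of `k` are assumptions.

```python
import math
import random

def train_test_split(rows, test_fraction, rng):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def rmse(train, points, k):
    """Root mean squared error of k-NN predictions on the given points."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in points]
    return math.sqrt(sum(errors) / len(errors))

# Synthetic non-linear data (NOT CRU data): a sine curve plus noise.
rng = random.Random(7)
rows = []
for _ in range(800):
    x = rng.uniform(0, 3)
    rows.append((x, math.sin(3 * x) + rng.gauss(0, 0.3)))

train, test = train_test_split(rows, test_fraction=0.25, rng=rng)

# Small k: low bias, high variance (memorises noise).
# Large k: high bias, low variance (averages everything).
results = {}
for k in (1, 10, 50, 600):
    results[k] = (rmse(train, train, k), rmse(train, test, k))
    print(f"k={k:>3}  train RMSE={results[k][0]:.3f}  test RMSE={results[k][1]:.3f}")

best_k = min(results, key=lambda k: results[k][1])
print("best k on the test subset:", best_k)
```

With `k=1` the model reproduces the training data perfectly yet does worse on the held-out rows, which is the signature of overfitting; with `k` equal to the training set size it predicts the overall mean everywhere. In practice a separate validation subset or cross-validation would be used for tuning, so that the test subset remains untouched.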
There are many statistical methods available to improve multiple regression models as well. We conducted a high-level comparison between our complex model and a basic multiple linear regression model using the 1050 conversion fee and the six features discussed. As shown in the chart below, our complex model is expected to perform better than the basic multiple regression model. However, the ultimate test should be performed on truly unseen data in real life.
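A hold-out comparison of this kind can be sketched as follows. The article does not state which ensemble decision tree algorithm was used, so the example assumes bagged regression trees (bootstrap samples, averaged predictions) on a single synthetic feature, compared against ordinary least squares. All function names, data and settings are hypothetical.

```python
import math
import random

def fit_linear(points):
    """Ordinary least squares for y = a + b * x on (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    b = sum((x - mx) * (y - my) for x, y in points) / sum((x - mx) ** 2 for x, _ in points)
    a = my - b * mx
    return lambda x: a + b * x

def fit_tree(points, depth):
    """One-feature regression tree; each split minimises total squared error."""
    mean = sum(y for _, y in points) / len(points)
    if depth == 0 or len(points) < 2:
        return lambda x: mean
    pts = sorted(points)
    total = sum(y for _, y in pts)
    best_score, best_i, left_sum = None, None, 0.0
    for i in range(1, len(pts)):
        left_sum += pts[i - 1][1]
        if pts[i][0] == pts[i - 1][0]:
            continue  # cannot split between identical x values
        right_sum = total - left_sum
        # Maximising this quantity is equivalent to minimising squared error.
        score = left_sum ** 2 / i + right_sum ** 2 / (len(pts) - i)
        if best_score is None or score > best_score:
            best_score, best_i = score, i
    if best_i is None:
        return lambda x: mean
    threshold = (pts[best_i - 1][0] + pts[best_i][0]) / 2
    left = fit_tree(pts[:best_i], depth - 1)
    right = fit_tree(pts[best_i:], depth - 1)
    return lambda x: left(x) if x <= threshold else right(x)

def fit_bagged_trees(points, n_trees, depth, rng):
    """Average the predictions of trees fitted to bootstrap resamples."""
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(points) for _ in points]
        trees.append(fit_tree(sample, depth))
    return lambda x: sum(t(x) for t in trees) / n_trees

def rmse(model, points):
    """Root mean squared error of a callable model on (x, y) pairs."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in points) / len(points))

# Synthetic data (NOT CRU data): the target jumps between two levels,
# a non-linear pattern that a straight line cannot follow.
rng = random.Random(3)
data = []
for _ in range(400):
    x = rng.uniform(0, 1)
    level = 200 if x < 0.5 else 260
    data.append((x, level + rng.gauss(0, 10)))
train, test = data[:300], data[300:]

linear = fit_linear(train)
ensemble = fit_bagged_trees(train, n_trees=25, depth=3, rng=rng)
print(f"linear regression test RMSE: {rmse(linear, test):.1f}")
print(f"bagged trees test RMSE:      {rmse(ensemble, test):.1f}")
```

The ensemble has the advantage here only because the synthetic relationship is strongly non-linear. On data that are close to linear, the simpler regression can match or beat it, which is why the comparison needs to be made on held-out observations.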
Since March 2024, we have been running an experiment with the complex model and have now accumulated seven months of results.
We can conclude that the complex model performed reasonably well in five of the seven months in which it was tested with real-life inputs. As a next step, we will create longer-term quarterly and yearly forecasts for 1050 fees, to be published as part of the Aluminium Rolled Products Market Outlook. These will be based on the longer-term forecasts provided by the CRU Aluminium and Steel teams (LME price, ingot premium, stainless steel price) and the CRU Economic team (inflation, ESI, exchange rate), which are published in the Aluminium Market Outlook, Stainless Steel Flat Products Market Outlook and Global Economic Outlook. Our goal is to assess how reasonable the longer-term forecasts are and how they change under different future scenarios.