Can Transparently's risk engine generate alpha?

Mark Jolley

July 9, 2025

Many have asked whether the AI model developed by Transparently to detect account manipulation can be used in stock selection to generate alpha. This model, called the Transparently Risk Engine (TRE), quantifies the risk that a company’s accounts misrepresent its true financial position.

This note illustrates the potential for increased returns offered by the system. The system adds alpha because accounting quality is arguably the best overall measure of corporate governance, and good governance is a critical ingredient in any company’s long-term success.

This means investors can now leverage AI to construct portfolios that avoid higher-risk companies, potentially leading to significantly increased returns.

In this piece, we will show how a simple ETF based on the TRE delivers consistent outperformance, ultimately allowing individuals to generate substantially more wealth. We can all retire sooner.

What is the Transparently Risk Engine?

The TRE combines a predictive analytics AI engine with a fine-tuned GenAI model used to interact with it. The GenAI component is essentially a large language model - an artificial intelligence program that is trained on massive amounts of text data. This training allows the model to understand and generate human-like text, translate languages, answer questions, and perform specific tasks such as looking for patterns that might indicate certain outcomes.

The "large" in the name refers to the size of the model, typically measured by the number of parameters. The more parameters, the more complex patterns the model can learn from the data it receives. Transparently has fine-tuned the model to specifically address accounting issues (although it is capable of performing other types of analysis as well).

Transparently’s underlying predictive AI engine receives data in the form of the financial statements of tens of thousands of companies. These statements, entailing millions of data points, are captured by the model as they appeared on specific dates. As the program processes the data, it identifies complex patterns and it learns to understand relationships between these patterns and certain outcomes. In this case we are looking for arrangements of patterns that link to share price collapse due to account manipulation.

Understanding the TRE

The Transparently Risk Engine identifies events and patterns in a company’s financial statements and learns to link combinations of patterns with known cases of accounting fraud. Depending on the frequency and type of patterns observed, the TRE determines the accounting quality of a company.

The TRE develops its assessment using a vast database and is able to identify more complex patterns than the human brain. The system is based on an insane amount of data: currently more than a billion data points from around 85,000 companies. If that is not enough, one can feed additional data into the system, such as news reports, spreadsheets, biographies of senior managers, lists of legal advisors, commentaries written by senior managers and so forth.

The system employs this data to develop risk scores based on 14 risk clusters. These clusters pertain to factors ranging from cash quality to corporate governance. The score is not a simple aggregation but a complex non-linear amalgamation of patterns observed within and among the different risk clusters. The scoring allows Transparently.ai to rank companies according to the prospective quality of their accounts.

Measuring the performance

Dr Hamish Macalister and his team at Transparently.ai have undertaken extensive point-in-time (PIT) analysis of the TRE. PIT analysis is a valuable technique for assessing forecasting performance. As the name suggests, PIT involves capturing "snapshots" of data at specific points in time. This approach allows a comparison of the relationship between the risk rankings and companies’ past and future stock price performance at different points in time. Forecasts estimated at different snapshots in time can be evaluated without look-ahead bias (where data reflects future information) and without survivor bias (where data excludes entities that have failed).
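The idea of a point-in-time snapshot can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Transparently's implementation: the `FILINGS` records, the company names and the `as_of` helper are all hypothetical. The sketch shows the two properties described above: a query sees only figures that were public on the snapshot date, and companies that later failed remain in the data.

```python
from datetime import date

# Hypothetical records: (company, date the figure became public, reported value).
FILINGS = [
    ("ACME", date(2020, 3, 15), 100.0),
    ("ACME", date(2021, 3, 15), 120.0),
    ("ACME", date(2021, 9, 1), 90.0),     # a later restatement
    ("DEFUNCT", date(2020, 4, 1), 50.0),  # later delisted, but kept in the data
]

def as_of(filings, company, snapshot):
    """Return the latest value for `company` that was public on or before `snapshot`."""
    visible = [
        (published, value)
        for name, published, value in filings
        if name == company and published <= snapshot
    ]
    if not visible:
        return None  # nothing had been published yet
    # The most recently published figure wins; later restatements stay invisible.
    return max(visible, key=lambda item: item[0])[1]

# A snapshot taken in mid-2021 does not see the September 2021 restatement.
print(as_of(FILINGS, "ACME", date(2021, 6, 30)))
# A failed company is still available for earlier snapshots.
print(as_of(FILINGS, "DEFUNCT", date(2022, 1, 1)))
```

Filtering on the publication date rather than the reporting period is what keeps future information out of each snapshot.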

Comparison over time in this manner is only possible due to the vast amount of data employed and allows a detailed understanding of the strength and consistency of the relationship between the risk rankings and future stock returns over time. Dr Macalister’s PIT analysis runs to hundreds of pages over several reports. This is far more information than we need for a simple demonstration of the predictive power of the system.

Risk score threshold analysis

Simply put, the risk ranking system allows Transparently to measure the relationship between the risk scores and the future performance of companies. These results can be used to assess how much the AI system can improve investment outcomes. Our goal in today’s discussion is to ask what, if any, increased return might be achieved from a simple ETF based on the TRE.

Threshold analysis is a natural way to address this kind of question.

Threshold analysis divides companies into different bands or thresholds, based on their risk scores. The thresholds are set at various percentile levels (50th, 60th, 70th, etc) and companies are divided into "Below" and "Above" groups based on these thresholds. At the 70th percentile threshold, for example, the "Below" group represents the 70% of companies with lower risk scores, while the "Above" group contains the 30% of companies with the highest risk scores.

The higher the score, the greater the risk of account manipulation and thus the lower is the prospective accounting quality. On this basis the “Above” group represents lower accounting quality whereas the “Below” group represents higher accounting quality.

The return data consists of future 12-month returns. The data spans from 1999 to 2025. The PIT captures the future 12-month return related to each company’s risk score as each successive snapshot is taken. By comparing the performance of these two groups we can determine if the risk scores effectively differentiate companies with varying return characteristics. The number crunching required to generate these results is quite phenomenal.
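The mechanics of the threshold split can be sketched as follows. This is a minimal illustration, not the code behind the PIT reports: the `records` list is made-up toy data, the function names are hypothetical, and the nearest-rank percentile rule is an assumption chosen for simplicity.

```python
from statistics import mean, median

def percentile_cutoff(scores, pct):
    """Nearest-rank cutoff: the score at or below which pct% of companies fall."""
    ordered = sorted(scores)
    k = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct% of n
    return ordered[k - 1]

def threshold_analysis(records, thresholds=(50, 60, 70, 80)):
    """records: list of (risk_score, forward_12m_return) pairs."""
    scores = [score for score, _ in records]
    results = {}
    for pct in thresholds:
        cutoff = percentile_cutoff(scores, pct)
        below = [ret for score, ret in records if score <= cutoff]  # lower risk
        above = [ret for score, ret in records if score > cutoff]   # higher risk
        results[pct] = {
            "n_below": len(below),
            "n_above": len(above),
            "below_median": median(below),
            "above_median": median(above) if above else None,
            "below_mean": mean(below),
            "above_mean": mean(above) if above else None,
        }
    return results

# Toy data only: ten hypothetical companies with invented scores and returns.
records = [
    (5, 0.12), (12, 0.08), (20, 0.10), (31, 0.05), (40, 0.07),
    (55, 0.02), (63, -0.04), (71, -0.10), (84, -0.25), (95, -0.60),
]
results = threshold_analysis(records)
for pct, stats in results.items():
    print(pct, stats)
```

At the 70th percentile threshold the sketch places seven of the ten toy companies in the "Below" group and three in the "Above" group, mirroring the 70%/30% split described above.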

The results are global, covering virtually all listed companies with returns measured in US dollars. Figures I and II provide a comparative analysis of the median and mean returns across different risk score thresholds.

For the 50th percentile threshold in Figure I, for example, companies below the threshold (lower risk) demonstrated a median 12-month future return of 6.0% whereas those above the threshold had a median return of -6.1%. By comparison, the returns for the 80th percentile threshold are 2.6% and -20.2%, respectively.

Figure I: Median 12-month forward returns by risk threshold

The future returns for both groups fall as the threshold increases because we are progressively including more lower quality companies in the “Below” group while the “Above” group becomes increasingly focused on companies with the very worst risk characteristics.

The pattern is similar for average returns. Average stock returns are higher than median stock returns because a few highly successful stocks skew the average upwards, whereas losses are capped at -100%. The median, on the other hand, is less sensitive to extreme values and represents the middle point of the data.
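A small worked example makes the skew concrete. The returns below are invented purely for illustration: losses are bounded at -100% while a single large winner is not, so the mean ends up well above the median.

```python
from statistics import mean, median

# Hypothetical 12-month returns for nine stocks (1.0 = +100%).
# The worst outcome is a total loss (-1.0); one stock returns +600%.
toy_returns = [-1.0, -0.4, -0.1, 0.0, 0.05, 0.1, 0.3, 0.5, 6.0]

toy_median = median(toy_returns)  # middle value, unaffected by the outlier
toy_mean = mean(toy_returns)      # pulled upwards by the single big winner

print(f"median: {toy_median:.2%}")
print(f"mean:   {toy_mean:.2%}")
```

Removing the one outsized winner from the list leaves the median almost unchanged but pulls the mean sharply lower, which is why the two measures are reported separately.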

Figure II: Average 12-month forward returns by risk threshold

At the 50th percentile risk threshold, the lower risk group below the threshold achieved an average 12-month forward return of 14.3% versus 10.5% for the group above the threshold. The difference of 3.8 percentage points between the lower risk group and the higher risk group is smaller than the difference in the median results. Once again, this is due to the skewed distribution of stock returns. As observed in the median results, returns fall as the risk threshold increases and the difference in returns between the two groups grows.

The bottom line

The consistent pattern of lower future returns for higher-risk companies across all threshold levels provides strong evidence that the risk scores have predictive value for future performance. This finding has important implications for forward-looking investment strategies and portfolio construction.

With the usual caveat that past performance will not necessarily reflect future performance, we would expect that an ETF constructed only of companies below, say, the 50th percentile risk threshold would have delivered something in the order of 150 to 200 basis points of additional annual return versus an ETF indexed to the market.

This might not sound like much but the return pick-up is considerable. For an investor with a working life of about 45 years, it could mean retiring around 8 years earlier. Alternatively, one could work to 65 and retire on slightly more than double the retirement income.
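The compounding arithmetic behind figures of this kind can be sketched as follows. The article does not state its assumptions, so the inputs here are assumptions for illustration only: a single lump sum invested at the start, a hypothetical 7% annual market return, and an extra 1.75% a year (the midpoint of the 150 to 200 basis point range).

```python
import math

YEARS = 45            # working life, from the text
BASE_RETURN = 0.07    # assumed market return (hypothetical)
EXTRA_RETURN = 0.0175 # assumed pick-up: midpoint of 150-200 basis points

def wealth_multiple(base, extra, years):
    """Terminal wealth with the extra return, relative to the base case (lump sum)."""
    return ((1 + base + extra) / (1 + base)) ** years

def years_saved(base, extra, years):
    """Years of investing saved by reaching the base-case terminal wealth sooner."""
    needed = years * math.log(1 + base) / math.log(1 + base + extra)
    return years - needed

multiple = wealth_multiple(BASE_RETURN, EXTRA_RETURN, YEARS)
saved = years_saved(BASE_RETURN, EXTRA_RETURN, YEARS)
print(f"wealth multiple after {YEARS} years: {multiple:.2f}x")
print(f"years saved for the same terminal wealth: {saved:.1f}")
```

With these particular inputs the wealth multiple comes out a little over two and the time saved is between eight and nine years. Regular contributions, rather than a lump sum, or a different base return would give different figures.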

The example provided here is just the simplest possible use case and yet it has the potential to greatly enrich people’s golden years. The potential for higher rewards will be even greater if we begin to focus on areas where the AI model delivers the greatest gains, particularly with respect to different-sized companies, companies in specific industries and countries, or matching specific themes.

Mark Jolley

Mark has been an investment strategist for almost 40 years and has advised some of the world’s biggest investment funds, public companies and notable investors. In the mid-1990s, as a global investment strategist with Deutsche Bank, Mark produced a daily note read by more than 16,000 investment professionals including voting members of the Federal Reserve and the European Central Bank. In the 2000s, Mark worked as Deutsche Bank’s Asian strategist and then as strategist for China Construction Bank. He is now an independent analyst.
