Abstract: Using the annual reports of China's A-share listed banks from 2010 to 2019 as the research sample, this paper applies the LDA topic model to mine the semantic information in Chinese-language annual reports and construct a topic measure of the banks’ annual reports. Across a variety of machine learning models, it then compares the fraud detection performance of the topic measure with that of commonly used financial measures, text feature measures, and combinations of each with the topic measure. The results show that the topic content of Chinese annual reports has some predictive power for fraud at listed banks, and that adding the topic measure to a traditional indicator improves detection performance relative to using that indicator alone. The study provides direct evidence that annual report topic content, combined with machine learning methods, is effective for detecting fraud at listed banks. It also contributes a more effective system of fraud detection measures for the Chinese market and offers auditors a more efficient detection method, which helps them further avoid and prevent audit risk.
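
To make the pipeline summarized above concrete, the following is a minimal sketch of how a topic measure might be built and combined with financial measures. It is not the paper's implementation: it assumes scikit-learn, and the toy `reports`, the two `financial` ratios, the `fraud` labels, the helper `topic_measure`, and the choice of logistic regression are all hypothetical stand-ins. Real Chinese annual reports would also need word segmentation before counting terms.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy corpus: each string stands in for one pre-segmented annual report.
reports = [
    "loan growth capital adequacy risk control",
    "penalty violation regulator investigation loan",
    "deposit growth profit capital dividend",
    "violation penalty internal control failure investigation",
    "profit dividend deposit loan growth",
    "regulator penalty disclosure violation risk",
]
# Hypothetical financial measures (two made-up ratios per bank-year).
financial = np.array([
    [0.012, 0.11], [0.004, 0.09], [0.011, 0.12],
    [0.003, 0.08], [0.013, 0.12], [0.005, 0.09],
])
# Hypothetical fraud labels (1 = fraud observed for that bank-year).
fraud = np.array([0, 1, 0, 1, 0, 1])


def topic_measure(texts, n_topics=2, seed=0):
    """Return one topic-proportion vector per document, estimated with LDA."""
    counts = CountVectorizer().fit_transform(texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    return lda.fit_transform(counts)  # each row is a distribution over topics


topics = topic_measure(reports)
# Combined measure: financial measures and topic proportions side by side.
combined = np.hstack([financial, topics])

# Any classifier could be used here; logistic regression is only an illustration.
clf = LogisticRegression(max_iter=1000).fit(combined, fraud)
scores = clf.predict_proba(combined)[:, 1]  # in-sample fraud probabilities
```

The scores here are computed on the training data, so they only show the mechanics. Comparing measures in earnest would require held-out bank-years and a metric suited to rare events.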