Final Product on Coze store: https://www.coze.com/store/agent/7476279771824095237?bot_id=true

This study explores how to leverage the low-code platform Coze to build an intelligent workflow that improves the quality of English translations of medical texts generated by large language models (LLMs). Motivated by the many dimensions along which translation quality is evaluated, and by specific problems in LLM output observed by the author during a medical English-Chinese translation internship project, a comprehensive workflow-based solution is proposed. It aims to improve LLM translation quality in three respects: translation accuracy (semantic and grammatical), term consistency (internal consistency and conformance to external standards), and cultural adaptation (handling of culture-loaded words and choice of translation strategies). The solution establishes an iterative optimization mechanism driven by quality scores, exploits the language capabilities of LLMs for translation, and injects external knowledge to compensate for the models’ lack of background and domain expertise.
To validate this solution, three workflows are designed for comparison: single-LLM translation, an iterative optimization workflow, and a knowledge-enhanced iterative optimization workflow. The iterative optimization workflow adds to the single-LLM setup a loop-control module driven by the COMET-Kiwi quality estimation (QE) model, so that revision rounds are triggered automatically by translation quality scores. The knowledge-enhanced workflow further integrates a RAG term base and a PE style guide. If the quality scores of the three workflows improve stepwise, and a subjective comparison of the translations shows a corresponding stepwise improvement along the three dimensions, the proposed solution can be considered valid.
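The QE-driven loop can be sketched as follows. This is a minimal illustration, not the actual Coze workflow: the function names (`iterative_translate`, `build_feedback`), the threshold, and the round budget are all assumptions; in the real system the translator is an LLM node and the scorer is the COMET-Kiwi QE model.

```python
def iterative_translate(source, translate, qe_score, build_feedback,
                        threshold=0.85, max_rounds=3):
    """Translate, score, and re-translate with feedback until the QE
    score clears the quality threshold or the round budget runs out.

    translate(source, feedback) -> candidate translation
    qe_score(source, target)    -> float quality estimate
    build_feedback(source, target, score) -> feedback for the next round
    """
    feedback = None
    best_target, best_score = None, float("-inf")
    for _ in range(max_rounds):
        target = translate(source, feedback)
        score = qe_score(source, target)
        # Keep the best candidate seen so far, in case later rounds regress.
        if score > best_score:
            best_target, best_score = target, score
        if score >= threshold:  # quality gate: stop iterating early
            break
        feedback = build_feedback(source, target, score)
    return best_target, best_score
```

Returning the best-scoring candidate rather than the last one guards against a revision round that lowers the score, which matters given the diminishing returns the study reports.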
A hybrid evaluation system is adopted, combining the COMET-Kiwi semantic score (70%) with mechanical feature indicators (30%), to assess the different workflow configurations. The experiment uses the “Classification and Determination of TCM Constitutions (ZYYXH/T157 – 2009)” as the corpus, conducts five test runs, and compares the results against Google Translate as an industry benchmark. Quantitative analysis shows that the iterative optimization workflow performs best, scoring 6.6% higher than single-LLM translation and 19.5% higher than Google Translate. However, the study also finds clearly diminishing marginal returns over the optimization rounds. Unexpectedly, the knowledge-enhanced iterative optimization workflow underperforms: analysis of the translations reveals that the output of the plain iterative workflow is slightly better across the three dimensions, though each workflow has its own strengths. In addition, qualitative analysis finds significant differences in how much different types of feedback contribute to improving translation quality.
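The 70/30 blend amounts to a simple weighted sum. The sketch below assumes both inputs are normalized to [0, 1]; the abstract does not specify which mechanical features are measured or how they are normalized, so the function signature is illustrative only.

```python
def hybrid_score(semantic: float, mechanical: float,
                 w_semantic: float = 0.7, w_mechanical: float = 0.3) -> float:
    """Blend the COMET-Kiwi semantic score with a mechanical-feature
    score using the study's 70/30 weighting. Both inputs are assumed
    to be normalized to the [0, 1] range."""
    assert abs(w_semantic + w_mechanical - 1.0) < 1e-9, "weights must sum to 1"
    return w_semantic * semantic + w_mechanical * mechanical
```

For example, a translation with a semantic score of 0.8 and a mechanical score of 0.6 receives a hybrid score of 0.7 × 0.8 + 0.3 × 0.6 = 0.74.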
The innovation of this study is mainly reflected in three aspects. First, from the perspective of translator subjectivity, it explores how translators can independently build intelligent translation systems. Second, it examines the synergy between the iterative optimization mechanism and the knowledge enhancement module. Third, it designs feedback nodes based on error localization and analysis, offering new ideas for future iterative optimization of translation systems. Despite these achievements, the study has limitations, including the lack of variable separation between the term enhancement and style guide modules, a limited sample scope, and insufficient term base coverage. Overall, the study provides a valuable empirical reference for building efficient intelligent workflows in medical translation.
Keywords: Large Language Model; Translation Agent; Iterative Optimization Workflow; Knowledge Enhancement; Medical Translation


