Notice of Retraction: Grammar Error Detection Tool for Medical Transcription using Stop Words Parts-of-Speech Tags Ngram Based Model

Ganesh B R, Deepa Gupta, Sasikala T


Notice of Retraction

After careful and considered review of the content of this paper by a duly constituted expert committee, this paper has been found to be in violation of APTIKOM's Publication Principles.

We hereby retract the content of this paper. Reasonable effort should be made to remove all past references to this paper.

The presenting author of this paper has the option to appeal this decision by contacting


Medical transcription is the conversion of audio files dictated by medical experts into electronic data files in a predetermined format. The doctor's observations are documented, covering the medical procedures carried out on a patient from the time the patient enters the clinic or hospital until the ailment is treated. Because these transcripts are used to track a patient's medical history, they need to be error free, which makes a grammar checker an important asset for hospitals that must scrutinize them. Existing tools are designed to detect faulty grammatical constructs in general English and mostly rely on non-exhaustive rulesets. Two goals therefore motivate a new approach to an old problem: making a grammar checker more effective in a relatively unexplored domain, and improving on the accuracy achieved by the existing tools. Stop words are the most commonly occurring words in any language. By exploiting the fact that stop words form the backbone of a sentence, and by identifying the parts-of-speech tags that commonly surround them, a sentence's grammatical structure can be better understood using statistical methods.
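The idea in the last sentence can be illustrated with a minimal sketch. It is not the paper's implementation: the stop-word list, the hand-tagged toy corpus, the trigram order, and the function names `hybrid_sequence`, `train`, and `suspicious_ngrams` are all assumptions made for illustration. Each sentence is reduced to a sequence in which stop words are kept verbatim and every other word is replaced by its parts-of-speech tag; trigrams over that sequence are counted on text assumed to be correct, and trigrams never seen in training are flagged as possible errors.

```python
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "a", "was", "is", "to", "of", "in", "with"}


def hybrid_sequence(tagged):
    """Keep stop words verbatim; replace every other word with its POS tag."""
    return [w.lower() if w.lower() in STOP_WORDS else tag for w, tag in tagged]


def ngrams(seq, n=3):
    """Return the n-grams of seq, padded with sentence boundary markers."""
    padded = ["<s>"] * (n - 1) + list(seq) + ["</s>"]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


def train(corpus, n=3):
    """Count hybrid n-grams over sentences assumed to be grammatical."""
    counts = Counter()
    for sentence in corpus:
        counts.update(ngrams(hybrid_sequence(sentence), n))
    return counts


def suspicious_ngrams(tagged, counts, n=3):
    """Return the hybrid n-grams of a sentence that never occurred in training."""
    return [g for g in ngrams(hybrid_sequence(tagged), n) if counts[g] == 0]


# Hand-tagged toy corpus (Penn Treebank style tags, assumed for illustration).
corpus = [
    [("The", "DT"), ("patient", "NN"), ("was", "VBD"), ("admitted", "VBN"),
     ("to", "TO"), ("the", "DT"), ("hospital", "NN")],
    [("The", "DT"), ("doctor", "NN"), ("was", "VBD"), ("called", "VBN"),
     ("to", "TO"), ("the", "DT"), ("clinic", "NN")],
]
counts = train(corpus)

# Same words as the first training sentence, with two of them swapped.
scrambled = [("The", "DT"), ("patient", "NN"), ("admitted", "VBN"), ("was", "VBD"),
             ("to", "TO"), ("the", "DT"), ("hospital", "NN")]
for gram in suspicious_ngrams(scrambled, counts):
    print("unseen trigram:", gram)
```

In practice the tags would come from an automatic tagger and the raw counts would be replaced by a smoothed language model, since a trigram can be unseen simply because the training data is small.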






Copyright (c) 2019 APTIKOM Journal on Computer Science and Information Technologies

ISSN: 2528-2417, e-ISSN: 2528-2425




This work is licensed under a Creative Commons Attribution 4.0 International License.