A system and method for automatic assignment of medical codes to unformatted data is, for example, a computer software module or engine. The engine automatically assigns medical codes such as ICD codes (ICD9 and ICD10 as well as other versions) to unformatted or uncoded medical documents (e.g. medical notes, discharge summaries, etc.). The system reads a document and then scans (assesses) it for diagnoses associated with the medical codes. When diagnosis is identified, the system can also examine the language context in which the diagnosis appears. Using rules derived from syntactic and semantic usage, the system decides whether to apply an identified ICD code to the document being processed or not. The output of the module, a set of medical codes and the corresponding diagnoses that conform to the widely accepted syntactic and semantic rules associated with coding, can then be stored in or applied to a number of different mediums, such as data base entries, attachments to the document itself, email to the owner of the document, electronic or paper forms, etc.