A system, method, and computer-readable medium storing a software program for translating a script for an interactive voice response system into a script for a visual interactive response system. The visual interactive
response system executes the translated visual-based script when a user calls the visual interactive response system from a display telephone. The visual interactive response system then transmits a visual menu to the display telephone so that the user can select a desired response, which is then sent back to the visual interactive response system for processing. The voice-based script may be defined in voice
extensible markup language, and the visual-based script may be defined in wireless markup language, hypertext markup language, or handheld device markup language. The translation system and program include a parser for extracting command structures from the voice-based script, a visual-based structure generator for generating corresponding command structures for the visual-based script, a text prompt combiner for incorporating text translated from voice prompts into the command structures generated by the structure generator, an automatic speech recognition routine for automatically converting voice prompts into translated text, and an editor for editing said visual-based script.
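
The parser, structure generator, and text prompt combiner described above can be illustrated with a minimal sketch, under several assumptions that are not taken from the source: the voice-based script is a namespace-free voice extensible markup language document whose menus use menu, prompt, and choice elements; the prompts are already available as text, so no automatic speech recognition step is shown; and the output is a simple wireless markup language deck with one card per menu. The function names (extract_menus, generate_structure, combine_prompts, translate) and the sample script are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical voice-based script (namespace omitted for brevity).
SAMPLE_VXML = """
<vxml version="2.0">
  <menu id="main">
    <prompt>Welcome. Choose a department.</prompt>
    <choice next="#sales">Sales</choice>
    <choice next="#support">Support</choice>
  </menu>
</vxml>
"""

def _text(elem):
    """Return the element's text content with whitespace collapsed."""
    return " ".join("".join(elem.itertext()).split())

def extract_menus(vxml_text):
    """Parser: pull each menu's id, prompt text, and choices out of the script."""
    root = ET.fromstring(vxml_text)
    menus = []
    for menu in root.iter("menu"):
        prompt = menu.find("prompt")
        menus.append({
            "id": menu.get("id", ""),
            "prompt": "" if prompt is None else _text(prompt),
            "choices": [(c.get("next", ""), _text(c)) for c in menu.findall("choice")],
        })
    return menus

def generate_structure(menus):
    """Structure generator: build an empty card/select skeleton for each menu."""
    wml = ET.Element("wml")
    for m in menus:
        card = ET.SubElement(wml, "card", id=m["id"])
        para = ET.SubElement(card, "p")
        select = ET.SubElement(para, "select")
        for target, _label in m["choices"]:
            ET.SubElement(select, "option", onpick=target)
    return wml

def combine_prompts(wml, menus):
    """Text prompt combiner: insert prompt and choice text into the skeleton."""
    for card, m in zip(wml.findall("card"), menus):
        para = card.find("p")
        para.text = m["prompt"]
        for option, (_target, label) in zip(para.find("select"), m["choices"]):
            option.text = label
    return wml

def translate(vxml_text):
    """Run the three stages and return the visual-based script as a string."""
    menus = extract_menus(vxml_text)
    wml = combine_prompts(generate_structure(menus), menus)
    return ET.tostring(wml, encoding="unicode")

if __name__ == "__main__":
    print(translate(SAMPLE_VXML))
```

In this sketch the structure is generated first and the text is merged in afterwards, which mirrors the separation between the structure generator and the text prompt combiner; a script with recorded audio prompts would need the speech recognition routine to supply the text before the combining step.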