Visual recognition of user interface objects on a computer is used to recognize and localize objects on a computer screen, such as input fields, buttons, icons, check boxes, text, and/or any other basic elements. A system captures the screen to an image, analyzes the image, and creates a layout of new virtual objects representing the screen. The system captures the screen as a bitmap periodically, like a movie camera. From the bitmap, the system generates lists of the lines found on the screen, in which each line has properties such as length, color, starting point, and angle. From the lines, the system creates the rectangles found on the screen. From the bitmap, the system also searches for each text element on the screen and converts each text element to Unicode text. From the bitmap, the lines, the rectangles, and the text found on the screen, the system creates virtual objects that have a one-for-one correspondence with the objects found on the screen.
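
The line and rectangle steps described above can be illustrated with a minimal sketch. The text does not specify a detection algorithm, so everything here is an assumption made for illustration: the bitmap is modeled as a list of rows of integer color values with 0 as background, only horizontal (angle 0) and vertical (angle 90) lines are detected, and the names `Line`, `Rect`, `find_lines`, and `find_rectangles` are hypothetical. Screen capture and text recognition are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    x: int       # starting point, column
    y: int       # starting point, row
    length: int  # length in pixels
    color: int   # pixel value shared by the whole line
    angle: int   # 0 = horizontal, 90 = vertical (assumed convention)


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int
    color: int


def _runs(pixels, min_len):
    """Yield (start, length, color) for runs of equal, non-background pixels."""
    start = 0
    for i in range(1, len(pixels) + 1):
        if i == len(pixels) or pixels[i] != pixels[start]:
            if pixels[start] != 0 and i - start >= min_len:
                yield start, i - start, pixels[start]
            start = i


def find_lines(bitmap, min_len=3):
    """List horizontal and vertical lines of at least min_len pixels."""
    lines = []
    for y, row in enumerate(bitmap):
        for x, length, color in _runs(row, min_len):
            lines.append(Line(x, y, length, color, 0))
    for x, column in enumerate(zip(*bitmap)):  # transpose to scan columns
        for y, length, color in _runs(column, min_len):
            lines.append(Line(x, y, length, color, 90))
    return lines


def find_rectangles(lines):
    """Combine two horizontal and two vertical lines into a rectangle.

    Assumes the four edges share one color and meet exactly at the corners.
    """
    horizontal = [l for l in lines if l.angle == 0]
    vertical = {(l.x, l.y, l.length, l.color) for l in lines if l.angle == 90}
    rects = []
    for top in horizontal:
        for bottom in horizontal:
            same_span = (bottom.x, bottom.length, bottom.color) == (
                top.x, top.length, top.color)
            if not same_span or bottom.y <= top.y:
                continue
            height = bottom.y - top.y + 1
            left = (top.x, top.y, height, top.color)
            right = (top.x + top.length - 1, top.y, height, top.color)
            if left in vertical and right in vertical:
                rects.append(Rect(top.x, top.y, top.length, height, top.color))
    return rects


# A 6x5 bitmap containing the outline of one rectangle drawn in color 1.
demo = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(find_rectangles(find_lines(demo)))
# prints [Rect(x=1, y=1, width=4, height=3, color=1)]
```

A real screen would need tolerance for anti-aliasing, color gradients, filled regions, and edges that extend past the corners, none of which this sketch handles. Each resulting rectangle, together with any recognized text inside it, would be a candidate for one virtual object.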