A system for playing SMIL-based multimedia content, comprising: a plurality of SMIL engines for parsing and interpreting SMIL documents, and for communicating with and controlling SMIL sub-engines, remote media proxies, or local media playing devices; and a plurality of remote media proxies for receiving instructions from upper-level SMIL engines, starting or stopping the provision of media objects to remote media playing devices, sending back events, and providing basic user interaction capabilities, wherein said plurality of SMIL engines, said plurality of remote media proxies, and the local and remote media playing devices form a tree-like structure, of which the root node is a SMIL engine, the branch nodes are SMIL engines and remote media proxies, and the leaf nodes are local and remote media playing devices. Corresponding SMIL engines and methods are also provided. The present invention enables SMIL-based multimedia content to be played on a set of pervasive computing (PvC) devices, which can be dynamically configured on demand as a new multimedia terminal.
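The tree-like structure described above can be illustrated with a minimal sketch. It is not the claimed implementation: the class and function names (`Kind`, `Node`, `is_valid_tree`, `build_example`), the device names, and the `mediaEnd` event label are all hypothetical, and the rule that an engine or proxy with no children makes the tree invalid is an assumption drawn from the statement that leaf nodes are playing devices.

```python
from enum import Enum


class Kind(Enum):
    SMIL_ENGINE = "SMIL engine"
    REMOTE_PROXY = "remote media proxy"
    LOCAL_DEVICE = "local media playing device"
    REMOTE_DEVICE = "remote media playing device"


DEVICE_KINDS = {Kind.LOCAL_DEVICE, Kind.REMOTE_DEVICE}

# Which kinds of child each kind of node may control (read from the abstract).
ALLOWED_CHILDREN = {
    Kind.SMIL_ENGINE: {Kind.SMIL_ENGINE, Kind.REMOTE_PROXY, Kind.LOCAL_DEVICE},
    Kind.REMOTE_PROXY: {Kind.REMOTE_DEVICE},
    Kind.LOCAL_DEVICE: set(),
    Kind.REMOTE_DEVICE: set(),
}


class Node:
    def __init__(self, name, kind):
        self.name = name
        self.kind = kind
        self.parent = None
        self.children = []
        self.events = []  # events collected here when this node is the root

    def add(self, child):
        """Attach a child, rejecting combinations the structure does not allow."""
        if child.kind not in ALLOWED_CHILDREN[self.kind]:
            raise ValueError(
                f"a {self.kind.value} cannot control a {child.kind.value}"
            )
        child.parent = self
        self.children.append(child)
        return child

    def report(self, event):
        """Send an event back up the tree; the root engine records it."""
        node = self
        while node.parent is not None:
            node = node.parent
        node.events.append((self.name, event))


def is_valid_tree(root):
    """Root must be a SMIL engine and every leaf must be a playing device."""
    if root.kind is not Kind.SMIL_ENGINE or root.parent is not None:
        return False
    stack = [root]
    while stack:
        node = stack.pop()
        if not node.children and node.kind not in DEVICE_KINDS:
            return False  # assumption: a childless engine or proxy is not a valid leaf
        stack.extend(node.children)
    return True


def build_example():
    """Build a small hypothetical terminal: one root, one sub-engine, one proxy."""
    root = Node("root-engine", Kind.SMIL_ENGINE)
    root.add(Node("local-speaker", Kind.LOCAL_DEVICE))
    sub = root.add(Node("sub-engine", Kind.SMIL_ENGINE))
    sub.add(Node("local-display", Kind.LOCAL_DEVICE))
    proxy = root.add(Node("proxy", Kind.REMOTE_PROXY))
    remote = proxy.add(Node("remote-screen", Kind.REMOTE_DEVICE))
    return root, remote
```

In this sketch, instructions would flow downward through `children`, while `report` models events travelling back up to the root SMIL engine. Because nodes are attached at run time with `add`, a different set of devices can be assembled into a new tree on demand.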