Currently, <caption> does not render the caption as part of the image; it only outputs text below the image or other media. MultimediaViewer does not pick this up, as it expects the caption to be part of the image itself, à la [[File:filename|caption]]. It would be beneficial to have the contents of <caption> (safely) embedded in the media itself, if possible.
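
To illustrate what "safely" could mean here, the following is a minimal Python sketch, not the extension's actual code. It assumes the caption is first reduced to plain text and that markup-significant characters are replaced by character entities before the text is placed in the [[File:filename|caption]] position. The function name `build_file_link` is hypothetical, and the filename is assumed to be already validated.

```python
import html
import re

# Characters that carry meaning in wikitext links and templates,
# mapped to numeric character entities (an assumed, non-exhaustive list).
WIKITEXT_ENTITIES = (
    ("[", "&#91;"),
    ("]", "&#93;"),
    ("|", "&#124;"),
    ("{", "&#123;"),
    ("}", "&#125;"),
)


def build_file_link(filename: str, caption: str) -> str:
    """Hypothetical helper: return [[File:filename|caption]] with the caption neutralised.

    The filename is assumed to be validated elsewhere; only the caption is escaped.
    """
    # Collapse runs of whitespace (including newlines) into single spaces.
    text = re.sub(r"\s+", " ", caption).strip()
    # Escape &, < and > so the caption cannot introduce HTML markup.
    text = html.escape(text, quote=False)
    # Escape wikitext-significant characters after the HTML pass,
    # so the entities produced here are not escaped a second time.
    for char, entity in WIKITEXT_ENTITIES:
        text = text.replace(char, entity)
    return f"[[File:{filename}|{text}]]"


if __name__ == "__main__":
    print(build_file_link("Example.jpg", "A <b>cat</b> | [[x]]"))
```

Escaping everything is the most conservative choice; a real implementation might instead want to allow a limited subset of inline formatting in captions, which this sketch does not attempt.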