How do you really integrate multiple media?

This has been an interesting journey through the use of media for instruction. When I started the course, I thought of multimedia as referring simply to video or animation with sound. I had very little experience with instruction via video or animation and did not feel comfortable with it, which was the reason I wanted to take this course. It had not occurred to me that the term “media” referred to the medium by which instruction is delivered, and could include any medium, from text to still images to audio to live-action video. The term “multimedia”, therefore, just means the integration of two or more of these media. By that definition, I’ve been creating multimedia instruction all along. What gave me real problems in this course was single-medium instruction, and then learning to incorporate multiple media into one integrated track rather than two separate tracks.
That concept is still a bit confusing to me. To me, the image should reinforce the text, but when I use an image that reinforces the text, what keeps it from becoming a separate track of instruction? What good are images if they create a secondary track that the reader could choose to follow instead of the text-based instructions? Why use text or images at all if you have the option to use video? I am still not sure how you create one integrated piece of instruction without overlap. Nothing I’ve read and no example I’ve seen has made this any clearer to me.

