Indeed, it is just an exercise, so why not follow the instructions? The images don’t have to be doing any funny motions; they just have to be connected.
The images don’t move, but you can change your point of view to make the scene more memorable, for example by rotating it (rotating the “connected images” as a whole, not each image independently).

(this image is from another post of mine from almost 2 years ago)
And the difference…
Free association method: You generate the images on the spot by “free association”, like the things you mentioned (saucer, piece of meat, bottle).
Peg list: A list of items you have memorized beforehand, which you then use to memorize other things by connecting them to the items on your list.