It's pretty simple, depending on what you want to do, and you can learn a lot just by looking at how images are already inserted into existing pages. For instance, from the Puzzle Creator page:
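Something along these lines. This is only a sketch using standard MediaWiki image syntax; "Example.jpg" is a placeholder, not the actual file used on that page:

```
[[File:Example.jpg|thumb|100px|left|text]]
```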
That is a standalone image, displayed as a thumbnail 100 pixels wide, left-aligned on the screen, with the word "text" underneath. Fairly basic. We use those to dress up articles when the image needs a caption saying what it is. For more obvious things, you can remove the "thumb" and get something like this from the Chell page:
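Roughly like this. Again, it's a sketch with a placeholder file name and size rather than the exact line from that page; the only real difference from a thumbnail is that the "thumb" keyword is gone:

```
[[File:Example.jpg|100px|left|text]]
```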
That ties the text to the image itself rather than putting it in the caption field the thumbnail provides. In my experience we use this more often than the thumbnail here, since most images are pretty self-explanatory. Finally, we have the gallery template, as seen on the GLaDOS page (with some things removed for the sake of space here):
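In outline it looks something like the following. Treat this as a hedged sketch: the template name, the exact parameter spellings, the values, and the file names are all assumptions put together for illustration, so check the template's own page for the real syntax:

```
{{Gallery
|title=Example gallery
|lines=2
|width=150
|height=150
|File:Example1.jpg
|alt1=Hover text for the first image
|Caption shown under the first image
|File:Example2.jpg
|alt2=Hover text for the second image
|Caption shown under the second image
}}
```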
You can read about it in more depth on the template's own page, but in short: "Title" is what the gallery is called (not every gallery needs one); "lines" is how many lines of text the space under each image should support; height and width set the size of the images themselves (and thus, ultimately, how physically large the template is); and the lines listing File:xxxxxxxxxx.jpg are where the images themselves go. The alt1, alt2, alt3 (etc.) text is what is shown when the user mouses over the image, and the text after that is what is shown under it in the box.