Image-handling (Part 1)
In this post, I want to explore some thinking I’ve been doing about how to handle images in what has, to date, been a virtually text-only blog.
Problem(s)
My whole blogging workflow has been built on managing the content of posts in a channel-neutral way, so that I can switch platforms relatively easily whenever I need to. At the time of writing I am using Scriptogram. Scriptogram does not allow for storage of images so immediately this has limited what I have been able to post! Some of the drafts I have written require images in order to make sense.
However, it’s also been quite a useful problem, as it’s made me consider this area properly and raised a number of issues, chief among them hotlinking.
Hotlinking
Up until this post, only one image has appeared in my blog: the xkcd image I used in my post on writing user stories.
Even though the image is hosted outside my own set-up, I was careful to check the terms of usage on xkcd before including the image I wanted to reference in my own post. As it happens, xkcd uses the Creative Commons Attribution-NonCommercial 2.5 License meaning, in layman’s terms, that it’s fine to reuse the image so long as it’s attributed and its use is non-commercial. Check and check. But that wasn’t my only concern.
I also wasn’t sure where to put the image myself if I was to use it. It turns out that xkcd is fine with hotlinking, q.v.:
hotlinking with <img> is fine
from the aforementioned terms of usage page.
But I know that this isn’t the case for every site, and understandably so, for a number of reasons.
Firstly, by taking the URL for someone’s image and simply dropping it into an img tag’s src attribute, one takes the image out of context. This means that the image isn’t seen as its creator perhaps intended, nor as the site or article’s author wanted to use it. Reuse of this sort is covered by licenses like the Creative Commons one used by xkcd, which helps to make this particular issue less contentious.
The second reason, however, is more technical. In using an image in this way, hosted on their site but displayed on yours, you are using their bandwidth. This is known, amongst other things, as hotlinking and can be considered bad form.
Not only are these concerns ones I want to be conscious of whenever I use someone’s image, I also want to be mindful of the best way to approach managing and hosting my own images so that I don’t become “victim” to the same issues.
Solution(s)
Image shorthand conventions
First of all, I need to decide on a set of conventions for referencing images within my posts, in the same way that I did for links in my previous post on cross-linking within my blog, which lets me refer back to earlier posts.
The fact that they are a similar problem means there should be a similar solution. The section on images in John Gruber’s original outline of Markdown syntax describes a convention whereby the alt text appears between a pair of square brackets and the address of the image itself is slotted into the parentheses that follow, much like a link’s address follows the text to which the link is applied. Images are differentiated by a preceding exclamation mark, so that this Markdown code…
![The place I'd least like to live is the farm in the background of those diagrams showing how tornadoes form.](http://imgs.xkcd.com/comics/geography.png)
…should produce this image:
I’ve gone with an xkcd image again because, as shown earlier, I know the site’s author Randall Munroe is cool with hotlinking. Also, it’s a nice image. (You’ll notice I’ve used xkcd’s original alt text here, out of politeness and respect for context.)
When it comes to hosting and referencing my own images, at present I don’t see why I cannot just use a shorthand similar to the one I came up with for cross-linking, e.g.:
![Image of a cat.](/!/image-of-a-cat.jpg)
Instead of using the /$/ which I use to indicate that a link is to a local post, I use /!/, in keeping with the exclamation mark used for images in Markdown itself, to indicate that the image is local.
Location mapping by service
Having decided on a convention for local images, I then need a way of translating it to the actual location of the image for when it is rendered or published.
First of all, I create a folder called images at the same level as my drafts and archives folders. I add a README file to this directory to explain its purpose for those interested in the abstract workflow and, once this is tracked by Git, I add a note to the .gitignore file so that any images I add will not be tracked by the workflow repository. As with the drafts and archives folders, the images folder is just a placeholder as far as the remote repository is concerned, ready for people to populate with their own images and content.
Second, I put an image in this folder. I pick one used in my first ever blog post for Small Boats, the blog I set up on Blogger years ago for book reviews, musings and opinions of various kinds. (That was the intention; as it happened I never blogged very regularly.) I call this image mongolian-flower.jpg, since it is in fact a picture of a flower I took in Mongolia back in 2009. So far, so good.
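As a rough sketch of that placeholder set-up, the function below creates the folder, drops in a README and adds ignore rules. The function name setup_images_dir, the file name README and its wording are all assumptions for illustration. It also uses a negated ignore rule to keep the README visible to Git, which is one possible approach rather than necessarily the note described above.

```shell
#!/usr/bin/env bash
# setup_images_dir REPO
# Hypothetical helper: creates REPO/images with a README and tells Git to
# ignore everything else that is later placed in that folder.
setup_images_dir() {
    local repo=$1
    mkdir -p "$repo/images"
    # Assumed README name and text; adjust to taste.
    printf '%s\n' 'Put the images referenced by your posts in this folder.' \
        > "$repo/images/README"
    # Ignore the folder's contents, but keep the README trackable.
    printf '%s\n' 'images/*' '!images/README' >> "$repo/.gitignore"
}
```

Calling setup_images_dir . from the root of the workflow repository would leave images/README as the only file in that folder Git pays attention to.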
Now if I want the image to appear in my blog, I want to be able to simply include the following line in my Markdown:
![A Mongolian flower.](/!/mongolian-flower.jpg)
That way I know, by convention, where to find the file locally.
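Because the shorthand maps straight onto a file name, it is possible to check a draft against the images folder before publishing. The following is only a sketch: the function name check_local_images is made up for this example, and it assumes keys consist of lowercase letters, digits, dots and hyphens.

```shell
#!/usr/bin/env bash
# check_local_images POST [IMAGE_DIR]
# Hypothetical helper: reports every /!/ image referenced in POST that has
# no matching file in IMAGE_DIR (default: images). Returns 1 if any is missing.
check_local_images() {
    local post=$1 image_dir=${2:-images} key missing=0
    while IFS= read -r key; do
        if [[ ! -f $image_dir/$key ]]; then
            printf 'missing: %s\n' "$key" >&2
            missing=1
        fi
    # grep pulls out each "](/!/name)" fragment; sed strips it down to the name.
    done < <(grep -oE '\]\(/!/[0-9a-z.-]+\)' "$post" | sed -E 's|^\]\(/!/||; s|\)$||')
    return "$missing"
}
```

Running check_local_images drafts/some-post.md (a made-up path) would print a line to standard error for each referenced image that is not yet in the folder.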
However, I also need to put the image in a place accessible for visitors to my site. That is, it cannot just sit locally.
For example, I could upload it to a Picasa account, or Flickr or potentially Pinterest. There are a number of image services where I could put the image, but I don’t want to dwell on that now as that’s something I’ll look at in more detail in another post.
As it happens, I know that my image is online because I already used it on another blog. So I need a map that I can use to help the translation of my shorthand /!/mongolian-flower.jpg to the publicly accessible address online.
I create a comma-delimited file called imagemap.csv and in that I include a key-value pair, namely:
mongolian-flower.jpg, http://3.bp.blogspot.com/_ojzS0-U0uTw/SxbrmVPxAEI/AAAAAAAAAGs/lLNNTtSNcCk/s400/IMG_4335.JPG
Having a comma-delimited file with information for one image per line allows for the potential to add other properties and attributes later if required. For now though, everything before the comma is the reference I use in my Markdown, the “key”, while everything after is the address of the image online.
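To make the key and value roles concrete, here is a minimal sketch of a one-off lookup against such a file. The function name lookup_image and the default path images/imagemap.csv are assumptions for this example. It treats everything after the first comma as the address and trims the space that follows the comma.

```shell
#!/usr/bin/env bash
# lookup_image KEY [MAP_FILE]
# Hypothetical helper: prints the address stored for KEY in a "key, address"
# file and returns 1 if the key is not present.
lookup_image() {
    local wanted=$1 map_file=${2:-images/imagemap.csv} key address
    while IFS=, read -r key address || [[ -n $key ]]; do
        # Trim the leading whitespace left after the comma.
        address=${address#"${address%%[![:space:]]*}"}
        if [[ $key == "$wanted" ]]; then
            printf '%s\n' "$address"
            return 0
        fi
    done < "$map_file"
    return 1
}
```

With the single entry shown earlier in the file, lookup_image mongolian-flower.jpg would print the Blogger address, while an unknown key prints nothing and returns a non-zero status.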
Publishing
At the point of publishing, I then convert the key to the actual address so that the image will appear online. To do this, I need to add to my publishing script.
At the moment, I only have a publishing script for Scriptogram (autoscrp.sh) so I’m going to focus on adding to that.
First, I add some script that will read imagemap.csv and build an associative array:
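echo "Creating image map..."
declare -A imagemap
# Each line is "key, address": split on the first comma only.
while IFS=, read -r key address || [[ -n $key ]]; do
    [[ -n $key ]] || continue                         # skip blank lines
    address=${address#"${address%%[![:space:]]*}"}    # trim leading whitespace
    imagemap[$key]=$address
done < "$DIR"/images/imagemap.csv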
Next, I need to parse the file being published and replace any keys with the corresponding values.
While my link replacement was a one-liner, this proved trickier. I only want to capture image placements that Markdown will actually operate on and not mistakenly pick up those that sit in blockquotes or code examples.
Sadly, it took me all afternoon and a visit to StackOverflow, where a useful hint on where I was going wrong gave me the answer I needed. The resultant script is this, and it’s pure Bash (no sed, no awk, no pandoc, etc.):
re='(.*^\!\[[[:print:]]+\]\()\/\!\/([0-9a-z\.\-]+)(\).*)'
while [[ $fileparsed =~ $re ]]; do
    fileparsed=${BASH_REMATCH[1]}${imagemap[${BASH_REMATCH[2]}]}${BASH_REMATCH[3]}
done
These two bits of script should ensure that I can keep a local copy of the image, reference a hosted copy and have my publishing process take care of the difference.
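For comparison, here is a sketch of a different way to get a similar effect, processing the post one line at a time rather than looping over the whole file with a single regular expression. It is not the script above: the function name expand_images, the simplified pattern and the example.com address are all assumptions for illustration. Only lines that begin with an image are rewritten, so blockquoted or indented examples are left alone, as are keys that are missing from the map.

```shell
#!/usr/bin/env bash
# Hypothetical map; in the real workflow this would be loaded from imagemap.csv.
declare -A imagemap=(
    [mongolian-flower.jpg]="http://example.com/hosted/mongolian-flower.jpg"
)

# A Markdown image at the very start of a line whose address uses /!/.
# Group 1: "![alt](", group 2: the key, group 3: ")" plus the rest of the line.
re='^(!\[[^]]*\]\()/!/([0-9a-z.-]+)(\).*)$'

# expand_images: reads a post on standard input, writes it to standard output
# with known /!/ keys replaced by their hosted addresses.
expand_images() {
    local line key
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $re ]]; then
            key=${BASH_REMATCH[2]}
            # Leave the line untouched if the key is not in the map.
            if [[ -n ${imagemap[$key]:-} ]]; then
                line=${BASH_REMATCH[1]}${imagemap[$key]}${BASH_REMATCH[3]}
            fi
        fi
        printf '%s\n' "$line"
    done
}
```

It could be used as expand_images < post.md > post-published.md, with both file names being placeholders. One limitation of this sketch is that an image line starting at the first column inside a fenced code block would still be rewritten.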
Result(s)!
Here’s the picture:
Included with the line:
![A Mongolian flower.](/!/mongolian-flower.jpg)
Credit to Avinash Raj on StackOverflow for helping me sort out the final details.
What next?
It’s not over yet. Next, I would like to actually automate the hosting itself. But to do that, I’ll need to look into services that will both host my image and give me API access so that I can upload images as part of the publishing process.