How Do Non-Geeks Solve Such Problems?

My photo organization sucks. Finding an image is too difficult, and it requires an Internet connection. Had I started this expat adventure with a laptop, a decent camera and the skill to post-process photos, I would not have chosen the current scheme.

The first thing was to identify which photos are linked from this blog to Photobucket; the next was to download them. WordPress doesn’t provide URLs for linked media, but I back up the blog as HTML files to my hard drive. All I had to do was load each backed-up page in a browser, click on the image to get the full-size one, right click, save as, and I'd have a local copy.

It can all be done by hand, but there are 280 linked photos. The only part of that I want to do by hand is writing the script to automate it.

<set geek on>
I grepped the blog backup for lines containing JPEGs, used sed to add delimiters around the URLs, loaded the result into a spreadsheet, which put the URLs all into one column, deleted the other columns, saved a file of the 280 URLs, then used wget to retrieve and save the files.
<set geek off>
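
For the curious, the same job fits in one short script, without the spreadsheet detour. This is only a sketch, and it is in Python rather than the grep, sed and wget I actually used. The function names, the regular expression and the photos output folder are all made up for illustration, and it assumes the backup is a folder of .html files containing absolute image URLs.

```python
import re
import sys
import urllib.request
from pathlib import Path

# Assumption: image links are absolute http(s) URLs ending in .jpg or .jpeg.
JPEG_URL = re.compile(r"""https?://[^\s"'<>]+?\.jpe?g""", re.IGNORECASE)


def extract_jpeg_urls(html):
    """Return the unique JPEG URLs found in html, in order of first appearance."""
    return list(dict.fromkeys(JPEG_URL.findall(html)))


def download_all(urls, dest):
    """Save each URL into the dest folder, named after the last part of the URL."""
    dest.mkdir(parents=True, exist_ok=True)
    for url in urls:
        # Two URLs with the same file name would overwrite each other here.
        target = dest / url.rsplit("/", 1)[-1]
        urllib.request.urlretrieve(url, str(target))


if __name__ == "__main__" and len(sys.argv) > 1:
    backup_dir = Path(sys.argv[1])  # folder holding the backed-up pages
    pages = sorted(backup_dir.rglob("*.html"))
    html = "\n".join(p.read_text(errors="ignore") for p in pages)
    urls = extract_jpeg_urls(html)
    print(f"Found {len(urls)} JPEG URLs")
    download_all(urls, Path("photos"))
```

Saved under a hypothetical name like fetch_photos.py, it would be run as `python fetch_photos.py path/to/blog-backup`. Thumbnails and full-size images both end in .jpg, so the list may still need pruning before the download step.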

It was a slow, ugly kludge. I used the spreadsheet only because my sed skills have deteriorated from lack of use. When such a process is done well, it can be entertaining. Don’t believe me? In The Social Network, the Mark Zuckerberg character does what he calls ‘a little wget magic’ to retrieve students’ photos from the housing websites. I can’t link to YouTube videos that might be in violation of copyright, so you’ll have to go there to see it. The video title is ‘The Social Network Hacking Scene’, posted by user 242mokkahei. It is 1m 26s long.


