Draft – any suggestions: Google’s cache and its limitations as a backup
This is a draft: Got any thoughts, corrections etc? Let me know (credit will be given, naturally).
I overwrote my wp-content folder as part of my WordPress upgrade - deleting all my blog's images, CSS and theme PHP files. I'll be returning to that ...
Anyway, I, and a few others who answered my plaintive plea for help, thought using the Google cache (the copy of your webpage that it stores) might help. Here's what it can and can't help with as a backup if you've ever deleted your whole website ...
Getting the words back
You can get all your words back from Google's cache. (This was no benefit to me, as I still had all the posts in my database. But if you need to get the words back, read on.)
Obviously, if Google visits your site and finds nothing there, eventually it will replace the version in its cache with ... nothing.
So if you've deleted the words (either static webpages or your underlying database), visit the Google cache asap and start downloading / copying everything you wrote.
If you're more advanced, there are scripts that automate this process. I haven't tried any of them, but if you have a lot of pages to recover, try this one: Retrieving Google's cache for a whole website.
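Once you've saved the cached pages to disk, you still have to dig the words out of the HTML. Here's a rough sketch of how that might be done in Python, using only the standard library. It assumes you've already saved each cached page as an HTML file; the names TextExtractor and extract_text are my own inventions, and it makes no attempt to separate your post from Google's cache banner or your sidebar, so expect to tidy up by hand.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of a page, ignoring scripts and styles."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

To use it, read a saved file into a string (for example with open("saved-page.html", encoding="utf-8").read(), where the file name is just a placeholder) and pass that string to extract_text.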
Getting the images back
Google's cache doesn't store images. When you look at a cached page and see them, they're either being fetched from your server or being served up by some other cache (probably your browser's). The only exception is the thumbnail you see in a Google Images search, which is a file Google stores itself. So for smaller images this might be usable, if you can construct a Google Images search that makes the picture you want appear.
Anyway, if you see your images in Google's cache, save them asap (right click and save as or just drag to your desktop). Don't leave the page and come back later as they might not be there.
If you don't see images in Google's cache, you could try your browser cache (I'm assuming as it's your site, you've visited the page in question at some point). There's some advice on doing this (I've never tried it) here:
There's some technical stuff about IE here.
Note for WordPress users
I found a large proportion of my images lying around my hard disk and trash folder. Hooray. So I rapidly FTPed them across. Here are some things to watch out for if you are this lucky:
- Folders: Remember, when you upload images they go into a certain folder. Make sure you put them in the right one (probably a date-based folder structure in your wp-content/uploads/ directory).
- File names: If you have multiple images with the same name (e.g. image53.jpg), WordPress adds a -2, -3 etc. when you upload later versions to the same folder (so you don't overwrite them). Your originals on your hard drive will all have the same name. You'll have to untangle this by manually renaming your versions. I have vowed to always give my images proper names from now on, as I had about 11 called image5.jpg in 11 different folders. Grr.
- Sizes: If you use the WordPress image resizer (i.e. you choose a medium or large image as part of the upload), WordPress saves the resized copy as name-WIDTHxHEIGHT (e.g. image5-300x225.jpg). Your original will be just name. Unless the image is massive, the easiest way round this is to edit the HTML view of a particular page/post in WordPress and strip the dimensions out of the file name. The blog reader's browser will resize the image anyway (not ideal, but better than nothing).
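If you have a lot of posts, stripping the dimensions out by hand gets old fast. Here's a minimal sketch of doing it with a regular expression in Python. It assumes the resized copies follow the usual name-WIDTHxHEIGHT pattern and that your images are JPG, PNG or GIF; the function name strip_size_suffix and the example path are made up for illustration. Be warned that it will also rewrite any original that genuinely has a name like poster-10x15.jpg, so check the result before pasting it back into a post.

```python
import re

# Matches a "-300x225" style suffix that sits directly before an image extension.
SIZE_SUFFIX = re.compile(r"-\d+x\d+(?=\.(?:jpe?g|png|gif)\b)", re.IGNORECASE)


def strip_size_suffix(html):
    """Point resized-image references back at the original file name."""
    return SIZE_SUFFIX.sub("", html)


# Hypothetical example: the folder and file name are placeholders.
before = '<img src="/wp-content/uploads/2009/03/image5-300x225.jpg" />'
after = strip_size_suffix(before)
```

Note that this leaves the -2, -3 style suffixes alone, since those are part of the real file name on the server rather than a size.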
Getting the CSS back
Google doesn't store the CSS. You're stuffed here as far as I can tell unless you've saved a version somewhere.
Getting your theme back
If you've lost your theme (i.e. the files that generate the pages and pull in the post, your tags, categories and so on), then it's impossible to get that back from the Google cache (despite this April Fools' joke about a server-side decompiler in IE8.1).
If you've deleted a webpage or website, I hope this helps you get it back.
The Google cache isn't magic, and it's most useful for retrieving words. The best advice, of course, is to back up often, and especially before an upgrade. Even better advice for WordPress would be to change the upgrade process so that core files which need upgrading are kept separate from user-uploaded ones ...