Image-space algorithms have one huge benefit: their cost is independent of the overall complexity of the scene. As proof, putting a ~35k-face model in the middle of the scene does not affect the frames-per-second count.
In my experience, what seems to drain GPU power is (in ascending order of heaviness):
- multisampling - useful for sorting-independent transparency; it isn't free, but there's no better choice.
- accumulation buffers - I use them extensively for assembling lit parts, shadows, and translucency maps; this could probably be done faster with blending or with shaders. I can work on it.
- state changes - the most surprising one: a hidden performance bloodsucker, and the only one whose cost isn't proportional to the viewport resolution; it requires careful programming.
- convolution filters - god, despite the fact that they run entirely on the GPU, being programmed via shaders, they're slow as hell. OK, maybe convolving a multisampled 800x450 window with a 5x5 kernel on a two-year-old video card isn't a great idea, but there must be a way to improve this performance.
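
One possible way to cut the convolution cost, assuming the 5x5 kernel is separable (a Gaussian or box blur is; an arbitrary kernel is not): split it into a horizontal and a vertical 1D pass, which drops the work from 25 texture taps per pixel to 10. The sketch below is plain Python rather than shader code, and the names (`TAPS`, `convolve_2d`, `convolve_separable`) and the binomial kernel are made up for illustration; it only checks on the CPU that the two-pass version gives the same image as the full 2D one.

```python
# Assumption: a 5-tap binomial (Gaussian-like) kernel, chosen for illustration.
TAPS = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]
# The equivalent full 5x5 kernel is the outer product of TAPS with itself.
KERNEL = [[a * b for b in TAPS] for a in TAPS]


def clamp(v, lo, hi):
    """Clamp-to-edge addressing, like GL_CLAMP_TO_EDGE on a texture."""
    return min(max(v, lo), hi)


def convolve_2d(image, kernel):
    """Single pass: k*k taps per pixel (25 for a 5x5 kernel).

    No kernel flip is done, which is harmless for a symmetric kernel.
    """
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(-r, r + 1):
                yy = clamp(y + j, 0, h - 1)
                for i in range(-r, r + 1):
                    xx = clamp(x + i, 0, w - 1)
                    acc += image[yy][xx] * kernel[j + r][i + r]
            out[y][x] = acc
    return out


def convolve_1d(image, taps, horizontal):
    """One 1D pass along rows (horizontal=True) or columns: k taps per pixel."""
    h, w = len(image), len(image[0])
    r = len(taps) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(-r, r + 1):
                if horizontal:
                    acc += image[y][clamp(x + i, 0, w - 1)] * taps[i + r]
                else:
                    acc += image[clamp(y + i, 0, h - 1)][x] * taps[i + r]
            out[y][x] = acc
    return out


def convolve_separable(image, taps):
    """Two passes: horizontal, then vertical (2*k taps per pixel in total)."""
    return convolve_1d(convolve_1d(image, taps, True), taps, False)


def taps_per_pixel(k):
    """Return (single-pass taps, two-pass taps) for a k x k separable kernel."""
    return k * k, 2 * k
```

On the GPU this would translate into two fragment-shader passes with an intermediate render target, so the saving in taps has to be weighed against one extra pass and, going by the list above, one more state change; whether it wins on a given card is something to measure, not assume.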