There are two common misconceptions about diffraction. The first is that diffraction can cause small pixels to actually have lower resolution than large pixels. The second is that once diffraction affects the image, there is no further benefit possible from smaller pixels.
The most frequently misunderstood factor in diminishing returns is diffraction. As pixel size decreases, there are two points of interest: one at which diffraction is just barely beginning to noticeably diminish returns (from 100% of the expected improvement, to, say, 90%); and another where the resolution improvement is immeasurably small (0%). One common mistake is to think both occur at the same time, but in reality they are very far apart.
Someone who shoots the 40D at f/5.6 will get the full benefit of upgrading to the 7D. The returns will be 100% of the theoretical maximum improvement. Someone who shoots the 40D at f/11 will *not* get the full improvement. The returns will be diminished to, say, 50%. Someone who shoots the 40D at f/64 (for DOF) will not get any increased resolution at all from the 7D. The returns have diminished to 0%.
Under no circumstances will the smaller pixel ever be worse. Usually it is at least somewhat better, but sometimes it is only the same. When the returns diminish to 0%, it means that the sensor already samples finely enough to capture everything up to the diffraction cutoff frequency (DCF). This is different from the Diffraction Limited Aperture (DLA).
Diffraction is always there. It's always the same, no matter what the pixel size. When the aperture is wider than the DLA, it means that the image is blurred so much by the large pixels that it's impossible to see the diffraction blur. Smaller pixels simply allow you to see the diffraction blur that was always there.
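To put rough numbers on that, here is a minimal Python sketch of the two blurs involved. It assumes an ideal circular aperture, a single wavelength (0.55 micron green, my choice), and a pixel modeled as a square aperture with 100% fill factor. Real cameras add AA filter and demosaic blur on top, so treat this as an illustration only. The function names are made up for this example.

```python
import math

def diffraction_mtf(freq_lp_mm, f_number, wavelength_um=0.55):
    """Contrast left by diffraction alone (ideal circular aperture).

    Assumption: monochromatic light at wavelength_um (0.55 um is my choice).
    Note that pixel size does not appear anywhere in this function.
    """
    cutoff = 1000.0 / (wavelength_um * f_number)  # lp/mm
    s = freq_lp_mm / cutoff
    if s >= 1.0:
        return 0.0  # nothing survives past the cutoff
    return (2.0 / math.pi) * (math.acos(s) - s * math.sqrt(1.0 - s * s))

def pixel_mtf(freq_lp_mm, pixel_pitch_um):
    """Contrast left by the pixel aperture alone.

    Assumption: square pixel with 100% fill factor (a sinc response).
    """
    x = freq_lp_mm * pixel_pitch_um / 1000.0  # cycles per pixel
    if x == 0.0:
        return 1.0
    return abs(math.sin(math.pi * x) / (math.pi * x))

def system_mtf(freq_lp_mm, f_number, pixel_pitch_um):
    """Combined contrast: the two blurs multiply."""
    return diffraction_mtf(freq_lp_mm, f_number) * pixel_mtf(freq_lp_mm, pixel_pitch_um)

if __name__ == "__main__":
    for pitch in (8.2, 6.4):
        print(pitch, "um pixel at f/8, 50 lp/mm:", round(system_mtf(50.0, 8.0, pitch), 3))
```

The diffraction term is identical for both pixel sizes. The only thing that changes is how much extra blur the pixel itself adds, and in this model the larger pixel always adds more.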
The DLA is the point at which diffraction *starts* to visibly affect the image. It is not the point at which further improvement is impossible (the DCF). For example, the diffraction cutoff frequency for f/18 (in green light) corresponds to roughly 4.3 micron pixels (the 7D). So if you use f/18, then you can upgrade to the 7D and still see a benefit. Likewise, if you compare the 50D and 7D at f/11, you'll see an improvement in resolution, even though the 50D DLA is f/7.6.
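If you want to check where the cutoff falls for your own camera, here is a rough sketch. It uses the standard cutoff for an ideal lens, 1/(wavelength x f-number), and compares it to the sensor's Nyquist frequency, 1/(2 x pixel pitch). The 0.55 micron wavelength is my assumption, and the exact answer shifts with the wavelength you pick, so expect results in the same neighborhood as the figures above rather than identical. The 5.7 micron pitch for the 40D is also my figure, not something stated earlier in this post.

```python
def diffraction_cutoff_lp_mm(f_number, wavelength_um=0.55):
    """Frequency above which an ideal lens passes zero contrast.

    Assumption: one wavelength, 0.55 um green by default (my choice).
    """
    return 1000.0 / (wavelength_um * f_number)

def nyquist_lp_mm(pixel_pitch_um):
    """Highest frequency the pixel grid can sample."""
    return 1000.0 / (2.0 * pixel_pitch_um)

def pitch_at_cutoff_um(f_number, wavelength_um=0.55):
    """Pixel pitch whose Nyquist frequency equals the diffraction cutoff."""
    return wavelength_um * f_number / 2.0

def smaller_pixels_still_help(pixel_pitch_um, f_number, wavelength_um=0.55):
    """True while the sensor's Nyquist frequency is still below the cutoff."""
    return nyquist_lp_mm(pixel_pitch_um) < diffraction_cutoff_lp_mm(f_number, wavelength_um)

if __name__ == "__main__":
    # 5.7 um is an assumed pitch for the 40D; 4.3 um is the 7D figure used above.
    for f_number in (5.6, 11, 18, 64):
        print(f"f/{f_number}: cutoff {diffraction_cutoff_lp_mm(f_number):.0f} lp/mm, "
              f"5.7 um sensor still gains: {smaller_pixels_still_help(5.7, f_number)}")
```

Under these assumptions a 5.7 micron sensor at f/11 is still well short of the cutoff, while at f/64 it is far past it. That is the gap between "returns starting to diminish" and "returns at 0%".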
Another important factor is that diffraction can be deconvolved in software! Normal sharpening helps, but specialized algorithms such as Richardson-Lucy are really impressive, and there are several free raw converters that include that option. There are two important limitations: it doesn't work well in the presence of high noise power (at the sampling frequency), and we don't have the phase information of the light waves. The practical result of these two factors is that RL deconvolution works great at ISO 100 for increasing contrast of frequencies below the diffraction cutoff frequency, but it cannot construct detail higher than the cutoff.
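For anyone curious what Richardson-Lucy actually does, here is a bare-bones 1D version in NumPy. This is only a sketch and not what any raw converter ships. The Gaussian blur is a stand-in I chose for the real diffraction pattern, the data is noiseless, and the function name and parameters are made up for the example.

```python
import numpy as np

def richardson_lucy_1d(observed, psf, iterations=200, eps=1e-12):
    """Minimal Richardson-Lucy deconvolution for a 1D signal.

    Assumptions: the PSF is known exactly and the data is noiseless.
    """
    psf = psf / psf.sum()          # the PSF must integrate to 1
    psf_flipped = psf[::-1]        # mirrored PSF for the correction step
    estimate = np.full(observed.shape, observed.mean(), dtype=float)
    for _ in range(iterations):
        reblurred = np.convolve(estimate, psf, mode="same")
        ratio = observed / (reblurred + eps)  # where the estimate is too dim or too bright
        estimate = estimate * np.convolve(ratio, psf_flipped, mode="same")
    return estimate

# Toy scene: two point sources 10 samples apart.
truth = np.zeros(101)
truth[45] = 1.0
truth[55] = 1.0

# Gaussian blur as a stand-in for the diffraction PSF (an assumption).
x = np.arange(-12, 13)
psf = np.exp(-(x ** 2) / (2.0 * 3.0 ** 2))
psf = psf / psf.sum()

blurred = np.convolve(truth, psf, mode="same")
restored = richardson_lucy_1d(blurred, psf)

if __name__ == "__main__":
    print("dip between the points, relative to peak:")
    print("  blurred :", blurred[50] / blurred.max())
    print("  restored:", restored[50] / restored.max())
```

The dip between the two points gets deeper after deconvolution, which is the contrast boost described above. Add noise to the blurred signal and the result degrades quickly, which is why this works best at low ISO.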
Even at f/11, the 7D offers *some* improvement over the 40D. Diffraction will never cause the 7D to have *worse* resolution. But in extreme circumstances (e.g. f/22+) it will only be the same, not better. At f/11, the returns will be diminished so that the 7D is only somewhat better. (If you use deconvolution software such as Richardson-Lucy, you can get those returns back.) In order to enjoy the full benefit of the additional resolution, one must avoid going past the DLA ("Diffraction Limited Aperture").
Let's compare the XT and 7D. The maximum theoretical improvement in linear resolution that would be possible going from 8 MP to 18 MP is 50% (sqrt(18/8) or 5184/3456). That means if the XT can resolve 57 lp/mm, then the 7D could resolve 85.5 lp/mm (50% higher). But that would only be true when you stay under the DLA. At f/5.6, you should be able to get the full 85.5 lp/mm. But at f/11, you will get something in the middle (say, 70 lp/mm). At f/18 you're back down to 57 lp/mm again. (For green light. Blue has less diffraction and red has more.)
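The 50% figure is easy to check yourself. Here is a small sketch, assuming both sensors are the same physical size and aspect ratio, so linear resolution scales with the square root of the pixel count. The function names are just for illustration.

```python
import math

def linear_gain(mp_old, mp_new):
    """Best-case linear resolution ratio between two pixel counts.

    Assumption: same sensor size and aspect ratio for both cameras.
    """
    return math.sqrt(mp_new / mp_old)

def best_case_resolution(lp_mm_old, mp_old, mp_new):
    """Scale a measured resolution by the best-case linear gain."""
    return lp_mm_old * linear_gain(mp_old, mp_new)

if __name__ == "__main__":
    print("gain from pixel counts:", linear_gain(8, 18))
    print("gain from image widths:", 5184 / 3456)
    print("best case from 57 lp/mm:", best_case_resolution(57, 8, 18))
```

Both routes (megapixels or image width) give the same ratio. This is only the ceiling; how much of it you actually get depends on the f-number.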
This comparison (thanks to The-Digital-Picture.com) is the 5D (12 MP) with the 1Ds Mark 3 (21 MP) using the EF 200mm f/2.8. The 5D has a much weaker AA filter, relative to the pixel size, than the 1Ds3, so that will skew the results in favor of larger pixels looking better.
I have simulated the same print size by re-sizing the center crops with a good algorithm. Do not examine the thumbnails below: you must click on the thumbnail to see the full sized image. (The thumbnails themselves are not intended for analysis.)
Set "f/5.6" is below: The 5D and 1Ds Mark III at f/5.6. There is no visible effect at all from diffraction in either camera. The aliasing/debayer artifacts (green and color patterns) are a natural result of the weakness of the anti-alias filter. As expected, the 1Ds Mark III, with over 50% more pixels, has higher resolution. This set establishes a baseline of how much improvement is possible when there is no diffraction at all. (Some people have a hard time seeing the difference between 12.8 MP and 21 MP, so look carefully.)
5D @ f/5.6 (link due to eight image limit)
Set "f/8" is below: The 5D and 1Ds Mark III at f/8.0. Diffraction is beginning to have a very slight effect here, which is noticeable on the 1Ds, but not the 5D. It is softening the very highest frequency of detail. The 5D's 8.2 micron pixels add too much of their own blur for the diffraction to be visible.
[IMAGE'S LINK: http://thebrownings.name ...ffraction/500-5d-f8.0.jpg]
[IMAGE'S LINK: http://thebrownings.name ...action/500-1dsm3-f8.0.jpg]
Set "f/11" below: The 5D and 1Ds Mark III at f/11.0. Now diffraction is very obvious, even in the 5D. But it's plain that the 1Ds Mark III's 6.4 micron pixels still resolve more detail.
[IMAGE'S LINK: http://thebrownings.name ...fraction/500-5d-f11.0.jpg]
[IMAGE'S LINK: http://thebrownings.name ...ction/500-1dsm3-f11.0.jpg]
Set "f/16" below: The 5D and 1Ds Mark III at f/16.0. This focal ratio results in a *lot* of diffraction, as you can see. However, you can still see that the 21 MP provides more detail than the 12 MP. The difference isn't as large as at f/5.6, above, but it's there. Returns have diminished, but not to 0%.
[IMAGE'S LINK: http://thebrownings.name ...fraction/500-5d-f16.0.jpg]
[IMAGE'S LINK: http://thebrownings.name ...ction/500-1dsm3-f16.0.jpg]
Furthermore, note that in all the cases above, the higher megapixel camera provided more contrast (MTF) in addition to the increased resolution. Yet this is with very little sharpening ("1" in DPP) applied. RL deconvolution would greatly increase the contrast in the diffraction limited images.
To summarize: smaller pixels always provide the same or higher resolution. The diffraction limited aperture (DLA) is not the same as the diffraction cutoff frequency (DCF). The cutoff is probably much higher than you think.