Although using the Fast Fourier Transform as a filter to remove specific low spatial frequency noise is very useful, it will also remove any wanted signal at the same frequency. So, like all noise filters, it will have an unwanted effect on certain elements of the image. The truth is also that using an FFT in this situation is never going to be a simple one-step solution. It is theoretically possible to produce an FFT filter for each individual camera body using a dark frame exposure, but that fails to take into consideration that the actual noise present is proportional to the relative level of exposure. So we are back to having to build the FFT filter from scratch for every exposure.
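To illustrate the point about losing wanted signal, here is a minimal sketch in Python. It uses a naive DFT (the same transform an FFT computes, just slower) on a single made-up row of 64 pixels. The row length, the frequency bins and the amplitudes are all assumptions chosen for illustration, not measurements from any camera.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform (O(n^2)); fine for a short demo row."""
    n = len(samples)
    return [
        sum(samples[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n))
        for f in range(n)
    ]

def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[f] * cmath.exp(2j * math.pi * f * k / n) for f in range(n)).real / n
        for k in range(n)
    ]

N = 64          # pixels in the made-up row
NOISE_BIN = 2   # assumed: the banding repeats twice across the row

# Banding noise, wanted detail at the SAME frequency, and wanted detail elsewhere.
banding = [5.0 * math.cos(2 * math.pi * NOISE_BIN * k / N) for k in range(N)]
wanted_same = [3.0 * math.sin(2 * math.pi * NOISE_BIN * k / N) for k in range(N)]
wanted_other = [4.0 * math.sin(2 * math.pi * 7 * k / N) for k in range(N)]

row = [b + s + o for b, s, o in zip(banding, wanted_same, wanted_other)]

spectrum = dft(row)
spectrum[NOISE_BIN] = 0       # notch out the banding frequency...
spectrum[N - NOISE_BIN] = 0   # ...and its mirror-image bin
filtered = idft(spectrum)

# Compare the filtered row with the detail at the other frequency on its own.
worst_error = max(abs(f - o) for f, o in zip(filtered, wanted_other))
print(f"largest difference from the bin-7 detail alone: {worst_error:.2e}")
```

The filtered row matches the bin-7 detail to within rounding error, which means the wanted detail sharing the banding frequency has gone along with the noise.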
I would also strongly disagree with your statements on the advantages provided by the Sony sensor design over the more traditional sensor designs that Canon are still using. Yes, noise is introduced at the level of the sensels, but remember that of the effectively three dimensions that the sensor is recording, it is directly digitising the signals in two of those dimensions. So it is only the intensity that is being recorded in an analogue manner from the sensor. As these signal voltages come out of the actual sensel on the sensor unit, the SNR is going to be about the same for any sensor of similar dimensions. Sensel area will define how much of an effect the random arrival of the photons adds to the noise in areas of constant tone. The same will also apply, but in reverse, to the amount of quantisation noise that might be introduced in areas of high image detail. There are other sources of noise as well, which are also going to be fairly constant regardless of manufacturer for any particular basic sensor technology.
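The effect of sensel area on photon noise can be put into rough numbers. This is only a sketch, assuming an ideal sensel where photon shot noise (Poisson statistics, noise equal to the square root of the mean count) is the only noise present. The photon counts are hypothetical.

```python
import math

def shot_noise_snr(mean_photons):
    """SNR when photon shot noise is the only noise: mean / sqrt(mean)."""
    return mean_photons / math.sqrt(mean_photons)

# Hypothetical photon counts for the same exposure on sensels of
# increasing area (each step collects four times the light).
for photons in (100, 400, 1600):
    print(photons, round(shot_noise_snr(photons), 1))
```

Each fourfold increase in collected light only doubles the SNR, and that holds whoever made the sensor.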
It is at the next stage that Sony has the advantage over other sensor manufacturers. The signal is passed to an analogue amplifier stage, which is used to produce the different ISO equivalent values. Generally for base ISO the gain is one, meaning there is no amplification. For each stop of ISO gain the analogue gain is doubled, so assuming a base ISO of 100, which would have a gain of 1, ISO 200 would have a gain of 2, ISO 400 a gain of 4, and so on. Remember though that you are doubling both the signal and the noise, so eventually you get to a point where the signal is so small that the inherent noise (the noise floor) is bigger than the signal. This is the point where you stop increasing the analogue gain, as it no longer gives you an advantage. From the amplifier the signal is passed to the analogue-to-digital converter (ADC), where it is converted into a number between 0 and 16383 (for current 14-bit ADCs) proportional to the number of photons arriving at that sensel during the exposure time. For base ISO that proportionality is about 1. The issue is that in passing from the sensel to the amplifier, and from the amplifier to the ADC, the signal travels along conductors. Unfortunately these conductors act as little antennas, and so pick up radio waves, which add small amounts of voltage to the signal. The longer the conductors, the more noise will be collected en route. What Sony have done is to move the analogue amplifiers and the ADCs onto the same bit of silicon as the sensels. This significantly reduces the length of the analogue interconnections, and so significantly reduces the amount of noise introduced into the signal.
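The gain and ADC arithmetic above can be sketched in a few lines of Python. The function names and the input signal and noise levels are made up for illustration; only the base ISO of 100, the doubling per stop and the 14-bit range come from the description above.

```python
BASE_ISO = 100   # base ISO assumed in the description above
ADC_BITS = 14    # current 14-bit ADCs

def analogue_gain(iso, base_iso=BASE_ISO):
    """Gain of 1 at base ISO, doubling with each stop."""
    return iso / base_iso

def adc_max_code(bits=ADC_BITS):
    """Largest number an ADC of the given bit depth can output."""
    return 2 ** bits - 1

def amplify(signal, noise, iso):
    """The amplifier scales whatever arrives at its input, noise included."""
    gain = analogue_gain(iso)
    return signal * gain, noise * gain

print(adc_max_code())
for iso in (100, 200, 400, 800):
    signal, noise = amplify(40.0, 2.0, iso)   # hypothetical input levels
    print(iso, analogue_gain(iso), signal / noise)
```

The ratio of signal to noise in the last column does not change with ISO, because any noise already present at the amplifier input is multiplied by the same gain as the signal.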
It seems that Sony have a patent for this idea, although integrating circuitry has long been a common aspect of electronic design. There are also drawbacks to adding complexity to any particular chip design: the more components you have, and the larger the area of the chip (the two are usually directly linked), the higher the failure rate during production. One reason that FF sensors will always be significantly more expensive than crop sensors is the size. Double the area of a chip and it is four times more likely to have a fault. So going from crop to FF, which is 2.56 times the area of a crop sensor, the failure rate will be about 6.5× higher. So adding extra components to the sensor chip is always a risky proposition.
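For anyone who wants to check the figures, here is the arithmetic as a short Python sketch. It takes the rule of thumb above (fault likelihood growing with the square of the area) at face value. Real yield also depends on defect density, which is not modelled here. The 1.6 crop factor is my assumption for where the 2.56 area figure comes from.

```python
CROP_FACTOR = 1.6  # assumed linear crop factor; 1.6 squared gives the 2.56 area figure

def area_ratio(crop_factor=CROP_FACTOR):
    """FF area as a multiple of crop sensor area."""
    return crop_factor ** 2

def relative_failure_rate(area_multiple):
    """Rule of thumb from the text: fault likelihood grows with area squared."""
    return area_multiple ** 2

print(round(area_ratio(), 2))                         # area multiple, crop to FF
print(relative_failure_rate(2))                       # doubling the area
print(round(relative_failure_rate(area_ratio()), 2))  # crop to FF
```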
As far as an optimal design for a digital sensor chip goes, you would want to be able to build the chip with the sensels on the front surface, and then use the rear surface to build both an analogue amplifier and an ADC unit directly behind each sensel. This would produce a sensor with the best possible SNR using current silicon-based technology. The downside would be massively more complex silicon designs, with the attendant increase in production failure rates pushing the costs up.
So the advantage of the Sony sensor is not that it has a clever system to reduce the noise; it simply adds less noise to the signal. No system of noise reduction will ever be as good as NOT adding the noise in the first place.
Alan