
FFT analysis

🔗jpehrson@...

8/9/2001 8:24:42 AM

--- In crazy_music@y..., jacky_ligon@y... wrote:

/crazy_music/topicId_unknown.html#996

> When you play a sound lower than the original, you can
> hear the overtones in the sound more clearly, and when you play in
> a scale derived from FFT analysis of the actual sound, you play
> chords that are sort of more in tune with the timbre.

Hi Jacky!

Could you please explain how you do the FFT analysis (Fast Fourier
Transforms, I think), what it is, and what software you use?? I know
you've explained this before, but I obviously need a "refresher..."

Thanks!

___________ __________ _____________
Joseph Pehrson

🔗jpehrson@...

8/9/2001 12:14:03 PM

--- In crazy_music@y..., jacky_ligon@y... wrote:

/crazy_music/topicId_unknown.html#1023

> --- In crazy_music@y..., jpehrson@r... wrote:
> > --- In crazy_music@y..., jacky_ligon@y... wrote:
> >
> > /crazy_music/topicId_unknown.html#996
> >
> > > When you play a sound lower than the original, you can
> > > hear the overtones in the sound more clearly, and when you play in
> > > a scale derived from FFT analysis of the actual sound, you play
> > > chords that are sort of more in tune with the timbre.
> >
> > Hi Jacky!
> >
> > Could you please explain how you do the FFT analysis (Fast Fourier
> > Transforms, I think), what it is, and what software you use?? I know
> > you've explained this before, but I obviously need a "refresher..."
> >
> > Thanks!
> >
> > ___________ __________ _____________
> > Joseph Pehrson
>
> Joseph,
>
> I use an app called Softest, and I use the Spectrogram for this.
>
> You can play the wave file, and see the partials of the sound in real
> time. Pitch is horizontal - amplitude is vertical. Then you can use
> this little cross-hair tool, which snaps to the partial you hold it
> over, and right click on it and copy the values to the clipboard
> (amplitude and frequency), then into Excel, where I convert it to
> cents. Very handy!
>
> Also, for Bell and Gong samples, I have used "Wavanal", which can do
> resynthesis, so you can hear the results of the FFT. Does a good job,
> and is freeware used by church bell tuners. Robert Walker turned me
> on to this one. Look it up in Google, and check it out. Works well
> with some other sounds too I've found. Has nice ability to export FFT
> as .par files for Excel.
>
> Jacky

Thanks so much, Jacky, for this information... Yes, I
found "Wavanal" but can't find "Softest..."

I looked up "Softest" and got "The Softest Kind of Love..." which I
don't believe is what I'm looking for... I mean, at least not on this
forum...

____________ ___________ _________
Joseph Pehrson

🔗Robert Walker <robertwalker@...>

8/27/2001 7:12:01 PM

Hi Brian,

Perhaps I can respond to this as I've recently been programming FFT analysis
with FTS, so I needed to find out about it for that. I hasten to add that this
isn't something I've researched myself; I'm just quoting various
results that I looked up for the programming, plus my experience in
using these results in FTS.

When you use the standard FFT algorithm, it produces frequency
bins, and for short sound clips the bins are so far apart as to give
very low frequency resolution. For instance, a 0.1 second clip at 440 Hz
gives a frequency resolution of roughly +- 20 cents - not very good!
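
To make that concrete, here is the arithmetic as a little Python sketch (just
to illustrate the calculation - nothing to do with the actual FTS code):

import math

clip_length = 0.1                        # length of the clip in seconds
f = 440.0                                # frequency of interest in Hz

bin_width = 1.0 / clip_length            # FFT bin spacing in Hz (10 Hz here)
worst_error_hz = bin_width / 2.0         # nearest bin is at most half a bin away
worst_error_cents = 1200.0 * math.log2((f + worst_error_hz) / f)

print(bin_width, worst_error_hz, round(worst_error_cents, 1))
# prints 10.0 5.0 19.6, i.e. around +- 20 cents at 440 Hz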

However, one can improve the accuracy considerably - maybe tenfold - by
using peak interpolation. The method is to look not just at the
highest-amplitude frequency bin, but also at the two bins to either side.
Then by fitting a curve through them, you can estimate
where the real peak should be, on the assumption that it is a response to a
single frequency partial.
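
The simplest curve to fit is a parabola through the peak bin and its two
neighbours; Jain's method, which I mention below, works along the same lines
but uses ratios of the neighbouring magnitudes instead. A rough sketch of the
parabola version in Python (illustrative only, not the FTS code):

def parabolic_peak(mags, k):
    """Given the bin magnitudes and the index k of the highest bin,
    fit a parabola through bins k-1, k, k+1 and return the estimated
    true peak position as a fractional bin index."""
    y0, y1, y2 = mags[k - 1], mags[k], mags[k + 1]
    d = 0.5 * (y0 - y2) / (y0 - 2.0 * y1 + y2)   # offset in bins, within +- 0.5
    return k + d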

I find it gives quite reasonable accuracy even for fairly short notes, say
a tenth of a second. One can test it by making a waveform with some random
frequencies, accurately pitched (and falling between FFT frequency bins), and
then seeing if the FFT with peak interpolation recovers them; FTS has this
programmed in as a test, so you can try it out and see how well the FFT works.
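
For anyone who wants to try the same sort of test without FTS, here is a rough
sketch of the idea in Python with numpy (just an illustration, not what FTS
does internally):

import numpy as np

rate = 44100                       # sample rate in Hz
length = 0.1                       # a tenth of a second clip
rng = np.random.default_rng(0)

# a random test frequency near 440 Hz, deliberately falling between FFT bins
true_freq = 440.0 * 2.0 ** rng.uniform(-0.5, 0.5)

t = np.arange(int(rate * length)) / rate
wave = np.sin(2 * np.pi * true_freq * t)

spectrum = np.abs(np.fft.rfft(wave))
k = int(np.argmax(spectrum))
bin_width = rate / len(wave)

# parabolic interpolation through the peak bin and its two neighbours
y0, y1, y2 = spectrum[k - 1], spectrum[k], spectrum[k + 1]
d = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)

def cents_off(freq):
    return 1200 * np.log2(freq / true_freq)

print("nearest bin error:  ", cents_off(k * bin_width), "cents")
print("interpolated error: ", cents_off((k + d) * bin_width), "cents")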

To give a rough idea - of the five interpolation methods I've tried, I find
Jain's method is the one that seems to work best for the frequencies one is
interested in.

For tenth-of-a-second random frequencies at around 440 Hz, it often gets the
pitch accurate to within 0.1 cents, but occasionally goes wild by about
1 to 2 cents. So ten times better accuracy consistently, and most of the time
a hundredfold improvement in accuracy. Pretty good!

For a 1 second clip, one can get better results, typically reliable to within
0.1 cents nearly all the time with Jain's method.

All the FFT programs use the same maths for the FFT, with tweaks to speed
up the calculation (which won't affect the results), and minor variations
in the way the sharp cut-off of the sound at the start and end of the clip
is treated (perhaps presented as a drop-down list of "windowing" functions to use).
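
For what it's worth, applying one of those windowing functions before the FFT
is only a couple of lines, e.g. in numpy (an illustrative sketch, not anybody's
actual code):

import numpy as np

def windowed_spectrum(wave, window="hann"):
    """Taper the ends of the clip before the FFT so that the sharp
    cut-off at the start and end smears the spectrum less."""
    if window == "hann":
        w = np.hanning(len(wave))
    else:                              # "rectangular": no taper at all
        w = np.ones(len(wave))
    return np.abs(np.fft.rfft(wave * w))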

However, one could expect variations in the accuracy of frequency detection
depending on whether they use peak interpolation, and on which method they use for it.

Now for the other point about whether the partials that show up in FFT are
"real" or artefacts of the FFT.

Fourier's result is that a wide range of mathematical curves
can be represented on a finite domain as a sum of sine waves.

This includes all continuous functions, and a large number of discontinuous ones
(including all piecewise continuous ones - those with only a finite
number of discontinuities).

The idea of the Fast (or discrete) Fourier Transform is to do the same thing as the
continuous Fourier transform, but for a function defined on a finite
number of points (the samples).

If one adds together all the frequencies found in all the bins,
and also keeps the phase information of the FFT (which is preserved in
the angular component of a complex FFT transform, but lost if one uses
the faster real transform), then one will recover the original wave form
exactly for the sample points.
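
One can check the "exactly" numerically, e.g. with numpy (just illustrating the
mathematical point):

import numpy as np

wave = np.random.default_rng(1).normal(size=1024)   # any signal at all

spectrum = np.fft.fft(wave)            # complex FFT: keeps magnitude and phase
rebuilt = np.fft.ifft(spectrum).real   # add all the bins back together

print(np.max(np.abs(wave - rebuilt)))  # ~1e-15, i.e. exact up to rounding error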

However, because of the discrete bin sizes, the only way this can work is
by introducing "artefacts" , e.g. extra frequencies that add together to
give a frequency between two bins.

The usual FFT artefacts have a characteristic pattern to them. They normally
show up as a widening of the peak, with the tails of the peaks spreading over
the entire frequency range. One may also get a pattern of symmetrical smaller
ripples to either side of the main peak. These are pretty easy to spot when
looking at an FFT graph.

If one has background noise, then the smaller artefacts and the
tails of the peaks merge into it.

One can test the results by including an option to reconstruct a wave sound from the
partials. FTS has this (an idea borrowed from WAVANAL). Then one can play that
and see if it sounds like the original timbre.
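
A bare-bones version of that resynthesis idea, assuming one already has a list
of (frequency, amplitude) pairs for the partials (a sketch only - WAVANAL and
FTS each do this in their own way):

import numpy as np

def resynthesise(partials, length=2.0, rate=44100):
    """partials: list of (frequency_hz, amplitude) pairs taken from the
    FFT peaks. Returns a wave that is simply the sum of those sine partials."""
    t = np.arange(int(length * rate)) / rate
    wave = np.zeros_like(t)
    for freq, amp in partials:
        wave += amp * np.sin(2 * np.pi * freq * t)
    peak = np.max(np.abs(wave))
    return wave / peak if peak > 0 else wave   # normalise so it doesn't clip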

In practice it seems to work well with timbres that have well-defined partials
one can hear: it finds them all, with no spurious extra ones (apart from the
FFT artefacts already mentioned), and with reasonable accuracy.

I hope perhaps this may be useful information.

Robert

🔗Robert Walker <robertwalker@...>

8/28/2001 5:28:17 AM

Hi Jeff,

I've tried Quinn's first and second estimators, which use the phase information to
improve frequency accuracy - those are two of the five interpolation
methods that are in the list. Jain's method (which doesn't use the phase
information) seems to work better than either, at least for the time sample
lengths and frequencies I used (frequencies of 440 Hz +- a few octaves,
and time samples in region 0.1 secs to, say, 30 secs or so).

I realise that I gave incorrect info about the real FFT. It doesn't discard the
imaginary part of the data. What it does is pack both real and imaginary
data into the real buffer, which one later unpacks. So you still have the phase info.

Sorry about the misdirection. It was a little while ago that I coded it -
maybe a month ago? I actually coded the unpacking of the real and imaginary
parts for the real FFT, but had forgotten all about that (nice thing
about programming - even when you forget how you did it, the program
remembers!)

Real FFT here refers to the input - that the data is given in real form,
with no imaginary part to it - the sample points of the wave are real rather
than complex. It is just a single wave with no phase information added to it.

A complex FFT needs both real and imaginary parts to be specified,
and one sets the imaginary part to 0. A real FFT only
needs the real part as data, so it is faster. It then returns the real
and imaginary parts of the result packed into the one buffer, with the
imaginary parts in the high end. This loses no information for
a real FFT, as one can use symmetry to recover the missing data
beyond the Nyquist frequency.
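
The symmetry is easy to see numerically. Here is a sketch with numpy (which
returns the real FFT result as a half-length complex array rather than packing
it into one real buffer, but the information content is the same):

import numpy as np

x = np.random.default_rng(2).normal(size=8)   # a purely real signal

full = np.fft.fft(x)     # complex FFT of the real data: 8 bins
half = np.fft.rfft(x)    # real FFT: only bins 0 .. N/2, i.e. 5 bins

# the upper half of the complex FFT is just the conjugate mirror image
# of the lower half, so keeping only half of it loses nothing
print(np.allclose(full[:5], half))                        # True
print(np.allclose(full[5:], np.conj(full[1:4])[::-1]))    # True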

Description of the peak interpolation algorithms:
http://www.dspguru.com/howto/tech/peakfft2.htm
I used this as my reference for good ones to try out.

For the real FFT I use Ooura's free C code, from:
http://www-dsp.rice.edu/research/fft/fftnote.asc
http://momonga.t.u-tokyo.ac.jp/~ooura/fft.html

which is about as fast as it gets. FFTW is supposed to be a little faster but in practice
I find no noticeable difference if one uses powers of two for the buffer sizes. If
one uses other buffer sizes, then FFTW ("Fastest Fourier Transform in the West")
is the one to go for. But it is issued under the GPL.

It may be okay to supply FFTW in a DLL which users can download separately as a kind of
plug-in for the program. I'm not entirely sure about this, but I'd have thought
it would be okay.

There's free source code available for an FFTW DLL for Windows, which I have
compiled and used with the FFTW preview beta. However, the tiny speed improvement
doesn't seem to be worth it - it isn't noticeable at all - unless one needs the
non-power-of-two sizes. I may add it in as an option anyway, if the plug-in idea
is acceptable.

For the complex FFT I still actually use Graham Breed's code, which he wrote
from a description of the algorithm in a book on the FFT. In practice I find it
is about as fast as FFTW for complex transforms, if one adds a look-up sine
table to it!

It's also a very short algorithm - about 100 lines including the look-up table
- while Ooura's real FFT, which is perhaps twice as fast, is getting on for
2000 lines of code. Much of that speed improvement is probably due to it being
a real FFT - which one can also get by calling a complex FFT and pre/post
processing the arguments with a few lines of code (though I haven't quite
figured out how one does that, and don't need to, having a very fast real
FFT to hand anyway).
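
For anyone curious what such a short complex FFT looks like, here is the
textbook radix-2 algorithm written out in Python with a precomputed twiddle
("sine table") array - this is not Graham's code or Ooura's, just an
illustration of how little code is needed:

import cmath

def make_twiddles(n):
    """Precompute the "sine table": the complex roots of unity the FFT needs."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

def fft(x, twiddles=None):
    """Radix-2 complex FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if twiddles is None:
        twiddles = make_twiddles(n)
    stride = 2 * len(twiddles) // n       # reuse the same table at every level
    even = fft(x[0::2], twiddles)
    odd = fft(x[1::2], twiddles)
    out = [0j] * n
    for k in range(n // 2):
        t = twiddles[k * stride] * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

# e.g. fft([1, 0, 0, 0]) gives [1, 1, 1, 1] (up to rounding), as it should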

So, I think this is a field where a lot of research is going into small tweaks
to speed up the algorithm slightly, except of course for the case of non powers
of two, where FFTW rules as far as I gather.

> Also, there are ways of getting around (eliminating)
> the trade off between frequency resolution and time
> resolution. My stating that such methods exist should
> be sufficient to trigger an 'aha!' in the sufficiently
> clever reader. (Right, Robert? It's easy once you
> realize that it -can- be gotten around easily. I'm
> betting you'll be able to guess this one fairly
> easily.)

I can't imagine what you mean by this. However, I remember you saying you
have some neat ideas in the area of FFT.

Robert

🔗Robert Walker <robertwalker@...>

8/29/2001 8:03:15 PM

Hi Brian,

The QM analogy for the FFT is an interesting one, and I think I've
seen it used for the FFT somewhere before. However, I think it is not exact.
It's an interesting topic to discuss, perhaps.

- in QM you can't simultaneously find position and momentum,
but you can find each individually to whatever accuracy you
like, given enough energy for the measurements.

So far, that's analogous to frequency / time in FFT.

However, in QM (and here I'm just relying on popular accounts of
QM I must say), the data is lost to future measurements,
according to the uncertainty principle, because by making
the measurement you have changed the thing you are measuring.
Also, all possible measurement methods have to change the data,
according to the theory (if correct, of course!). So once you
measure the momentum to the best accuracy you can, then it is
no longer possible to find the position, because, e.g., the
electron has moved to a new unpredictable position as a
result of your determination of its momentum to great accuracy.

Determine its new position to great accuracy, and you still
can't backtrack and find where it came from. E.g. perhaps
you found the momentum by bouncing it off something,
perhaps another particle in a particle accelerator,
but the thing you bounced it off has moved, and both
moved in unpredictable ways, so even though you knew
its momentum to tremendous accuracy at that point,
you had minimal knowledge of where it was (just enough
to know that it was within the region of the measuring
apparatus perhaps).

In FFT, the data is still there whatever you do, and so
even if there are similarities of one kind or another,
it can't be an exactly analogous situation, or so it seems
to me.

So, is the uncertainty in the FFT a result of the method
one is using, or is it in some way intrinsic in the
data, and if so, to what extent?

If you have a simple sine wave, with rapidly varying
frequency, you can track it pretty well by wave counting,
and do much better that way than with FFT. One can
get very accurate results by suitable interpolation
of the crossing points. This is something I've tried, and it works.
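
To show what I mean by interpolating the crossing points, here is a rough
sketch (the wave counting in FTS is rather more involved, so treat this as an
illustration of the idea only):

import numpy as np

def crossing_times(wave, rate):
    """Times of the upward zero crossings, with linear interpolation
    between the two samples that straddle each crossing."""
    times = []
    for i in range(len(wave) - 1):
        if wave[i] < 0 <= wave[i + 1]:
            frac = -wave[i] / (wave[i + 1] - wave[i])  # position between samples
            times.append((i + frac) / rate)
    return np.array(times)

def local_frequencies(wave, rate):
    """One frequency estimate per cycle: the reciprocal of the time
    between successive upward crossings."""
    return 1.0 / np.diff(crossing_times(wave, rate))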

I think that if one knew that the data one is tracking actually _is_ a sine
wave varying in frequency, one could do even better by looking for the
single best-fit sine wave for any sequence of data points.
Suppose one also knows that it is unvarying in amplitude
- one could imagine a real application in which both
of those things were true. Then I rather expect one could do very
well indeed with wave counting.
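
One simple way to do that best-fit search, assuming a short stretch of data
that really is a single steady sine wave: for each candidate frequency, fit the
amplitude and phase by linear least squares and keep the frequency with the
smallest residual. A sketch of the idea only, not anything from FTS:

import numpy as np

def best_fit_sine(wave, rate, f_lo, f_hi, steps=2000):
    """Search a grid of candidate frequencies; for each one the best
    amplitude and phase come from a linear least-squares fit of
    a*sin + b*cos. Returns the frequency with the smallest residual."""
    t = np.arange(len(wave)) / rate
    best_f, best_err = None, np.inf
    for f in np.linspace(f_lo, f_hi, steps):
        basis = np.column_stack([np.sin(2 * np.pi * f * t),
                                 np.cos(2 * np.pi * f * t)])
        coeffs, *_ = np.linalg.lstsq(basis, wave, rcond=None)
        err = np.sum((wave - basis @ coeffs) ** 2)
        if err < best_err:
            best_f, best_err = f, err
    return best_f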

This shows that the uncertainty of frequency isn't something
that has to show up in all possible measurements of the
data (as in QM theory, where no experiment, however well
conceived, will work, according to the theory), but something specific to the
particular way one is matching a measurement technique
with the data and one's assumptions about the data.

There will always have to be some assumptions about the data
as it is finite and anything could be happening between the
sample points. In principle (though with very low probability)
one might get a perfect wave form from a sampling of
pure noise. But in practice, with very high probability,
near certainty, if the data appears continuous, then
it is, with maybe tiny ripples superimposed too small
to see with the data set.

So that leaves open the possibility of improving frequency + time
accuracy, either by more refined FFT methods (such as
peak interpolation) or by other techniques (wavelets,
or wave counting). All of this depends on what the
particular data is, and what kinds of assumptions are
reasonable to make about it.

This is something of practical relevance to me - in the
wave counting section of FTS I'm trying to do exactly this
- to improve both pitch and time resolution simultaneously
for a single melody line. If the timbre is suitable, it
seems one can improve on FFT by a fair amount, as my
bird songs page demonstrates - this would be impossible
to do with FFT I venture to suggest, because of the
rapid changes in pitch (but very happy to be proved wrong).

I've posted the birdsong url here before, but here is
the new location for it:
http://members.tripod.com/~robertinventor/tunes/birdsong.htm

Robert