
Pitch perception.

🔗Robert Walker <robertwalker@ntlworld.com>

3/19/2002 9:47:33 PM

Hi Paul,

Sorry, I haven't been following the discussion, so
I'm not sure what the issue is - I've read a few
posts but not the first ones in the thread yet
and haven't quite gathered yet what it is all
about.

Listening to the clips, is it something to do
with effect of panning on pitch perception or
something?

Anyway, here is FTS's analysis of the first 0.3715 seconds
of Jerry 0, using Jain's method for peak interpolation,
which I usually find best (it also uses a real FFT, so
Quinn's estimator isn't suitable).

.............................

Fractal Tune Smithy Custom voice or FFT results As Timbre Partials

Freq. Amp. (db) Cents

233.242601 100 0
292.197079 99.2702449 390.132525
351.139276 99.0889114 708.254469
467.310756 85.5578715 1203.06111
585.253315 85.3716799 1592.67586
701.41652 86.8109643 1906.12811
876.593248 78.6813572 2292.0915
935.480806 72.6118721 2404.65209
1053.43651 78.3680915 2610.24017
1169.59846 72.7356393 2791.33211
1403.73768 68.0307111 3107.24416
1462.63188 69.4254937 3178.39619
1637.75332 62.8805013 3374.17782
1753.99971 73.562052 3492.89435
1871.90172 60.730724 3605.52177
2047.01193 62.2001105 3760.33955
2105.98465 64.5297518 3809.51004
2340.06198 55.8697218 3991.97222
2456.29412 62.903345 4075.89612
2574.21756 55.8142685 4157.07692
2631.40273 58.9146085 4195.11461
2808.29368 64.0485541 4307.74871
2924.47251 55.6977592 4377.92785
3042.46863 52.7397099 4446.407
3158.54685 58.4133233 4511.2293
3217.53569 55.469668 4543.26351
3274.78908 51.1643764 4573.79852
3508.86086 61.8616138 4693.31943
3742.95658 48.3273968 4805.13011
3801.93867 51.5139002 4832.19845
3860.89601 55.4886145 4858.83897
3976.98951 46.6291337 4910.12824
4094.96264 51.8234404 4960.73648
4211.11474 52.1818226 5009.15876
4386.339 50.5331664 5079.73691
4445.10389 45.741937 5102.7767
4561.37661 52.7287489 5147.47929
4679.30811 51.7988071 5191.6704
4913.47164 45.8412085 5276.20741
4970.64312 48.5584463 5296.23517
5147.45165 44.3886571 5356.74618
5263.74886 48.7179563 5395.42493
5381.628 43.3322372 5433.76734
5556.67422 45.8442103 5489.18213
5615.68575 50.5214226 5507.47078
5965.96392 46.9228025 5612.22244
6141.08653 43.148561 5662.30885
6316.31025 50.1293339 5711.01458
6434.24164 42.6600785 5743.04028
6668.28048 46.4428755 5804.89388
6725.49531 42.2825854 5819.68476
7368.79221 44.0819247 5977.82994
8071.14406 42.9542189 6135.44396
8189.03007 44.0457789 6160.54724
9123.79424 43.557167 6347.67683

Analysed on 5 AM Wednesday, March 20, 2002 GMT Standard Time - Recorded Sound

FFT sample analysed: 0.2813 Mb 0.3715 secs (16384 samples)
Recorded Sound
From recording of length: 2.268 secs

Truncated from 0.5729 secs to make the number of samples a power of two

.............................

The bin size is +- 2.69165 Hz (I should really put that in
the text file of the results, so I'll probably do that
in future - I'm planning to spend a few
days on this section, which is the last part
of FTS 1.09 that still needs to be finished).

Typically then I'd expect the values to be accurate
to about +-0.2 Hz or so.

However, I'm not sure how we can test this.
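
One rough way to get a feel for it is to analyse a synthetic tone whose frequency is known in advance. The sketch below is not FTS's code and does not use Jain's method. It swaps in plain parabolic interpolation on the log magnitude of a Hann-windowed FFT, and it assumes a 44100 Hz sample rate (inferred from 16384 samples lasting 0.3715 seconds). The 233.5 Hz test frequency is arbitrary.

```python
import cmath
import math

SAMPLE_RATE = 44100  # assumption: 16384 samples / 0.3715 s
N = 16384            # FFT length used in the analysis above


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def estimate_peak(true_freq, n=N, rate=SAMPLE_RATE):
    """Synthesise a sine, FFT it, and return (raw, interpolated) peak in Hz."""
    signal = [
        (0.5 - 0.5 * math.cos(2 * math.pi * i / n))      # Hann window
        * math.sin(2 * math.pi * true_freq * i / rate)   # test tone
        for i in range(n)
    ]
    mags = [abs(c) for c in fft(signal)[: n // 2]]
    # Largest bin, excluding the edges so both neighbours exist.
    k = max(range(1, n // 2 - 1), key=mags.__getitem__)
    # Fit a parabola through the log magnitudes of the peak and its neighbours.
    a, b, c = (math.log(mags[j]) for j in (k - 1, k, k + 1))
    offset = 0.5 * (a - c) / (a - 2 * b + c)
    return k * rate / n, (k + offset) * rate / n


raw, refined = estimate_peak(233.5)
print(f"raw bin: {raw:.3f} Hz, interpolated: {refined:.3f} Hz")
```

A clean sine is an easier case than a recorded chord, so this only gives a lower bound on the error one might see with real material.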

I can do a shorter one, say, less than 0.1 seconds,
and I get:

...............................

Fractal Tune Smithy Custom voice or FFT results As Timbre Partials

Freq. Amp. (db) Cents

232.208035 99.7771836 0
295.026805 100 414.513772
350.216559 99.3201121 711.395289
468.261997 85.0275311 1214.27768
586.219414 85.6398884 1603.22742
704.222173 87.2792605 1920.73531
877.430111 77.6245585 2301.43959
932.647874 73.6974587 2407.09751
1050.65198 78.5802376 2613.35408
1168.68821 73.4921955 2797.68035
1404.59108 68.8106059 3115.99245
1459.83599 69.5783582 3182.7798
1640.69516 63.7878585 3384.9809
1751.07121 73.7779458 3497.69755
1869.12539 61.5128263 3610.64826
2049.92574 63.7346581 3770.49822
2105.0584 64.201604 3816.44455
2341.13647 55.7295194 4000.46308
2459.1773 63.6641118 4085.62315
2577.13228 55.9225698 4166.73215
2632.38252 58.299191 4203.45522
2805.51831 63.9440134 4313.73303
2923.46162 56.8995342 4385.02543
3041.70136 52.0969011 4453.66646
3159.61591 58.8079382 4519.51127
3214.84628 55.910612 4549.51195
3277.70881 51.3338891 4583.03747
3505.93417 62.4252491 4699.57093
3742.10537 49.2268664 4812.43246
3804.66296 53.197709 4841.13463
3860.02039 55.0185116 4866.14241
3978.23079 47.3222101 4918.36461
4096.12731 51.1844116 4968.92491
4213.86785 53.623411 5017.98633
4387.03535 49.9027878 5087.70784
4442.39597 46.4828129 5109.41783
4560.39017 52.6392501 5154.80095
4678.53504 52.1235471 5199.08047
4914.22714 46.8218097 5284.16969
4969.4149 47.8718956 5303.50344
5150.57096 43.8891779 5365.49109
5260.83233 48.9118806 5402.16153
5378.90318 43.0469867 5440.58666
5559.86745 46.1484191 5497.87283
5614.8456 49.3015788 5514.90786
5969.04068 48.1905725 5620.81114
6142.31627 43.5112659 5670.3516
6315.31414 50.2955693 5718.43764
6433.02143 42.5556593 5750.40804
6669.42269 45.8666005 5812.8865
6724.78105 43.243585 5827.197
7015.70637 43.24454 5900.51828
7369.72661 43.2869792 5985.74556
8188.31793 43.2739192 6168.09279

Analysed on 5 AM Wednesday, March 20, 2002 GMT Standard Time - Recorded Sound

FFT sample analysed: 0.07033 Mb 0.09288 secs (4096 samples)
Recorded Sound
From recording of length: 2.268 secs

Truncated from 0.1 secs to make the number of samples a power of two

...............................

Bin size +- 10.76666 Hz, so one would expect it to be
accurate to within 1 Hz or so after peak interpolation.
Comparing it with the 0.3 second one,
one can see that it is close enough most of the
time - many values are within 1 Hz. The partial at 295.026805 Hz is
about 3 Hz out, assuming the 0.3 second analysis gives
the better result.
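
As a quick check of that arithmetic, here is a small sketch that computes the bin spacing for the 4096-sample run and the differences between the first six partials of the two listings. The 44100 Hz sample rate is an assumption (4096 samples lasting 0.09288 seconds). The frequencies are copied from the two tables above.

```python
SAMPLE_RATE = 44100  # assumption: 4096 samples / 0.09288 s


def bin_spacing(n_samples, rate=SAMPLE_RATE):
    """Distance in Hz between adjacent FFT bins."""
    return rate / n_samples


# First six partials from the 0.3715 s and 0.09288 s analyses.
long_run = [233.242601, 292.197079, 351.139276,
            467.310756, 585.253315, 701.41652]
short_run = [232.208035, 295.026805, 350.216559,
             468.261997, 586.219414, 704.222173]

# Short-run frequency minus long-run frequency, in Hz.
diffs = [round(s - l, 3) for l, s in zip(long_run, short_run)]

print(f"bin spacing for 4096 samples: {bin_spacing(4096):.5f} Hz")
for l, d in zip(long_run, diffs):
    print(f"{l:12.6f} Hz  short-run difference {d:+.3f} Hz")
```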

It's pretty easy to use - why not do it yourself?

Views | Analyse Recording or midi Voice

Open Audio file

Select region of the audio file with click and drag

In main window select Analyse detail.

Then Find FFT, then Show FFT, and there you are
and you can choose the interpolation method
from Show FFT | Config. | FFT Peak detection.

Here you can also choose to show the peak detection
curves, and change the various parameters to detect more
or fewer peaks.

For this one, with really clear peaks, some of them
rather small, I recommend setting

Height of peak from valley more than
[0.00625] times max peak / average sound level

- or some lower value than 0.00625 - just
try varying the parameters and clicking
re-find peaks until one finds that
they all get found.

Also you could set valley less than [100] % of peak,
which switches that option off.

Both of these options reduce the tendency to
detect noise or inharmonic components of a musical
note as extra partials.
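
To illustrate the general idea behind a "height of peak from valley" threshold, here is a minimal sketch. It is not FTS's algorithm, which isn't described here. The function name `find_peaks` and the `min_rise` parameter are made up for the example, and it assumes the threshold is taken relative to the largest magnitude in the spectrum.

```python
def find_peaks(mags, min_rise=0.00625):
    """Return indices of local maxima that rise above the higher of their
    two neighbouring valleys by more than min_rise * max(mags)."""
    threshold = min_rise * max(mags)
    peaks = []
    for i in range(1, len(mags) - 1):
        if not (mags[i - 1] < mags[i] >= mags[i + 1]):
            continue  # not a local maximum
        # Walk downhill on each side to find the adjacent valleys.
        left = i
        while left > 0 and mags[left - 1] <= mags[left]:
            left -= 1
        right = i
        while right < len(mags) - 1 and mags[right + 1] <= mags[right]:
            right += 1
        valley = max(mags[left], mags[right])
        if mags[i] - valley > threshold:
            peaks.append(i)
    return peaks


# Hypothetical magnitudes: two strong peaks, one modest, one tiny.
spectrum = [0, 1, 0.2, 0.25, 0.2, 5, 0, 0.02, 0]
print(find_peaks(spectrum))           # default threshold
print(find_peaks(spectrum, 0.001))    # lower threshold finds more peaks
print(find_peaks(spectrum, 0.02))     # higher threshold finds fewer
```

Lowering `min_rise` lets smaller bumps through, which is the trade-off described above: more genuine weak partials are found, but so is more noise.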

It just shows the FFT of the entire waveform
- won't try to separate out component notes
of a chord or anything, though one might be able
to figure them out from the partials.

My eventual aim is a method for transcribing
monophonic melody lines to pitches with high
accuracy as far as pitch is concerned. The
link with FTS is that one could then sing or
play a seed, get FTS to convert it to MIDI,
and then use it for the fractal tunes using
the original pitches exactly as you played or
sang them, or using the closest notes to those pitches
in the current scale, depending on what one wants.
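
The pitch-to-MIDI step could look something like the sketch below. This is only an illustration and not how FTS does it. It assumes A4 = 440 Hz on MIDI note 69 and 12-tone equal temperament for the note numbers, and the helper names and the example scale are invented.

```python
import math


def freq_to_midi(freq_hz, a4_hz=440.0):
    """Return (nearest MIDI note number, deviation from it in cents)."""
    exact = 69 + 12 * math.log2(freq_hz / a4_hz)
    note = round(exact)
    return note, 100 * (exact - note)


def nearest_in_scale(freq_hz, scale_hz):
    """Return the scale frequency closest to freq_hz, measured in cents."""
    return min(scale_hz, key=lambda f: abs(1200 * math.log2(freq_hz / f)))


note, cents = freq_to_midi(233.242601)  # lowest partial from the listing above
print(f"MIDI note {note}, {cents:+.2f} cents")

# Hypothetical scale, just to show snapping to the closest note.
print(nearest_in_scale(292.197079, [220.0, 293.66, 330.0]))
```

Keeping the cents deviation alongside the note number is what would allow the original pitch to be reproduced exactly, for instance with a pitch bend.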

Robert

🔗LAFERRIERE François <francois.laferriere@cegetel.fr>

3/20/2002 7:57:27 AM

Hi Robert!

In the mail below, you say that you assume a certain precision for the
peak detection and are looking for a way to verify it. May I suggest something?
It is more than reasonable to assume that each of the three tones is
harmonic. So the list should contain integer multiples of 233.242601,
292.197079 and 351.139276 (or whatever the actual true values are). For each
frequency value in the list, pick the best candidate for the fundamental
among the three values, then compute the percentage of "anharmonicity" by
seeing how far the value is from the expected harmonic. Then convert the
difference to cents and look at the actual dispersion of the results. That
should give an idea of the reliability of the method.
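
A minimal sketch of that check, using the three fundamentals and the next few partials from the 0.3715 second listing. The function names are made up for the example, and the harmonic number is simply the nearest integer multiple of each candidate fundamental.

```python
import math

FUNDAMENTALS = [233.242601, 292.197079, 351.139276]
PARTIALS = [467.310756, 585.253315, 701.41652,
            876.593248, 935.480806, 1053.43651]


def cents(ratio):
    return 1200 * math.log2(ratio)


def best_harmonic(freq, fundamentals=FUNDAMENTALS):
    """Return (fundamental, harmonic number, deviation in cents) for the
    candidate fundamental whose nearest harmonic lies closest to freq."""
    best = None
    for f0 in fundamentals:
        n = max(1, round(freq / f0))
        dev = cents(freq / (n * f0))
        if best is None or abs(dev) < abs(best[2]):
            best = (f0, n, dev)
    return best


for p in PARTIALS:
    f0, n, dev = best_harmonic(p)
    print(f"{p:11.4f} Hz = harmonic {n} of {f0:.4f} Hz, off by {dev:+.2f} cents")
```

The spread of the deviations over the whole list would then give the dispersion figure suggested above.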

Keep me informed of what you find.

yours truly

François Laferrière

> -----Original Message-----
> From: Robert Walker [mailto:robertwalker@ntlworld.com]
> Date: Wednesday, 20 March 2002 06:48
> To: tuning@yahoogroups.com
> Subject: [tuning] Pitch perception.
>
>
> HI Paul,
>
> .............................
>
> Fractal Tune Smithy Custom voice or FFT results As Timbre Partials
>
> Freq. Amp. (db) Cents
>
> 233.242601 100 0
> 292.197079 99.2702449 390.132525
> 351.139276 99.0889114 708.254469
> 467.310756 85.5578715 1203.06111
> 585.253315 85.3716799 1592.67586
> 701.41652 86.8109643 1906.12811
> 876.593248 78.6813572 2292.0915
> 935.480806 72.6118721 2404.65209
.....

> Analysed on 5 AM Wednesday, March 20, 2002 GMT Standard Time
> - Recorded Sound
>
> FFT sample analysed: 0.2813 Mb 0.3715 secs (16384 samples)
> Recorded Sound
> From recording of length: 2.268 secs
>
> Truncated from 0.5729 secs to make the number of samples a
> power of two
>
> .............................
>
> The bin size is +- 2.69165 Hz (I should really put that in
> the text file of the results, so I'll prob. do that
> for the future - I'm planning to do a few
> days on this section, which is the last part
> of FTS 1.09 to need to be finished)
>
> Typically then I'd expect the values to be accurate
> to about +-0.2 Hz or so.
>
> However, I'm not sure how we can test this.
>
.................................

🔗paulerlich <paul@stretch-music.com>

3/20/2002 1:09:04 PM

--- In tuning@y..., "Robert Walker" <robertwalker@n...> wrote:
> HI Paul,
>
> Sorry, I haven't been following the discussion, so
> I'm not sure what the issue is - I've read a few
> posts but not the first ones in the thread yet
> and haven't quite gathered yet what it is all
> about.
>
> Listening to the clips, is it something to do
> with effect of panning on pitch perception or
> something?

don't be concerned about that for now.
>
> Anyway here is FTS's analysis of the first 0.3715 seconds
> of Jerry 0,

would you mind doing all the jerries?

> using Jain's method for peak interpolation,
> which I usually find best (also using real fft, so
> Quinns estimator isn't suitable).
>
> .............................
>
> Fractal Tune Smithy Custom voice or FFT results As Timbre Partials
>
> Freq. Amp. (db) Cents
>
> 233.242601 100 0
> 292.197079 99.2702449 390.132525
> 351.139276 99.0889114 708.254469
> 467.310756 85.5578715 1203.06111
> 585.253315 85.3716799 1592.67586
> 701.41652 86.8109643 1906.12811

well, these results are obviously very poor, even for jerry0, which
is the easiest one (a ji major triad). what if you used a longer
sample -- as long as you can?
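
for reference, a small sketch of how far the measured cents values sit from a just major triad, assuming that "ji major triad" means the frequency ratios 4:5:6 (so 5/4 and 3/2 above the root). the measured values are the cents figures for the second and third partials in the listing quoted above.

```python
import math


def cents(ratio):
    return 1200 * math.log2(ratio)


# Cents above the lowest partial, from the quoted 0.3715 s analysis.
measured = {"5/4": 390.132525, "3/2": 708.254469}

# Just intonation targets, assuming a 4:5:6 triad.
just = {"5/4": cents(5 / 4), "3/2": cents(3 / 2)}

errors = {name: measured[name] - just[name] for name in measured}

for name in measured:
    print(f"{name}: measured {measured[name]:.2f}, "
          f"just {just[name]:.2f}, error {errors[name]:+.2f} cents")
```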