
Re: [MMM] Improvisation on pitches from a recording of a song

🔗Robert Walker <robertwalker@ntlworld.com>

4/14/2005 2:42:39 AM

Hi Carl,

> This sort of thing could be useful not only for bird song, but for
> extracting scales from traditional music!

Yes. It can do it already if the music is on a wind
instrument of suitable timbre such as a flute,
or something which is very amenable to the technique.

> May I ask, how do you cope with note transitions? How do you decide
> what discrete pitches to choose? Surely the thrush does not sing in
> discrete pitches.

Yes, you are right, it's maybe not so very realistic
to make it into a scale, though it is interesting as a kind
of found scale coming out of the interaction between the
program and the bird song.

It takes the average pitch during a glissando, and when the
pitch slides far enough it makes a new pitch. You can play it with
pitch glides between them all. I suppose, strictly speaking,
it should try to transcribe it using continuously varying
pitches.
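
A minimal Python sketch of the averaging-and-splitting idea described above. This is not FTS's actual code: the function name `segment_pitch_track`, the 50 cent drift threshold, and the use of a running geometric mean are all assumptions made for illustration.

```python
import math

def cents(f_hz, ref_hz):
    """Interval from ref_hz up to f_hz, in cents."""
    return 1200.0 * math.log2(f_hz / ref_hz)

def segment_pitch_track(track_hz, max_drift_cents=50.0):
    """Group a list of frame pitches (Hz) into notes.

    A frame joins the current note while it stays within max_drift_cents
    of that note's running geometric mean; otherwise a new note starts.
    Returns a list of (mean_pitch_hz, frame_count) pairs.
    """
    notes = []
    log_sum, count = 0.0, 0
    for f in track_hz:
        if count:
            mean_hz = 2.0 ** (log_sum / count)
            if abs(cents(f, mean_hz)) > max_drift_cents:
                # The pitch has slid far enough: close this note, start another.
                notes.append((mean_hz, count))
                log_sum, count = 0.0, 0
        log_sum += math.log2(f)
        count += 1
    if count:
        notes.append((2.0 ** (log_sum / count), count))
    return notes
```

With a small threshold, a long glissando comes out as a staircase of averaged pitches; with a large one it collapses into a single averaged note.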

I just tried slowing down the thrush's song and it does have
many large pitch glissandi. Some other birds are a bit more discrete
in the pitches they sing.

> My suggestion for extracting scales from melodic performances is as
> follows. Normal audio signals plot energy over time. After continuous
> pitch extraction, we have pitch over time. If we apply a time-domain
> transform to *that*, we have 'pitch frequencies' -- the amount of time
> the performer spent playing each pitch in the entire performance (the
> duration of musical tones is typically great compared to the period of
> musical pitches, so I think we're OK here). We can now slice the pitch
> continuum into octaves, and in each octave hopefully see a manageable
> number of Gaussian lumps (if not, we might say there is no discrete
> scale abstraction used in the performance). When we superimpose these
> octave slices, a majority of lumps hopefully coincide. Finally, the
> local maxima of this summed function are written to a Scala file.

> Does this make sense? Perhaps we should take this to the tuning
> list...

Yes could do that. FTS extracts the notes as a sequence
of volume, time and frequency triples. So one could analyse
those however one wants to try and extract the scale.
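
One possible reading of Carl's suggestion, sketched in Python: treat the "time spent at each pitch" as a duration-weighted histogram, fold it into a single octave, smooth it, and take the local maxima as scale degrees. The input format (pitch in Hz, duration in seconds), the 10 cent bins, the smoothing kernel, and the `min_share` cut-off are all assumptions; writing the peaks out as a Scala file is left out.

```python
import math

def folded_histogram(notes, ref_hz, bin_cents=10.0):
    """Duration-weighted histogram of pitches folded into one octave.

    notes is a list of (pitch_hz, duration_s) pairs.
    """
    n_bins = int(round(1200.0 / bin_cents))
    hist = [0.0] * n_bins
    for pitch_hz, duration_s in notes:
        c = (1200.0 * math.log2(pitch_hz / ref_hz)) % 1200.0
        hist[round(c / bin_cents) % n_bins] += duration_s
    return hist

def smooth(hist, passes=2):
    """Circular [1, 2, 1] / 4 smoothing, so nearby pitches merge into lumps."""
    n = len(hist)
    for _ in range(passes):
        hist = [(hist[i - 1] + 2.0 * hist[i] + hist[(i + 1) % n]) / 4.0
                for i in range(n)]
    return hist

def scale_from_peaks(hist, bin_cents=10.0, min_share=0.05):
    """Local maxima holding at least min_share of the total time, in cents."""
    n = len(hist)
    total = sum(hist) or 1.0
    peaks = []
    for i in range(n):
        left, right = hist[i - 1], hist[(i + 1) % n]
        if hist[i] > left and hist[i] >= right and hist[i] / total >= min_share:
            peaks.append(i * bin_cents)
    return peaks
```

A very short note, such as a grace note, adds almost nothing to its bin and so falls below the cut-off, which is one simple way of keeping transitions out of the scale.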

At present it just analyses them by using a tolerance
factor. It orders all the pitches in ascending order,
using an interval of equivalence set by the user,
or it can just use the highest pitch found as the interval
of equivalence if that is appropriate.

Then it walks up the scale and bundles the pitches
using the tolerance limit as it finds them.
So you set the tolerance limit high enough to allow
for variability in pitch but low enough to distinguish
pitches that should be distinguished, so it will depend on
the context. Or one could just use all the pitches
found in ascending order by making the tolerance
level 0.
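
The tolerance bundling described above might look roughly like this in Python. It is only a sketch, not the FTS source: the name `bundle_pitches`, the choice to measure the tolerance from the lowest pitch in each bundle, and the use of the bundle mean as the scale degree are assumptions.

```python
import math

def bundle_pitches(pitches_hz, ref_hz, tolerance_cents=20.0,
                   equivalence_cents=1200.0):
    """Reduce pitches by the interval of equivalence, sort them ascending,
    then walk up and bundle neighbours lying within tolerance_cents of the
    lowest pitch in the current bundle. Returns one mean value (in cents
    above ref_hz) per bundle."""
    reduced = sorted((1200.0 * math.log2(p / ref_hz)) % equivalence_cents
                     for p in pitches_hz)
    bundles = []
    for c in reduced:
        if bundles and c - bundles[-1][0] <= tolerance_cents:
            bundles[-1].append(c)
        else:
            bundles.append([c])
    return [sum(b) / len(b) for b in bundles]
```

With `tolerance_cents=0` only exactly equal pitches are merged, so the result is simply every distinct pitch in ascending order. Note that this sketch does not wrap around the interval of equivalence, so a pitch just below it is not bundled with one just above the reference.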

I don't do anything about weighting by the duration of the pitches, but one
could do that too, especially since longer duration pitches
are perhaps likely to be pitched more exactly, or heard
with more exactness of pitch too. A grace note
of just a hundredth of a second, for instance, may not
be so exactly pitched as a one second note.
So maybe one could take that into account. I could
add an option to FTS to print out all the pitches found
as pitch / volume / duration triples for anyone to analyse
as they please using their own software too.
Well, it is visible in the interface already, but
as three rows of numbers - you could just
copy / paste those and use them
as input to one's own program.

I've cc'd this reply to the main tuning list.

I'll take a look at the next few digests
in case there are any replies on this
particular thread. It's relevant to what I'm working
on in the programming right now so for me it's
"work" :-). But with the high volume of posts
there I'll probably miss some replies if it
gets followed up much - that doesn't mean I'm
uninterested, just working hard that's
all right now on the FTS 3.0 release.

Thanks,

Robert

----- Original Message ----- From: <MakeMicroMusic@yahoogroups.com>
To: <MakeMicroMusic@yahoogroups.com>
Sent: Wednesday, April 13, 2005 10:20 PM
Subject: [MMM] Digest Number 1187

>
>
> There are 3 messages in this issue.
>
> Topics in this digest:
>
> 1. Re: Rhapsody for Dan Stearns (mp3)
> From: Margo Schulter <mschulter@calweb.com>
> 2. Improvisation on pitches from a recording of a song thrush
> From: "Robert Walker" <robertwalker@ntlworld.com>
> 3. Re: Improvisation on pitches from a recording of a song thrush
> From: Carl Lumma <ekin@lumma.org>
>
>
> ________________________________________________________________________
> ________________________________________________________________________
>
> Message: 1
> Date: Wed, 13 Apr 2005 00:37:55 -0700 (PDT)
> From: Margo Schulter <mschulter@calweb.com>
> Subject: Re: Rhapsody for Dan Stearns (mp3)
>
>
>
> On Tue, 12 Apr 2005 Aaron Krister Johnson wrote:
>
>> I'm behind on my posting, but I did listen--I enjoyed this....is this in
>> your 2/7 comma 'temperament extraordinaire'?
>>
>> All best,
>> Aaron.
>>
>> > <http://www.bestii.com/~mschulter/RhapsodyForDanStearns1.mp3>
>
> Hello there, Aaron and all.
>
> Actually I'm just catching up myself, since I've just downloaded some
> of your pieces on your newly formatted page
>
> <http://www.akjmusic.com/works.html>
>
> For now I'll just say that your a cappella rendition of _Most Blessed
> of Mornings_ in a Neidhardt temperament is beautiful; the _I Dream of
> Tibet_ piece in 7-tET is most wonderful; and _Melancholic_ is for me a
> fascinating mixture or medley of styles, and from a theoretical
> viewpoint one might ask how much of this is specific to the 53-EDO
> tuning, and how much to some other creative elements -- but a
> consummate piece of music.
>
> There's also some Prent Rodgers and Dan Stearns I've been catching up
> on, another reason for me to post again soon.
>
> It's curious that you mention this tuning (the temperament extraordinaire
> based on Zarlino's 2/7-comma meantone), since it's what I've mostly
> been playing in lately, and indeed I'm trying to get some pieces ready
> for recording.
>
> This piece, however, is in something I did in part as a change of pace
> -- 11-tET (or 11-EDO, as people prefer), based on a main scale of a
> kind which I came upon in the Spring of 2002 and which it turned out Dan
> Stearns had earlier found. This basic type of pattern is:
>
> 0 2 4 6 8 9 11
> 0 218 436 655 873 982 1200
>
> Part of the fun was finding synthesizer timbres so that the 6-step
> interval (about 654.55 cents) would serve musically as a "fifth." The
> fact that it's considerably narrow of 3:2 makes possible a special
> effect: when the excellent 11-tET major third (just wide of a pure
> 9:7) expands to a fifth, both voices move by a semitone of 1 step
> (about 109.09 cents), a progression I use to open the piece. This
> progression calls for some steps outside the above basic hexatonic
> scale, here:
>
> 5 6
> 1 0
>
> However, the final cadence formula is available within the basic
> scale:
>
> 9 11
> 8 6
> 2 0
>
> This might be approximately described as a xentonal variation on a
> 13th-century European formula where an outer minor sixth (here very
> close to a just 14:9) expands to an octave and an upper minor second
> to a fourth. Both the original and this 11-EDO variation are very
> moving and powerful and beautiful, each in its own way.
>
> Anyway, sooner or later I wanted to document this tuning/timbre, but
> the desire for a "change of pace" gave me an immediate incentive.
>
> Again, thank you both for your response to this piece, and for your
> beautiful music.
>
> Peace and love,
>
> Margo
>
>
> ________________________________________________________________________
> ________________________________________________________________________
>
> Message: 2
> Date: Wed, 13 Apr 2005 20:08:21 +0100
> From: "Robert Walker" <robertwalker@ntlworld.com>
> Subject: Improvisation on pitches from a recording of a song thrush
>
> Hi there,
>
> http://www.robertinventor.com/improvisation_on_thrush_song_pitches.mp3
>
> [4 Mb]
>
> Robert
>
>
> ________________________________________________________________________
> ________________________________________________________________________
>
> Message: 3
> Date: Wed, 13 Apr 2005 13:34:20 -0700
> From: Carl Lumma <ekin@lumma.org>
> Subject: Re: Improvisation on pitches from a recording of a song thrush
>
>>Hi there,
>>
>>http://www.robertinventor.com/improvisation_on_thrush_song_pitches.mp3
>>
>>[4 Mb]
>>
>>Robert
>
> Fascinating! How'd you do it? -Carl
>
>
>
> ________________________________________________________________________
> ________________________________________________________________________
>

🔗Robert Walker <robertwalker@ntlworld.com>

4/14/2005 2:53:14 AM

Hi Carl,

BTW one idea I had for analysing bird song
is to try and fit a best fit sine wave to it.

I just did a search, and there are methods for finding best fit
sine waves, though it is a bit tricky. When you have several
superimposed sine waves to find, it must be even harder.
But it may still be better than using FFT for the case where there
are relatively few partials, or where the partials are harmonics,
so that their frequencies are all multiples of the same fundamental.
I don't know, though, how well it would cope with the irregular
low frequency additional signals you get with bird song, such as the one here:
http://www.robertinventor.com/robin.png
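
For a single sine wave, one simple approach (not necessarily any of the methods turned up by that search) is ordinary linear least squares at a fixed trial frequency, repeated over a grid of trial frequencies. The Python sketch below assumes a clean, single-partial signal; the function names and the 1 Hz grid step are made up for illustration.

```python
import math

def fit_sine_at(samples, sample_rate, freq_hz):
    """Least-squares fit of a*cos(wt) + b*sin(wt) at one fixed frequency.

    Returns (amplitude, residual_energy)."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    scc = sss = scs = sxc = sxs = sxx = 0.0
    for n, x in enumerate(samples):
        c, s = math.cos(w * n), math.sin(w * n)
        scc += c * c
        sss += s * s
        scs += c * s
        sxc += x * c
        sxs += x * s
        sxx += x * x
    # Solve the 2x2 normal equations for a and b.
    det = scc * sss - scs * scs
    a = (sxc * sss - sxs * scs) / det
    b = (sxs * scc - sxc * scs) / det
    residual = sxx - a * sxc - b * sxs
    return math.hypot(a, b), residual

def best_fit_sine(samples, sample_rate, lo_hz, hi_hz, step_hz=1.0):
    """Scan trial frequencies and keep the one with the smallest residual.

    Returns (freq_hz, amplitude, residual_energy)."""
    best = None
    steps = int((hi_hz - lo_hz) / step_hz)
    for k in range(steps + 1):
        f = lo_hz + k * step_hz
        amp, res = fit_sine_at(samples, sample_rate, f)
        if best is None or res < best[2]:
            best = (f, amp, res)
    return best
```

Trial frequencies must lie strictly between 0 and half the sample rate, otherwise the normal equations become singular. Fitting several superimposed partials at once would need a larger system or an iterative method, which this sketch does not attempt.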

Anyway that is more for the future, won't do it right now
unless it turns out to be very easy to do.

Sending this just to the tuning list as anyone at MMM would
know to follow the thread here, and this is decidedly
mathematical waters, or would be to follow it through
properly. Maybe tuning-math if anyone happens to know
anything about it and wants to reply in a mathematical way.

Robert

🔗Carl Lumma <ekin@lumma.org>

4/14/2005 3:03:25 AM

[This thread is from MMM.]

>> My suggestion for extracting scales from melodic performances is as
>> follows. Normal audio signals plot energy over time. After continuous
>> pitch extraction, we have pitch over time. If we apply a time-domain
>> transform to *that*, we have 'pitch frequencies' -- the amount of time
>> the performer spent playing each pitch in the entire performance (the
>> duration of musical tones is typically great compared to the period of
>> musical pitches, so I think we're OK here). We can now slice the pitch
>> continuum into octaves, and in each octave hopefully see a manageable
>> number of Gaussian lumps (if not, we might say there is no discrete
>> scale abstraction used in the performance). When we superimpose these
>> octave slices, a majority of lumps hopefully coincide. Finally, the
>> local maxima of this summed function are written to a Scala file.
>
>Yes could do that. FTS extracts the notes as a sequence
>of volume, time and frequency triples. So one could analyse
>those however one wants to try and extract the scale.

Hi Robert,

It never gives two simultaneous pitches, does it? If it doesn't,
the volume part of the triple is not needed in my technique.

>I don't do anything about weighting by the duration of the pitches,
>but one could do that too, especially since longer duration pitches
>are perhaps likely to be pitched more exactly, or heard with more
>exactness of pitch too, I mean a grace note of just a hundredth of
>a second for instance may not be so exactly pitched as a one second
>note for instance.

Indeed. And in fact, portamenti, glissandi, and even legato note
transitions should not contribute to the scale analysis in my view.

Plus, duration-based analysis could even tell us something about
melodies played on keyboard instruments. There, it isn't needed to
distinguish note transitions from scale tones, but it could tell us
something about central vs. passing/auxiliary tones -- even
"diatonic" melodies rarely use all 7 notes equally.

>So maybe one could take that into account. I could add an option to
>FTS to print out all the pitches found as pitch / volume / duration
>triples for anyone to analyse as they please using their own software
>too.

That would be great!

>Well it is visible in the interface but as three rows of
>numbers - you could just use those too copy / paste and use them
>as input to ones program.

It does this now? What's the download url?

-Carl

🔗Graham Breed <gbreed@gmail.com>

4/14/2005 3:15:08 AM

Robert Walker wrote:

> BTW one idea I had for analysing bird song
> is to try and fit a best fit sine wave to it.

Have you looked at Csound? Chapter 21 of "The Csound Book" describes how to make a pitch to MIDI convertor. I haven't tried it out, but it looks like a lot of thought has gone into the pitch tracking utilities. I think the source code is LGPL, so you can use it, but with restrictions.

It's a few weeks since I read it, but I think they do best fits for the full harmonic series.

Graham

🔗Ozan Yarman <ozanyarman@superonline.com>

4/14/2005 5:15:46 AM

Solo Explorer 1.0 from www.recognisoft.com has a modest pitch to midi
conversion feature for monophonic wave files.

Cordially,
Ozan


🔗Carl Lumma <ekin@lumma.org>

4/14/2005 12:19:16 PM

> Solo Explorer 1.0 from www.recognisoft.com has a modest pitch
> to midi conversion feature for monophonic wave files.
>
> Cordially,
> Ozan

Hi Ozan,

Funny enough, I was looking through my correspondence with
Can Akkoc last night when I found that link. When you then
mentioned it, I downloaded their demo. Unfortunately, the
demo expired April 1st! I've sent a support request.

...well, this is a pain. I've tried gmail, yahoo mail, and
lumma.org mail, and their server keeps returning my mail as
undeliverable because of a "Non-encoded 8-bit data (char 82 hex)
in message header 'X-Source'". Hmm.

-Carl

🔗Ozan Yarman <ozanyarman@superonline.com>

4/14/2005 2:01:07 PM

Carl, you should just as well pay for it. I have done so and can tell you that it's pretty accurate and easy to use. Gaulius Raskinis has done a fine job there.

Cordially,
Ozan

🔗jon wild <wild@fas.harvard.edu>

4/14/2005 3:12:27 PM

Dear Robert and others,

you might enjoy composer David Jaffe's work on transcribing birdsong for
electronics. His piece "Imaginary Animals" is fantastic - it uses a
transformed transcription of the song of a Winter Wren. You can read about
it here (scroll down past the section on birdsong in instrumental music):

http://www.jaffe.com/IA.html

Try to find a recording of Imaginary Animals - it's a terrific piece.

--Jon

🔗Carl Lumma <ekin@lumma.org>

4/14/2005 3:21:14 PM

>>> Solo Explorer 1.0 from www.recognisoft.com has a modest
>>> pitch to midi conversion feature for monophonic wave files.
>>
>> Hi Ozan,
>>
>> Funny enough, I was looking through my correspondence with
>> Can Akkoc last night when I found that link. When you then
>> mentioned it, I downloaded their demo. Unfortunately, the
>> demo expired April 1st! I've sent a support request.
>>
>> ...well, this is a pain. I've tried gmail, yahoo mail, and
>> lumma.org mail, and their server keeps returning my mail as
>> undeliverable because of a "Non-encoded 8-bit data (char 82
>> hex) in message header 'X-Source'". Hmm.
>
> Carl, you should just as well pay for it. I have done so and
> can tell you that it's pretty accurate and easy to use.
> Gaulius Raskinis has done a fine job there.

I expect I would like to buy it, but an expired demo and
unreachable e-mail address do not encourage me to do so.

-Carl

🔗Ozan Yarman <ozanyarman@superonline.com>

4/14/2005 4:12:09 PM

You may have a point there. Best to get in touch with the Raskinis fellow. Or better yet, I can send you a copy of the demo-ware by e-mail if you like.

Cordially,
Ozan


🔗Carl Lumma <ekin@lumma.org>

4/14/2005 5:01:13 PM

>>>>> Solo Explorer 1.0 from www.recognisoft.com has a modest
>>>>> pitch to midi conversion feature for monophonic wave files.
>>>>
>>>> Hi Ozan,
>>>>
>>>> Funny enough, I was looking through my correspondence with
>>>> Can Akkoc last night when I found that link. When you then
>>>> mentioned it, I downloaded their demo. Unfortunately, the
>>>> demo expired April 1st! I've sent a support request.
>>>>
>>>> ...well, this is a pain. I've tried gmail, yahoo mail, and
>>>> lumma.org mail, and their server keeps returning my mail as
>>>> undeliverable because of a "Non-encoded 8-bit data (char 82
>>>> hex) in message header 'X-Source'". Hmm.
>>>
>>> Carl, you should just as well pay for it. I have done so and
>>> can tell you that it's pretty accurate and easy to use.
>>> Gaulius Raskinis has done a fine job there.
>>
>>I expect I would like to buy it, but an expired demo and
>>unreachable e-mail address do not encourage me to do so.
>
>You may have a point there. Best to get in touch with the Raskinis
>fellow. Or better yet, I can send you a copy of the demo-ware by
>e-mail if you like.

I'll bet you if you tried to install that demo it wouldn't work --
it looks like it has a kill switch dated April 1st. Also, I
apparently can't get in touch with Raskinis. Maybe you could
forward the following to him?

"""

Hello,

I downloaded...

http://www.recognisoft.com/bin/solexp10.exe

...but when I try to run it I get the following error...

"The certificate has expired 4/1/2005 and the application
cannot start. Please check the Web for recent software updates."

Further, when trying to send this message to support@recognisoft.com,
my mail is bounced for the following reason...

"Non-encoded 8-bit data (char 82 hex) in message header 'X-Source'"

Thanks,

-Carl

"""

🔗Yahya Abdal-Aziz <yahya@melbpc.org.au>

4/14/2005 7:19:07 PM

Carl Lumma replied to Robert Walker:
_______________________________________________________________________
...

Plus, duration-based analysis could even tell us something about
melodies played on keyboard instruments. There, it isn't needed to
distinguish note transitions from scale tones, but it could tell us
something about central vs. passing/auxiliary tones -- even
"diatonic" melodies rarely use all 7 notes equally.
...
________________________________________________________________________

It may be worth mentioning that some of the early
algorithmic composition software (in the 60s) had
parameters for controlling the duration spectrum,
based on analyses of several historically important
musical styles. So for example, you could produce a
stately processional piece, in several voices, in the
style of a (somewhat rule-bound) GF Handel, simply
by making certain scale degrees function mostly as
passing tones. Or in a more sophisticated version,
you could specify a matrix of transition probabilities
for (pitch, duration) pairs.
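
As a rough modern illustration of the transition-matrix idea (not a reconstruction of that 1960s software), a table of probabilities over (scale degree, duration) pairs can drive a simple random walk. Everything in the Python sketch below is hypothetical, including the `TRANSITIONS` table, whose numbers are invented so that degree 2 only ever appears as a short passing tone.

```python
import random

# Hypothetical table: state -> {next state: probability}.
# A state is (scale degree, duration in beats).
TRANSITIONS = {
    (1, 1.0): {(2, 0.5): 0.5, (5, 1.0): 0.5},
    (2, 0.5): {(3, 0.5): 0.7, (1, 1.0): 0.3},
    (3, 0.5): {(2, 0.5): 0.4, (5, 1.0): 0.6},
    (5, 1.0): {(1, 1.0): 0.8, (3, 0.5): 0.2},
}

def generate_melody(transitions, start, length, rng):
    """Random walk over (degree, duration) states using the given table."""
    state = start
    melody = [state]
    for _ in range(length - 1):
        choices = transitions[state]
        states = list(choices)
        weights = [choices[s] for s in states]
        state = rng.choices(states, weights=weights, k=1)[0]
        melody.append(state)
    return melody
```

Calling `generate_melody(TRANSITIONS, (1, 1.0), 16, random.Random(1))` gives a reproducible 16-note line. The same kind of table could in principle be estimated from the pitch / duration data of a real performance rather than written by hand.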

Regards,
Yahya


🔗Carl Lumma <ekin@lumma.org>

4/14/2005 9:34:17 PM

>Carl Lumma replied to Robert Walker:
>_______________________________________________________________________
>...
>
>Plus, duration-based analysis could even tell us something about
>melodies played on keyboard instruments. There, it isn't needed to
>distinguish note transitions from scale tones, but it could tell us
>something about central vs. passing/auxiliary tones -- even
>"diatonic" melodies rarely use all 7 notes equally.
>...
>________________________________________________________________________
>
>It may be worth mentioning that some of the early
>algorithmic composition software (in the 60s) had
>parameters for controlling the duration spectrum,
>based on analyses of several historically important
>musical styles. So for example, you could produce a
>stately processional piece, in several voices, in the
>style of a (somewhat rule-bound) GF Handel, simply
>by making certain scale degrees function mostly as
>passing tones. Or in a more sophisticated version,
>you could specify a matrix of transition probabilities
>for (pitch, duration) pairs.
>
>Regards,
>Yahya

Cool. Recall any references?

-Carl

🔗Yahya Abdal-Aziz <yahya@melbpc.org.au>

4/15/2005 8:17:31 PM

Carl,

Sorry, it WAS back then that I read about this stuff in
the Science library at Monash. :-) I was burning to try
something along those lines myself, but computer time on
the IBM 360 & 370 was VERY strictly rationed at the
time - especially for undergrads.

The code was (I think) FORTRAN IV, and the tone source
was - NIL! All that was output was a string of numbers
that you would then have to convert to notation by hand.

The analysis was pretty primitive; when I wrote "more
sophisticated", my tongue was very definitely in my cheek ...

It would make more sense today to try the same ideas
afresh with modern software on your own PC. If it doesn't
yet exist, maybe interested list members could get together
to write some?

Regards,
Yahya


🔗Carl Lumma <ekin@lumma.org>

4/15/2005 8:56:39 PM

>Carl,
>
>Sorry, it WAS back then that I read about this stuff in
>the Science library at Monash. :-) I was burning to try
>something along those lines myself, but computer time on
>the IBM 360 & 370 was VERY strictly rationed at the
>time - especially for undergrads.
>
>The code was (I think) FORTRAN IV, and the tone source
>was - NIL! All that was output was a string of numbers
>that you would then have to convert to notation by hand.
>
>The analysis was pretty primitive; when I wrote "more
>sophisticated", my tongue was very definitely in my cheek ...
>
>It would make more sense today to try the same ideas
>afresh with modern software on your own PC. If it doesn't
>yet exist, maybe interested list members could get together
>to write some?

I wonder if Symbolic Composer does it?

http://www.symboliccomposer.com/

-Carl

🔗Robert Walker <robertwalker@ntlworld.com>

4/16/2005 3:11:53 AM

Hi Carl,

> It never gives two simultaneous pitches, does it? If it doesn't,
> the volume part of the triple is not needed in my technique.

No, you are right, it's strictly monophonic. I do have a frequency
spectrum type pitch tracking option as well, which could find multiple
pitches. That one isn't that good at finding the starts and ends of notes.
However, what you can do with FTS is edit the results after it finds the
notes - can other wave to midi programs do this?

FTS shows blue vertical lines on the recording for all the note boundaries
it found - and you can click anywhere to add or remove them,
then get it to refind all the frequencies for the notes again,
using just the note boundaries you have marked out yourself.

If you do that, then the frequency spectrum (FFT) version of the note finding
is pretty good at getting the exact pitches also,
because it uses peak interpolation methods to refine the estimate of
the pitch at the peak. With various tweaks, it is now roughly comparable
with the exact wave count in its precision, though the exact wave count has a
slight edge, I think. Both, it seems, can find pitch accurate to a few hundredths of a cent
or so for computer generated steady pitched notes of a second or so. The
wave counting routinely manages better than a hundredth of a cent for a one
second note if the waveform is suitable - e.g. computer generated exact
pitches of a regular waveform shape.

So if you want exact pitches, and aren't too bothered about the exact rhythm,
it makes sense to analyse rich timbres in FTS using FFT with user interaction
to find the note boundaries.

So one could do that, and then it could find more than one simultaneous pitch
as it could then find all the partials for all those
notes if one wanted them.

At present it just tries to match a harmonic series
to them to find the note, and that part isn't particularly configurable yet, you
just have to try it as it is. But you can get it to show its analysis of a
particular note - and later I can have a go at refining it somewhat. I
could at any rate easily get it to output all the simultaneous pitches it finds
as it goes along the recording to make data for anyone who wants to analyse
them as they please.
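
How FTS matches a harmonic series is not spelled out here, so the following Python sketch is only one plausible version of the idea: try each partial divided by a small integer as the fundamental, and count how many partials land near a harmonic of it. The name `fit_harmonic_series`, the 15 cent tolerance, the limit of 8 harmonics, and the tie-break towards the higher fundamental are all assumptions.

```python
import math

def fit_harmonic_series(partials_hz, max_harmonic=8, tolerance_cents=15.0):
    """Pick the fundamental whose harmonics explain the most partials.

    Candidates are each partial divided by 1..max_harmonic. Ties go to the
    higher candidate, to avoid settling on a subharmonic.
    Returns (fundamental_hz, matched_partial_count)."""
    best_f0, best_hits = None, 0
    for p in partials_hz:
        for k in range(1, max_harmonic + 1):
            f0 = p / k
            hits = 0
            for q in partials_hz:
                h = round(q / f0)  # nearest harmonic number
                if 1 <= h <= max_harmonic:
                    error = 1200.0 * math.log2(q / (h * f0))
                    if abs(error) <= tolerance_cents:
                        hits += 1
            if hits > best_hits or (hits == best_hits and best_f0 is not None
                                    and f0 > best_f0):
                best_f0, best_hits = f0, hits
    return best_f0, best_hits
```

Partials that miss every harmonic by more than the tolerance are simply left unmatched, which is where inharmonic timbres such as bells would need a different model.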

But I don't have any plans at all to attempt polyphony; I have made a design
decision in advance just not to attempt that. The task seems a hard one
- my hunch is that it is a sort of pattern recognition thing like
understanding speech or recognising faces, which are both things that
are particularly hard to program. You need to reckon to devote all your programming
and research time to the task for some years if you expect to get anywhere at
all in those areas, I think.

The ear can hear a whole jumble of partials and instantly pick out that it
consists of, say, a bassoon, oboe and violin playing a chord, while a program
will have a much harder time of that, because how would it tell which
partials belong with which instruments? Maybe by fitting harmonic series for
harmonic partials, but there may be some inharmonicity, and the notes may
very well also be played in near to harmonic series type harmonies with each
other. So maybe really it would need a bank of templates of likely sounds
which it could then try and match against the recording, so you could tell
it that the recording uses such and such instruments, e.g. a guitar, or may have one in
it, and it could then look for the characteristic fingerprints of that
instrument.

I think if one were ever to be presented with an orchestra consisting of
entirely unfamiliar instruments, it's possible it might take one quite a
while to learn to recognise them - if there were no familiar ones in the orchestra
at all to get you started. Because the same partials could be divided up
in different ways. E.g. a flute and oboe in unison could just as easily
be a single "flute oboe" instrument say. Or e.g. the sound of a
car or a door creaking could be made up of a large number of instruments
playing quiet sounds, and I'm not sure, if one were unfamiliar with
doors, that one would know that that complex sound was a single thing rather
than the unison of several things played at once well synchronised.

So that's the situation that the computer program faces: no instruments
are familiar to it at all, unless you can figure out how to program it to
recognise them. It could listen to a bell and hear ten tuning forks
played at different pitches and volumes simultaneously.

I think that once these polyphonic programs are a bit
more advanced, they will surely have to build in experience of the timbres
of real instruments in order to follow polyphonic lines
in complex musical textures. That's my hunch anyway.

Anyway it would be too much for me to attempt that as well, so yes, I'm
specialising on monophonic lines in FTS.

> >I don't do anything about weighting by the duration of the pitches,
> >but one could do that too, especially since longer duration pitches
> >are perhaps likely to be pitched more exactly, or heard with more
> >exactness of pitch too, I mean a grace note of just a hundredth of
> >a second for instance may not be so exactly pitched as a one second
> >note for instance.

> Indeed. And in fact, portamenti, glissandi, and even legato note
> transitions should not contribute to the scale analysis in my view.

Rightio. Another thing to bear in mind as well is that the
pitch detection is more accurate the longer the note is
so very short grace notes may not be recognised quite so
accurately by FTS. Depending that is on the quality of
the recording, the better the quality then the more easily
it will be able to detect shorter notes.

The bird song recordings I tested it with have all
been 8-bit just because that's what I found on the
web sites of bird song I tried, though I didn't look
very far, surely there must be other ones with higher
resolution recordings.

BTW I've just done some more tweaking and fixed a bug,
and as a result it is getting the robin song better
now, see what you think:

http://www.robertinventor.com/Robin_v2.mid

On Celeste for a bit of fun :-).

compare with the original 8 bit recording:
on this page:
http://www.scricciolo.com/eurosongs/canti.htm

European Robin Erithacus rubecula:

http://www.scricciolo.com/eurosongs/Erithacus.rubecula.wav

The one thing FTS can't do at the moment is to deal
with repeated notes with very brief rests between them.
It just runs them all together treating the rest between
the notes as a bit of interference in the signal. The thing is that it
pays no attention to the amplitude particularly, except to ignore
all information below a threshold to deal with noise.
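
As a rough illustration of how the amplitude could be used to separate repeated notes, here is a minimal sketch. It is not how FTS works; the function name `split_on_rests`, the per-frame loudness values and the threshold are all hypothetical. It cuts a note wherever the loudness stays below the threshold for at least `min_rest_frames` frames.

```python
def split_on_rests(envelope, threshold, min_rest_frames=1):
    """Return (start, end) frame ranges, end exclusive, separated by quiet runs."""
    notes = []
    start = None   # frame where the current note began, or None between notes
    quiet = 0      # length of the current run of quiet frames inside a note
    for i, level in enumerate(envelope):
        if level >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_rest_frames:
                # The rest is long enough: close the note where the quiet run began.
                notes.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        notes.append((start, len(envelope) - quiet))
    return notes

# Hypothetical loudness per frame: two repeated notes with a one-frame rest,
# then a two-frame rest before a third note.
envelope = [0.9, 0.8, 0.1, 0.9, 0.7, 0.05, 0.05, 0.6]
separate = split_on_rests(envelope, threshold=0.3, min_rest_frames=1)
merged = split_on_rests(envelope, threshold=0.3, min_rest_frames=2)
print(separate)
print(merged)
```

With `min_rest_frames=1` the one-frame rest splits the repeated note in two; with `min_rest_frames=2` it is treated as interference and the two are run together, which is the behaviour described above.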

You'll notice in the robin clip that the first high
note is just a single pitch rather than a repeated
one. It should repeat it twice, each time with a
rapid decay with a fast tremolo effect, if one
listens to the recording slowed down. The Celeste
helps there by being a short duration note at least
- it actually just plays a single long note through
all that part of the song so if you play it on e.g.
whistle it puts a lot of emphasis on the note
and it changes the perceived shape of the melody line
rather a lot at that point. So playing on
Celeste helps to make the melody line sound
more similar.

Actually it would be pretty easy to
just extract a volume envelope for the entire
recording and superimpose that on the pitches
played - play the pitches all at the same volume
on say a whistle, or oboe or whatever depending
on the bird, and then use the midi expression controller
on top of that to match the original volumes
of the recording exactly.
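
A minimal sketch of that idea, under stated assumptions: the envelope is taken as the RMS level of fixed-size frames, and it is scaled linearly so the loudest frame maps to 127, the top of the range of the MIDI expression controller (controller number 11). The frame size, the linear mapping and the test tone are illustrative choices, not what FTS does.

```python
import math

def rms_envelope(samples, frame_size):
    """Return one RMS level per non-overlapping frame of `frame_size` samples."""
    envelope = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        envelope.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return envelope

def to_expression_values(envelope):
    """Scale an envelope linearly to 0..127 so the loudest frame maps to 127."""
    peak = max(envelope, default=0.0)
    if peak == 0.0:
        return [0] * len(envelope)
    return [round(127 * level / peak) for level in envelope]

# Hypothetical input: a 440 Hz tone at 8000 samples/s, decaying over half a second.
rate = 8000
samples = [math.exp(-6.0 * n / rate) * math.sin(2 * math.pi * 440 * n / rate)
           for n in range(rate // 2)]
cc11 = to_expression_values(rms_envelope(samples, frame_size=400))
print(cc11)
```

Each value in `cc11` would be sent as a controller 11 message at the start of its frame, while the notes themselves are all played at the same velocity. A linear mapping is the simplest choice; a synth may respond to expression on a different curve.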

I may give that a go, which will get the
subtleties of tremolo at least though not
vibrato or glissandi of course, and it will deal with repeated
notes at least as far as the effect on the listener
is concerned. They use vibrato of course,
but not quite as much as one might expect,
many of the songs use just pitch glissandi
and then some tremolo now and again, or
alternatively a little in the way of vibrato.
The glissandi too aren't continuous, some
notes are discrete steady pitches, and
others are glissandi and if you listen to it
slowly, it really doesn't run that many
notes together either. It's quite a bit like
human speech, with phrases with gaps between
them.

It's clear that birds do have faster reactions than
humans and live a bit faster, and I wonder if
they could be so very much faster that they
can actually hear all those details in their songs
as they sing... It is interesting to speculate.
I don't know how one could find out.

Some birds have a lot of vibrato.
The curlew has a continuous vibrato,
and the meadow pipit also has a lot of vibrato
of the ones I've done.

BTW if you listen to the robin slowed down
it sings quite a few very steady discrete pitches though it does
have some legato / portamento type glissandi
- but lots of very steady notes amongst it.

> Plus, duration-based analysis could even tell us something about
> melodies played on keyboard instruments. There, it isn't needed to
> distinguish note transitions from scale tones, but it could tell us
> something about central vs. passing/auxiliary tones -- even
> "diatonic" melodies rarely use all 7 notes equally.

> >So maybe one could take that into account. I could add an option to
> >FTS to print out all the pitches found as pitch / volume / duration
> >triples for anyone to analyse as they please using their own software
> >too.

> That would be great!

Done! I've done it so it saves them as a comma separated values
file so it shows up in database programs, and should be easy for
a program to read.
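
As a sketch of the kind of analysis such a file allows, the following reads the triples and adds up how long the performance spends near each pitch, folded into one octave. The column order (pitch, volume, duration), the units (Hz and seconds), the absence of a header row and the sample data are all assumptions for illustration; the actual FTS file layout is not described here.

```python
import csv
import io
import math
from collections import defaultdict

def read_triples(text):
    """Parse rows of pitch, volume, duration (assumed column order, no header)."""
    return [tuple(float(field) for field in row)
            for row in csv.reader(io.StringIO(text)) if row]

def scale_histogram(rows, reference_hz, bin_cents=10, min_duration=0.05):
    """Sum note durations into octave-folded bins `bin_cents` wide."""
    bins_per_octave = round(1200 / bin_cents)
    totals = defaultdict(float)
    for pitch_hz, _volume, duration in rows:
        if duration < min_duration:
            continue  # leave out grace notes and brief transitions
        cents = (1200 * math.log2(pitch_hz / reference_hz)) % 1200
        bin_index = round(cents / bin_cents) % bins_per_octave
        totals[bin_index * bin_cents] += duration
    return dict(totals)

# Hypothetical data: the last row is a very short note that gets skipped.
sample = "440.0,0.8,1.0\n660.0,0.5,0.5\n880.0,0.7,0.75\n445.0,0.3,0.02\n"
histogram = scale_histogram(read_triples(sample), reference_hz=440.0)
for cents in sorted(histogram):
    print(cents, histogram[cents])
```

The peaks of such a histogram are candidate scale degrees in cents above the reference pitch. The volume column is read but not used.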

> >Well it is visible in the interface but as three rows of
> >numbers - you could just copy / paste those too and use them
> >as input to one's program.

> It does this now? What's the download url?

I'm just getting it ready now with the new changes and
will upload it and let you know when it is ready.

Done - I've just sent the url to you privately.

Note to everyone else:

This update isn't ready for release quite yet.
It will be quite soon. Meanwhile
if you are very keen just ask and you can try
it out and see what you make of the feature.
However I like to know who is testing it out
at this stage.

It is definitely in a state of flux, this particular
feature at least, and may not work in quite
the same way when it comes to the release as it does
right now. You may spend some time tweaking the
settings to get it to work well with some particular
instrument then with the next upload everything gets
changed and you have to start again, for instance.

But if you are keen to give it a go anyway let me know.
BTW it does install as a separate program so you
don't need to worry about it interfering with your
installation of FTS 2.4 if you have that already.
I plan to keep it like that for the release because
there has been quite a change in some sections and
some users may well want to run both programs
concurrently until they get used to 3.0.
Hopefully they will feel it has improved and
it is easier to find one's way around it
though with even more features there
is yet more to distract one on one's search to
find out how to do some particular thing
:-).

Robert

🔗Robert Walker <robertwalker@ntlworld.com>

4/16/2005 3:31:46 AM

Hi Jon,

Thanks, I had a look at the site and his home page
and I've bookmarked it to come back to and
read at my leisure later.

Yes I'd like to hear his "Imaginary Animals"
with the transformed Winter Wren recording.

I've tried listening to bird songs slowed
down and they sound so kind of human
or animal like in timbre at the slower speeds.
I can imagine they could morph well.

Robert

🔗Carl Lumma <ekin@lumma.org>

4/16/2005 11:55:26 AM

>Hi Carl,
>
>> It never gives two simultaneous pitches, does it? If it doesn't,
>> the volume part of the triple is not needed in technique.
>
>No, you are right, it's strictly monophonic.

That's a good thing, for my present purposes.

>I do have a frequency spectrum type pitch tracking option as well
>which could find multiple pitches. That one isn't that good at finding
>the starts and ends of notes. However what you can do with FTS is to
>edit the results after it finds the notes - can other wave to midi
>programs do this?

Many of them can, yes. The best one in this regard is Widisoft (.com).

>FTS shows blue vertical lines on the recording for all the note
>boundaries it found - and you can click anywhere to add or remove
>them, then get it to refind all the frequencies for the notes again,
>using just the note boundaries you have marked out yourself.

I'll have to check it out.

>If you do that then the frequency spectrum (FFT) version of the note
>finding is pretty good at getting the exact pitches also, because it
>uses peak interpolation methods to refine the estimate of the pitch
>at the peak. With various tweaks, it is now roughly comparable with
>the exact wave count in its precision though the exact wave count has
>a slight edge I think. Both it seems can find pitch accurate to a few
>hundredths of a cent or so for computer generated steady pitched notes
>of a second or so. The wave counting routinely manages better than a
>hundredth of a cent for a one second note if the waveform is
>suitable - e.g. computer generated exact pitches of a regular
>waveform shape.

Wow.
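
For anyone following along, one common form of peak interpolation fits a parabola through the log magnitudes of the strongest FFT bin and its two neighbours, and takes the top of the parabola as the frequency. The thread does not say which method FTS uses, so this is only a sketch of the general idea; the tone, sample rate and frame length are made up, and a slow direct DFT is used so that nothing outside the standard library is needed.

```python
import cmath
import math

def hann_dft_magnitudes(samples):
    """Magnitudes of the lower half of the DFT of a Hann-windowed frame."""
    n = len(samples)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(samples)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2)]

def interpolated_peak_hz(mags, rate, n):
    """Return (interpolated frequency, frequency of the raw peak bin)."""
    k = max(range(1, len(mags) - 1), key=lambda i: mags[i])
    a, b, c = (math.log(mags[j]) for j in (k - 1, k, k + 1))
    delta = 0.5 * (a - c) / (a - 2 * b + c)  # parabola vertex offset, in bins
    return (k + delta) * rate / n, k * rate / n

def cents_between(f1, f2):
    """Interval from f2 up to f1 in cents."""
    return 1200 * math.log2(f1 / f2)

rate, n, true_hz = 8000, 256, 1010.0
frame = [math.sin(2 * math.pi * true_hz * i / rate) for i in range(n)]
estimate_hz, raw_hz = interpolated_peak_hz(hann_dft_magnitudes(frame), rate, n)
print(raw_hz, cents_between(raw_hz, true_hz))
print(estimate_hz, cents_between(estimate_hz, true_hz))
```

The frame here is only 32 ms long, so the bins are 31.25 Hz apart and the interpolated estimate is still far coarser than the figures quoted for one second notes. A longer frame narrows the bins and tightens the estimate accordingly.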

>But I don't have any plans at all to attempt polyphony, have made a
>design decision in advance just not to attempt that. The task seems
>a hard one - my hunch is that it is a sort of pattern recognition
>thing like understanding speech or recognising faces which are both
>things that are particularly hard to program.

Techniques exist that'll get you 85% accuracy or something. You
can try any of the packages Yahya mentioned to see how miserable
that is. Getting 100% accuracy is perhaps an "AI-complete"
problem. My friend researches it at a startup in Palo Alto.

>The ear can hear a whole jumble of partials and instantly pick out
>that it consists of say a bassoon, oboe and violin playing a chord,
>while a program will have a much harder time of that because how
>would it tell which partials belong with which instruments?

The *trained* ear can do that. It's called stream separation.
It's the holy grail of auditory scene analysis.

>Maybe by fitting harmonic series for harmonic partials, but there
>may be some inharmonicity and the notes may very well also be
>played in near to harmonic series type harmonies with each other.
>So, maybe really it would need a bank of templates of likely sounds
>which it could then try and match against the recording, so you
>could tell it that it uses such and such instruments, e.g. a guitar,
>or may have one in it, and it could then look for the characteristic
>fingerprints of that instrument.

My guess is that some notion of rhythm is essential.

>I think if one were ever to be presented with an orchestra consisting of
>entirely unfamiliar instruments, it's possible it might take one quite a
>while to learn to recognise them - if there were no familiar ones in the
>orchestra at all to get you started.

I don't doubt that fingerprinting is at work, but I don't think it's
a primary technique. The brain uses spatial and visual cues in a live
situation. Without those, it still has an expectation of when to expect
the next note in each part, at least for non-serial music. If you
look at hi hats and such, their spectral profile can differ greatly
depending on how they're struck. But probably the attack envelope of
an instrument is more used for fingerprinting than its spectrum. But
who knows how the best listeners do it...

>So that's the situation that the computer program faces, no instruments
>are familiar to it at all, unless you can figure out how to program it to
>recognise them.

Some of the packages Yahya mentioned use instrument fingerprints.

>> >I don't do anything about weighting by the duration of the pitches,
>> >but one could do that too, especially since longer duration pitches
>> >are perhaps likely to be pitched more exactly, or heard with more
>> >exactness of pitch too, I mean a grace note of just a hundredth of
>> >a second for instance may not be so exactly pitched as a one second
>> >note for instance.
>
>> Indeed. And in fact, portamenti, glissandi, and even legato note
>> transitions should not contribute to the scale analysis in my view.
>
>Rightio. Another thing to bear in mind as well is that the
>pitch detection is more accurate the longer the note is
>so very short grace notes may not be recognised quite so
>accurately by FTS.

That's ok for the kind of analysis I'm doing, since I won't even
be looking at the pitches of the shorter duration notes.

>The bird song recordings I tested it with have all
>been 8-bit just because that's what I found on the
>web sites of bird song I tried, though I didn't look
>very far, surely there must be other ones with higher
>resolution recordings.
>
>BTW I've just done some more tweaking and fixed a bug,
>and as a result it is getting the robin song better
>now, see what you think:
>
>http://www.robertinventor.com/Robin_v2.mid
>
>On Celeste for a bit of fun :-).
>
>compare with the original 8 bit recording:
>on this page:
>http://www.scricciolo.com/eurosongs/canti.htm
>
>European Robin Erithacus rubecula:
>
>http://www.scricciolo.com/eurosongs/Erithacus.rubecula.wav

Needs some work, I'd say.

>The one thing FTS can't do at the moment is to deal
>with repeated notes with very brief rests between them.
>It just runs them all together treating the rest between
>the notes as a bit of interference in the signal. The thing
>is that it pays no attention to the amplitude particularly,
>except to ignore all information below a threshold to deal
>with noise.

Oh. That could be a problem.

>Actually it would be pretty easy to
>just extract a volume envelope for the entire
>recording and superimpose that on the pitches
>played - play the pitches all at the same volume
>on say a whistle, or oboe or whatever depending
>on the bird, and then use the midi expression controller
>on top of that to match the original volumes
>of the recording exactly.

Good idea!

>It's clear that birds do have faster reactions than
>humans and live a bit faster, and I wonder if
>they could be so very much faster that they
>can actually hear all those details in their songs
>as they sing... It is interesting to speculate.

Indeed.

>> Plus, duration-based analysis could even tell us something about
>> melodies played on keyboard instruments. There, it isn't needed to
>> distinguish note transitions from scale tones, but it could tell us
>> something about central vs. passing/auxiliary tones -- even
>> "diatonic" melodies rarely use all 7 notes equally.
>
>> >So maybe one could take that into account. I could add an option to
>> >FTS to print out all the pitches found as pitch / volume / duration
>> >triples for anyone to analyse as they please using their own software
>> >too.
>
>> That would be great!
>
>Done! I've done it so it saves them as a comma separated values
>file so it shows up in database programs, and should be easy for
>a program to read.

Wonderful. I can probably code what I want to do in Excel!

>> >Well it is visible in the interface but as three rows of
>> >numbers - you could just copy / paste those too and use them
>> >as input to one's program.
>
>> It does this now? What's the download url?
>
>I'm just getting it ready now with the new changes and
>will upload it and let you know when it is ready.
>
>Done - I've just sent the url to you privately.

Got it.

-Carl

🔗Robert Walker <robertwalker@ntlworld.com>

4/18/2005 2:59:23 PM

Hi Carl,

> >European Robin Erithacus rubecula:
> >
> >http://www.scricciolo.com/eurosongs/Erithacus.rubecula.wav

> Needs some work, I'd say.

I agree. Here is the latest version now :-)

http://www.robertinventor.com/Robin_v3.mid

again compare
http://www.scricciolo.com/eurosongs/Erithacus.rubecula.wav

I've added expression, using the original
volume envelope, and also found a better way
of dealing with the undulations - which I think
now are created by low frequency background
noise, not necessarily the birds song as such
- this recording like most of the bird song
recordings you find has a fair bit of background
noise (in fact they seem to prefer them
with a bit of natural background noise rather than
studio clean type sounds - see this page:
http://www.naturesongs.com/recordists/nrfiles.html
with the discussion of the recording of the
two Slaty-backed Nightingale Thrushes for instance.)

Robert

🔗Robert Walker <robertwalker@ntlworld.com>

4/18/2005 4:21:57 PM

Hi Yahya

> [YA] I think it was Widisoft that had FFT as its "basic" algorithm,
> with a choice of three other "advanced" settings. Are they doing
> anything you haven't tried?

I think mainly trying the same thing - FFT. But they don't
go into techy details. It's possible that the newer
algorithms may be using wavelet analysis, which I looked
into - but it seemed complex to use, and I got the impression
that while improving time resolution over FFT, it
might reduce the pitch resolution a bit, so it may not
be quite what I'm looking for myself as I'm aiming
for the highest possible pitch resolutions I can
achieve. But whether one could combine the two
and use wavelet analysis as a first run through to
find the note boundaries, then use FFT on a note
per note basis to find the pitches... Maybe that's
what they do. Also, trying it out, I found that if you use
the rich polyphony setting with a monophonic recording,
it plays notes for many of the pitches that make up
the timbre.

Though I won't be trying polyphony as
such, I could do that - do chords consisting of
all the partials found for each of the notes for
the FFT method, which could be fun to do, -
not of much use seriously for transcribing polyphonic music, because
you would end up with so many pitches for each
note - but it might nevertheless **sound** quite
a bit like the original recording done that way.
Again they seem to use a harmonic series type
search to find the notes in polyphonic parts,
which I also do for monophonic parts, in the case
of FTS, to try and locate the fundamental - and improve
its pitch accuracy for harmonic timbres.
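
A minimal sketch of a harmonic series search of this general kind, not the routine FTS or Widisoft actually use: it tries treating the lowest measured partial as the 1st, 2nd, 3rd... harmonic, accepts the first candidate fundamental for which every partial sits close to a whole-number multiple, then refines the fundamental by a least-squares fit over all the partials. The function name, tolerance and partial frequencies are hypothetical.

```python
def fit_fundamental(partials_hz, max_divisor=8, tolerance=0.03):
    """Return (fundamental_hz, harmonic_numbers), or None if nothing fits."""
    lowest = min(partials_hz)
    for divisor in range(1, max_divisor + 1):
        candidate = lowest / divisor
        ratios = [p / candidate for p in partials_hz]
        harmonics = [round(r) for r in ratios]
        if all(abs(r - h) <= tolerance for r, h in zip(ratios, harmonics)):
            # Least-squares fit of p = h * f0 over all the partials.
            refined = (sum(h * p for h, p in zip(harmonics, partials_hz))
                       / sum(h * h for h in harmonics))
            return refined, harmonics
    return None

# Hypothetical measured partials of a note whose fundamental itself is missing.
partials = [441.0, 659.8, 880.5, 1099.6]
fundamental_hz, harmonic_numbers = fit_fundamental(partials)
print(fundamental_hz, harmonic_numbers)
```

Taking the first candidate that fits, rather than the best fitting one, avoids landing an octave or more too low, since any subharmonic of the true fundamental also fits. Averaging over several partials is what can improve the pitch accuracy for harmonic timbres; with inharmonic partials the fit would be biased.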

> effective. ... So I think it's likely that birds CAN
> hear almost anything we can extract from their
> song with modern tools, at least down to a limit that
> is, say, proportionate to their "rate of living". Maybe
> a bird that lives seven years instead of seventy would
> discriminate events ten times shorter? There's an
> approximate "law" that each bird and animal has a
> lifetime of about the same number of heartbeats.

It's at about tenth speed in fact that the
sounds break up into syllables and sound
like animal and human voice sounds.
Maybe that's suggestive?

> [YA] Robert, doubtless at some time in the near
> future I would like to try out your innovations.
> But I think I'd better wait a little for the dust
> to settle ... :-)

Good idea. As you'll hear from my last post to
Carl it is taking shape but there's a bit
still to do before it is really presentable for the user.

I spent quite a few hours of tweaking the settings
before I could get it to do that one. However, it will be
available as one of the options in the drop list
of settings so a user will just need to select
it - or try various ones for particular
bird song - and that one will probably work well with
a number of other bird songs.

Robert

🔗Graham Breed <gbreed@gmail.com>

4/18/2005 8:31:39 PM

Robert Walker wrote:

> I think mainly trying the same thing - FFT. But they don't
> go into techy details. It's possible that the newer
> algorithms may be using wavelet analysis, which I looked
> into - but it seemed complex to use, and I got the impression
> that while improving time resolution over FFT, it
> might reduce the pitch resolution a bit, so it may not
> be quite what I'm looking for myself as I'm aiming
> for the highest possible pitch resolutions I can
> achieve. But whether one could combine the two
> and use wavelet analysis as a first run through to
> find the note boundaries, then use FFT on a note
> per note basis to find the pitches... Maybe that's
> what they do.

Wavelets mean you can use different time resolutions for different
frequency bands. If you want good time and pitch resolution, you
concentrate on the higher partials. I don't think there'd be any need
to do an FFT as well.

Graham

🔗Robert Walker <robertwalker@ntlworld.com>

4/19/2005 7:15:43 AM

Hi Graham,

> Wavelets mean you can use different time resolutions for different
> frequency bands. If you want good time and pitch resolution, you
> concentrate on the higher partials. I don't think there'd be any need
> to do an FFT as well.

Thanks for the correction.

I found this:

http://www.cnmat.berkeley.edu/~tristan/Report/node4.html

The way he puts it is that short time frame FFT
uses a single window size. Just to keep
anyone reading this in the picture: FFT
windowing is an added fade in and fade out of
the recording - or the section of the recording
you want to analyse. The frequency analysing
software puts it in temporarily whenever
it finds a frequency spectrum. If you don't do it
then you may get artefacts introduced because
the FFT picks up on the length of the recording
itself as a frequency.
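
A small sketch of that effect, using a made-up test tone that does not fit a whole number of cycles into the section analysed. It compares how much of the spectrum spills into bins far from the tone with and without a fade in and fade out (a Hann window here, which is one common choice). A slow direct DFT is used so the example needs only the standard library.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of the lower half of the DFT, computed directly."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples)))
            for k in range(n // 2)]

def hann(n):
    """Fade in / fade out curve: 0 at the start, 1 in the middle."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def leakage(mags, first_far_bin=30):
    """Largest magnitude well away from the tone, relative to the peak."""
    return max(mags[first_far_bin:]) / max(mags)

n = 128
# 10.5 cycles in the frame, so the section starts and ends mid-cycle.
tone = [math.sin(2 * math.pi * 10.5 * i / n) for i in range(n)]
faded = [s * w for s, w in zip(tone, hann(n))]

plain_leak = leakage(dft_magnitudes(tone))
faded_leak = leakage(dft_magnitudes(faded))
print(plain_leak, faded_leak)
```

The windowed version leaks far less into distant bins, at the cost of a somewhat wider main peak around the tone itself.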

So the way he puts it is that wavelets use a frequency dependent time window, so you use
different window times for it dependent on the frequency, which makes the wavelet
idea make more sense to me now.

So the high frequencies
have good time resolution for wavelets because
the time window is short - but low pitch resolution
- and low frequencies are the other way around.

So I suppose you use the high frequencies to get
the timing information and the low frequencies to get
the precise pitch estimations.
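
To put rough numbers on this, here is a sketch assuming a wavelet whose window always spans the same number of cycles of the frequency being analysed (32 cycles here, an arbitrary choice), and using the rule of thumb that frequency resolution in Hz is about one over the window length in seconds.

```python
import math

def constant_q_resolution(freq_hz, cycles):
    """Window length (s), resolution in Hz and resolution in cents."""
    window_s = cycles / freq_hz
    bandwidth_hz = freq_hz / cycles  # rule of thumb: about 1 / window length
    bandwidth_cents = 1200 * math.log2((freq_hz + bandwidth_hz / 2)
                                       / (freq_hz - bandwidth_hz / 2))
    return window_s, bandwidth_hz, bandwidth_cents

results = {f: constant_q_resolution(f, cycles=32)
           for f in (110.0, 440.0, 1760.0, 7040.0)}
for f, (window_s, hz, cents) in results.items():
    print(f, round(window_s * 1000, 2), "ms", round(hz, 2), "Hz", round(cents, 1), "cents")
```

Under these assumptions the window gets shorter and the resolution in Hz gets coarser as the frequency rises, but the resolution measured in cents stays the same at every frequency.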

It says there that wavelets are supposed to be a good model for human hearing, but as
we have much poorer pitch resolution for very low frequencies, it's rather the other way around
from the way the wavelets work.
I wonder how that works out...

Robert

🔗Carl Lumma <ekin@lumma.org>

4/19/2005 7:44:03 AM

>It says there that wavelets are supposed to be a
>good model for human hearing, but as
>we have much poorer pitch resolution for very
>low frequencies, it's rather the other way around
>from the way the wavelets work.
>I wonder how that works out...

In some loose sense, it sounds like your counting
of zero-crossings method (if I understood you right)
is closer to what the ear does. The cochlea is
closer to a bank of resonant filters, but presumably
neurons count zero crossings from each filter at a
later stage. Then there is transient detection and
spatial location, which amount to the use of the
hand-edited note transitions we were talking about.
Finally, apparently all of this information is then
used to control the system, feedback-style. The
most glaring example is the selective amplification
of certain frequencies through SOAEs (Spontaneous
OtoAcoustic Emissions).
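
For reference, a minimal sketch of pitch finding by counting zero crossings. Whether this matches the wave count in FTS is not stated in the thread, so treat it as the general idea only. It locates each upward zero crossing by linear interpolation between samples, then divides the number of whole periods by the time they span. The test tone and sample rate are made up.

```python
import math

def upward_crossing_times(samples, rate):
    """Times (s) at which the signal crosses zero going upwards."""
    times = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if prev < 0.0 <= cur:
            # Linear interpolation between the two samples either side of zero.
            frac = -prev / (cur - prev)
            times.append((i - 1 + frac) / rate)
    return times

def frequency_from_crossings(times):
    """Whole periods counted, divided by the time from first to last crossing."""
    if len(times) < 2:
        raise ValueError("need at least two crossings")
    return (len(times) - 1) / (times[-1] - times[0])

rate, true_hz = 44100, 440.25
samples = [math.sin(2 * math.pi * true_hz * n / rate + 0.3) for n in range(rate)]
estimate_hz = frequency_from_crossings(upward_crossing_times(samples, rate))
error_cents = 1200 * math.log2(estimate_hz / true_hz)
print(estimate_hz, error_cents)
```

This works well on a clean, steady, computer generated waveform like the one above. Noise or strong partials can add extra crossings within a period, so real recordings would need filtering or a check on the spacing of the crossings first.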

-Carl