back to list

Diatonic Key Detection

🔗Gary Morrison <71670.2576@...>

10/21/1996 10:17:15 PM
I'm pondering a problem that, while not tuning itself, is certainly related
to the field of tuning: "Here's a MIDI file of some music; what key is it in?"
Or more specifically, which key signature it's in?

Two very different ways of doing that came to mind, and perhaps a
heuristically reasonable weighting of the two would make the best sense. Or
perhaps somebody out there has a better algorithm?

The first was to create an array of pitch classes in the order along the
circle of fifths (i.e., not C C# D..., but ... Eb Bb F C G D A...), tally up the
computer equivalent of tick-marks each time a pitch class is used, and then
choose the key from the most commonly used seven adjacent notes.

The tricky part of in this algorithm is where to tally up enharmonic
equivalents. In concept you'd probably be best to have a 19-pitch-class array
(Cb Gb Db Ab Eb Bb F C G D A E B F# C# G# D# A# E#), and tally up a note as all
of its enharmonic equivalents. That is, if an Eb key goes down, tally it up
both as an Eb and as a D#. If the music being examined is in the key of A major
for example, the resulting histogram will show:
1. A whole lot of notes in D-G# region,
2. A substantial, but overall lower-commonness, "alias" in the Cb-Ab part of
the array, and
3. A few stray accidentals largely down in the proverbial noise, and mostly
clustering around the main D-G# region.

The other method that came to mind was somewhat more like what a human would
concentrate on: To look at structurally significant spots in the sequence -
cadential points in short - and attempt to choose a tonal center around that.
The obvious problem with that is that it's very difficult to automate to say the
least. It's essentially an artificial intelligence problem rather than an
algorithmic problem.

But perhaps some basic heuristics of this artificial-intelligence-like
approach can be used to augment the first approach. Most importantly to prevent
there being an "off-by-one" ambiguity sort of problem. In particular, what pops
into mind are:
1. Tally up fewer, or no points at all, for clearly nondiatonic constructs.
For example, if a note is a chromatic passing tone a half-step away from
both of its neighbors, give it no points at all. Or only give it half a
point if it is a minor third above the preceding note and a minor third
below the next (or the descending equivalent).
2. Tally up more points for structurally significant intervals, like 5ths.
3. Tally up more points for the notes toward the beginning and end of the
fragment.
4. Give slightly more points to notes toward the middle of the circle (why
play something in D# major instead of Eb major?!).
5. Tally up more points for notes with longer durations, or higher
velocities. This rule could be risky though, since there's strong value
in accenting (agogically or by volume) chromatically altered notes for
drama. On the other hand, failure to do this could be even more risky,
in that one innocent little #4 to 5 trill could easily make the tally for
#4 four or five times that for any of the diatonic notes! That would
obviously throw the results totally out of whack.

Does that seem like a reasonable approach? Anybody have a more elegant
solution. Or perhaps other heuristic refinements to consider?

Actually, the problem I'm interested in is not exactly key-signature
detection, but a closely related problem: Diatonic transposition. By that I
mean pushing up by a diatonic third for example, predominantly diatonic music
fragment. So, in this example, scale degree 2 goes up a minor third to 4, while
5 goes up a major third to 7. Perhaps there's a more elegant algorithm for that
problem than my assumed solution of finding the key and moving the original
notes diatonically within that key?


Received: from ns.ezh.nl [137.174.112.59] by vbv40.ezh.nl
with SMTP-OpenVMS via TCP/IP; Tue, 22 Oct 1996 10:35 +0200
Received: by ns.ezh.nl; (5.65v3.2/1.3/10May95) id AA06595; Tue, 22 Oct 1996 10:36:51 +0200
Received: from eartha.mills.edu by ns (smtpxd); id XA06518
Received: from by eartha.mills.edu via SMTP (940816.SGI.8.6.9/930416.SGI)
for id BAA10017; Tue, 22 Oct 1996 01:36:46 -0700
Date: Tue, 22 Oct 1996 01:36:46 -0700
Message-Id: <009AA38D0BC028E0.3FB8@vbv40.ezh.nl>
Errors-To: madole@ella.mills.edu
Reply-To: tuning@eartha.mills.edu
Originator: tuning@eartha.mills.edu
Sender: tuning@eartha.mills.edu