Developing Music with Software Tools



Presentation at the workshop

Contemporary Music Digital Publishing

Centro Tempo Reale, Florence, Italy

26th November 2001

rev. February 2002


Part 1: Context

1.1 Personal background

1.2 My work and the software I use

1974-1993: Handwritten scores
1981 My first personal computer (Apple II) - and my first experience of programming
1993 Stockhausen starts using computers to publish his scores - Finale, FreeHand, FreeHand Xtras...
2001 Freelance copyist ...and Sibelius

1.3 Publishers’ problems


Part 2: A proposal for a Graphic User Interface

2.1 What the user sees and does

2.2 The proposal’s background and implications

Recording studios
Borderline of perception
Independence of meaning from graphic behaviour
Notation as an aide-mémoire
High level grammars
Chunking versus quantisation
Relativity and local context
Event oriented programming
The development of agents

Part 3: Footnotes after the event

Thanks
Institutions
The development of music


Part 1: Context

Before I get to what I really want to talk about, I first have to tell you something about who I am, so that you will be better able to understand what I’m saying.

1.1 Personal background

I’m a freelance copyist and developer. I’m also a composer – I studied under Harrison Birtwistle at the Royal Academy of Music in London from 1968-1972.
I left college wanting to tackle the serious theoretical problems in music notation which had been left unsolved when the Avant-Garde collapsed in 1970.
The traditional institutional structures had proven incapable of solving these problems: Firstly, composers were expected to solve them, but composers have too many other problems to solve at the same time. They need to think about the poetic aspects of what they are doing, and their survival depends on their mastering short-term practicalities. Secondly, the other institutions (academic, government sponsoring, publishers etc.) only support rather short-term projects.
So I looked about for (and was lucky enough to find) a long-term strategy which would enable me to remain close to the things which interested me.
After working for Universal Edition in London for a couple of years, I became Karlheinz Stockhausen’s principal copyist in 1974, and made an agreement with him which enabled me to get several months in each year free to devote to my own work.
Unfortunately, at the beginning of this year (2001) Stockhausen told me that he no longer wants to continue with that agreement.

1.2 My work and the software I use

OK, so I’m a practical copyist with many years of experience creating complicated 20th century scores. Here are the relevant parts of my curriculum vitae in a little more detail.

1974-1993: Handwritten scores

For the 19 years before Stockhausen began to use computer technology to publish his scores, I copied them at a drawing board with pen and ink.

1981: My first personal computer (Apple II) - and my first experience of programming

I bought this computer because I wanted to reduce the amount of paperwork associated with an Abstract Data Type I was developing. I’m interested in algorithmic composition.

1993: Stockhausen starts using computers to publish his scores - Finale, FreeHand and Freehand Xtras...

In 1993, the then current version of Finale was hopelessly inadequate for the problems confronting us, and I had to find some answers pretty quickly in order to keep my job. I was already programming in C, and was thus able to write a special filter to improve Finale’s postscript output enough for that to be edited economically in FreeHand.
FreeHand is a very powerful graphics editor, used routinely in the advertising industry. It’s a great program, with lots of money behind it - but it doesn’t know anything about music, so I began writing Xtras for it when that became possible in 1995.
I have never had to deny Stockhausen any of his most extreme demands as the result of inadequate software, so I am not convinced that publishers’ basic problems have anything to do with that side of things. I’ve brought a few example scores with me, if anyone wants to look at these later. Some example pages can be found at my web site.

2001: Freelance copyist - ... and Sibelius

I’m currently looking about for new ways to finance my research & development projects. The commercial music publishing world is at present dominated by two major programs: Finale and Sibelius, and I have decided to learn the latter. I might just as easily have chosen the latest version of Finale. Probably I was irrationally influenced by the fact that Sibelius is a British program :-). Both these programs meet the minimum graphic requirements for publishing conventional music, and they can both be used to input such music in a commercially viable time - but, especially in non-conventional music, neither of them provides the precision, power and flexibility achieved by combining FreeHand with my Xtras.

1.3 Publishers’ problems

Fundamentally, I think that useful new ideas (theories) are generated by people who deal very directly with real, practical problems. Practicalities take precedence over, and are the basis for, theory.
Stockhausen's unique advantage over other music publishers is that he has been able to combine composing, performing, editing and engraving in a single work environment. He is an eminently practical person.
Other publishers’ current problems arise because their work flow has become too complicated. They don’t employ copyists directly, and have failed to take advantage of new technologies to simplify things. The problems associated with complex, non-conventional scores multiply when the different departments have to work at arm's length.
So their philosophy is very conventional, and this is reflected not only in their organisational structure but also in what they publish. They are still trying to survive on the basis of a 19th century concept of what it is to be an artist. They have been surviving for many years simply by cutting costs, securing their royalties and trying to support the antiquated star system. They still try to sell their composers as stars - or heroes - in the Romantic tradition. Interestingly, the archetypical Romantic vision of a composer is Beethoven whistling in the woods. The notes somehow land on the paper by magic. Copyists don’t exist. There is an aristocratic disrespect for writing and getting one’s fingers dirty, inherent in this Romantic ideal. Copyists should be invisible, definitely cheap and ideally not exist at all. Maybe the real question is whether publishers can survive without heroes.
Actually, I think that the development of music is more important than the question of what happens to publishers. If we had a culture in which written music was obviously developing, then publishers would have an easier time too. So I’d like to show you a proposal of mine for a Graphic User Interface with which music can be developed.

See footnote 3.2


Part 2: A proposal for a Graphic User Interface

This presentation develops ideas which have been evolving in my previous papers.

2.1 What the user sees and does

I think that the development of music can be led by software, not only because it reduces our dependence on institutions but also because software can be understood and used by people who understand neither how it was written nor the design decisions which have been taken. (see footnote 3.3)
So it ought to be possible for me to explain to you how you, as users, would use this interface without explaining all the details of how I arrived at the proposal in the first place.
I’ve prepared some notes on the proposal’s background and implications, so after I’ve shown you the proposal itself, we can choose one or more of these subjects as the starting point for any discussion. Of course, if anyone wants to start the discussion from anywhere else, that would be fine by me.

In fact, I spent so much time talking about the following diagrams that there was no time for any real discussion.
There was a short exchange in which I pointed out the lack of absolute time in the upper level symbolic windows (they only contain names or labels) - which provoked Nicola Bernardini into telling an anecdote about Bruno Maderna conducting Stockhausen ("Seventy-two-point-five?"). I concluded the presentation by reading my notes on Event-oriented programming.


The highest level of structure in the following diagram is a reversible feedback loop (through User-Events-Objects-User-Objects-Events-User etc.). This feedback loop was, I think, responsible for the development of western music in the first place, and its current breakdown is responsible for the current lack of development in written music. Traditionally of course, scores have been written on two-dimensional paper, and "performance practice" has been developed and stored in the minds of composers and performers.

Figure 1

Figure 2: Event analysis ("chunking")


As far as a machine is concerned, this is a single, undifferentiated curve. People, however, instinctively break such curves into manageable chunks. Such chunks can be labeled just by putting a dot on each peak (the dot might be oval, like a notehead). Alternatively, the labels could be numbers or letters or duration symbols etc. giving more precise information about the event. Durations can be classified, irrespective of the existence of a tempo, using a logarithmically constructed trammel. Using the classic duration symbols means that legibility can be improved, and it becomes easy to develop closely related higher level notations.
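As a minimal sketch of this kind of chunking (the sample values and the threshold below are invented for illustration, and a real envelope would of course be noisier), a machine could at least find the candidate peaks to be labeled:

```python
# Hypothetical sketch: chunking a sampled curve into events
# by finding its peaks. Sample data is invented.

def find_peaks(curve, threshold=0.0):
    """Return the indices of local maxima above a threshold.

    Each peak is a candidate 'event' that could be labeled
    with a dot, a notehead, or a duration symbol.
    """
    peaks = []
    for i in range(1, len(curve) - 1):
        if curve[i] > curve[i - 1] and curve[i] >= curve[i + 1] and curve[i] > threshold:
            peaks.append(i)
    return peaks

envelope = [0.0, 0.2, 0.9, 0.3, 0.1, 0.7, 0.2, 0.0, 0.5, 0.1]
print(find_peaks(envelope))  # → [2, 5, 8]
```

The machine only supplies candidates; as argued below, the user should always be able to override such suggestions.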
If one were aiming to reconstruct, say, the notation of a Gregorian Chant (e.g. Chartres ca. 900) in the upper level symbolic windows, then it would probably be useful to classify what one thinks of as the relevant events in some other way.

I'm using space to demonstrate the algorithm here, but non-dimensional numbers (or bits in a MIDI stream) would also work.

1. Construct the trammel. Ideally, it is constructed for maximum differentiation in the transcription. (Logarithmically constructed trammels result in conventional legibility.) There are ways to define "maximum differentiation" so that optimal trammels can be calculated by the software, on the basis of an analysis of the space-time diagram. Users should, as always, be given the chance to override such suggestions.
Note that if there were a tempo defined by the durations in the diagram, it would be very easy to construct a trammel giving a conventional transcription. The algorithm does not however assume a tempo in the original diagram.

2. Label each impulse (event) with the symbol corresponding to its length in space (in this case the distance to the next event). Beaming (higher level chunking) has been used to further increase legibility in this example.
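Steps 1 and 2 can be sketched together in Python. The symbol names, base unit and onset positions below are invented for illustration; the point is only that a logarithmically constructed trammel (here with bins in 2:1 ratios, like the classic duration symbols) classifies each inter-onset distance without assuming a tempo:

```python
import math

# Hypothetical duration symbols, shortest to longest (names invented).
SYMBOLS = ["semiquaver", "quaver", "crotchet", "minim", "semibreve"]

def classify(distance, base=1.0):
    """Map an inter-onset distance (in arbitrary space units) to the
    nearest symbol on a logarithmically constructed trammel."""
    index = round(math.log2(distance / base))
    index = max(0, min(index, len(SYMBOLS) - 1))  # clamp to the trammel
    return SYMBOLS[index]

# Invented event onsets along the space axis of the diagram:
onsets = [0.0, 1.1, 3.0, 7.2, 7.9]
distances = [b - a for a, b in zip(onsets, onsets[1:])]
print([classify(d) for d in distances])
# → ['semiquaver', 'quaver', 'crotchet', 'semiquaver']
```

An optimising version would choose the base (or non-logarithmic bin edges) by analysing the space-time diagram for "maximum differentiation", with the user able to override the result.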


3. The dynamic envelope curve and other time related information (e.g. duration in seconds, pitch frequency, patch etc.) can be stored in a separate data object associated with (named by) each symbol, so the symbols themselves can subsequently be spaced for maximum legibility, independently of that information (as in standard notation). The data associated with each symbol is editable in the window which opens when the user issues an edit command for that symbol.
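A minimal sketch of this separation in Python (the class and field names are invented for illustration): the symbol is just a name, spaced however legibility demands, while its time-related data lives in a separate, editable object:

```python
# Hypothetical sketch: each symbol is a name; its machine-level,
# time-related information is stored in a separate data object.

class EventData:
    """Time-related information associated with one symbol."""
    def __init__(self, duration_secs, frequency_hz=None, patch=None):
        self.duration_secs = duration_secs
        self.frequency_hz = frequency_hz
        self.patch = patch

class Symbol:
    """A symbol is just a name; the details live in its data object."""
    def __init__(self, name, data):
        self.name = name
        self.data = data

    def edit(self, **changes):
        # Stands in for the edit window opened on the symbol.
        for field, value in changes.items():
            setattr(self.data, field, value)

crotchet = Symbol("crotchet", EventData(duration_secs=0.5, frequency_hz=440.0))
crotchet.edit(duration_secs=0.48)  # tweak the performance data...
print(crotchet.name)               # ...the symbol itself is unchanged
```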

4. The above argumentation can also be applied to pitch and dynamic symbols. In the above diagram, all the peaks are at the same height. If this were not the case, the heights could be labeled, for example with the conventional symbols for dynamics. If each event has a stable, internal frequency, that frequency can be labeled using the usual pitch symbols (using clefs, accidentals etc.).


2.2 The proposal’s background and implications

These topics are more or less in order of the length of my notes. At the top of the list, I only have a sentence or two, at the bottom there’s more. In principle, we could start talking about any of these subjects or about subjects which are not on the list. Would anyone like to start anywhere in particular?
Recording studios
Borderline of perception
Independence of meaning from graphic behaviour
Notation as an aide-mémoire
High level grammars
Chunking versus quantisation
Relativity and local context
Event oriented programming
The development of agents

Recording studios

A Graphic User Interface like this would be useful for orientation in recording studios. Symbols are easier to read than space-time diagrams.

Borderline of perception

We perceive pitch, not frequency (i.e. A-natural, not 440Hz.). We can’t count the 440Hz. References to absolute time occur only in windows A1, A2 etc. where we can adjust machine-related parameters. There are no such references in the symbolic level windows, because these contain symbols which are simply the names of "subroutines". No symbol has a fixed, absolute meaning. The meanings may be arbitrarily complex, and are defined in the library modules.
A similar argument forbids the use of absolute metronome marks in the symbolic level windows.
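The point that symbols are simply the names of "subroutines", with their meanings defined in library modules, might be sketched like this; the symbol names and their meanings below are invented examples:

```python
# Hypothetical sketch: a library module mapping symbol names to
# their meanings. No symbol carries a fixed, absolute meaning;
# the library defines (and can redefine) what each name does.

library = {
    "accent":   lambda amplitude: amplitude * 1.5,
    "staccato": lambda duration: duration * 0.5,
}

def interpret(symbol, value):
    """Look up a symbol's meaning in the library and apply it."""
    return library[symbol](value)

print(interpret("staccato", 1.0))  # 0.5
# Redefining the library changes the symbol's meaning, not the score:
library["staccato"] = lambda duration: duration * 0.3
print(interpret("staccato", 1.0))  # 0.3
```

Editing the library is thus a way of editing performance practice without touching the symbolic score at all.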

Independence of meaning from graphic behaviour

The symbols are names whose meaning depends on the meanings and relative positions of other symbols in the vicinity. Their absolute position is independent of those meanings, and can be used for the practicalities of concentrating information within particular spaces (windows or pages) and for improving legibility.

Notation as an aide-mémoire

Music notation has from the very beginning been a reminder of what already exists (think of the history of Gregorian Chant). If no libraries exist defining symbols and their meanings, then those libraries have to be created by referring to existing events. It’s only possible to work clockwise round Figure 1: the development spiral, driven by user space-time feedback, can only begin once the libraries exist.
Again, the 19th and 20th century assumption that music symbols could be given fixed, absolute meanings, meant that performance practice was not properly integrated into the notation theory. There was an inadequate idea of what expressivity is supposed to be.

High level grammars

High level grammars only develop in a written culture. A Bruckner Symphony cannot be improvised. So it’s important that the conventions and limitations of writing are well understood. Otherwise, as in the last hundred years, the development of written music will run into trouble. Maybe high level grammars are what music is.

Chunking versus quantisation

There’s no absolute time, or absolute meaning in the symbolic levels.
Absolute time is irrelevant to our perception of music - so it should play no part in any analysis of what the symbols mean to humans. The concept of absolute time is only relevant when dealing with physical machines, and that happens at a clearly defined, separate level.
The symbols of human languages are unlike the symbols of mathematics. Their meanings are not primitives, but complex context-related functions.

Relativity and local context

The idea of Relativity is closely related to the idea of local context – in physics, this is because there is a maximum speed for the transmission of information.
I think music’s time notation paradigm collapsed at the start of the 20th century at the same time, and for very similar reasons, as the time paradigm in physics did.
19th century notation theory, like 19th century physics, assumes the existence of absolute time, so it gets confused about the notion of “subdivision”. The 19th century symbols for the subdivision of time (tuplets) actually mean tempo relationships, so they are meaningless in tempoless music (one can only compare tempi if they exist) - this was a real conceptual mistake in the notation theory of the 1950s and 60s. Using tuplets to write music which avoids perceptible tempi is simply non-sense.
Tempo and the ether: tempo was ubiquitous in 19th century music, and it plays a role there as a frame of reference not unlike the role played by the ether in physics. Many kinds of music have no tempo, but 19th and 20th century notation theory persistently assumes that an absolute tempo must exist - even if that tempo cannot be perceived!
Metronomes. In music, the only relevant durations are those in the local context and those which can be retrieved in local time from long term memory. And in music, short-term memory takes precedence. For example, even when performing music which is supposed to be at tempo metronome=100, the local tempo which is actually happening takes precedence over the absolute value stored (approximately) in long term memory. All human space-time experience is local. We only need absolute values when dealing with machines.

Event oriented programming

The symbols of music notation are the names of concepts which have been learned by composers and performers (time is not just equivalent to a dimension of space), so music notation can be thought of as an authoring or programming language whose classic interpreters are people.
An event is the equivalent in time of an object in space, and the nested structure of the Symbolic Level Windows very much resembles the way in which current computer programs are structured, so it ought to be possible to develop this project easily using existing object-oriented programming languages.
However, there are implications for the development of programming languages here too.
The lowest level music symbols are as small as possible (single characters or simple lines). These symbols are combined two dimensionally to create larger, more complex symbols and a maximum density of legible information in the two dimensions of the page. This is important when having to read in real time, extract meanings at the highest possible level, and turn as few pages as possible while doing so.
Current computer programming languages are based on alphanumeric text, and the symbols (the names of objects or functions) are generally word-sized. The words are groups of characters chunked in a single dimension. The chunking in this case enables an unlimited number of atomic symbols to be created from a finite character set. Interestingly, such text is usually formatted in two dimensional space so as to increase (human) legibility. Contrast this situation with that of ordinary text, where a single string of words and punctuation is simply folded onto the page.
It may be possible to create specialised computer programming languages, for use outside music, in which an increased density of information is achieved because they use character-symbols arranged two-dimensionally instead of as one-dimensional word sequences. The compiler (parser, interpreter, performer) would have to be more complicated, but the script could be smaller (faster to transmit). Note that atomic symbols arranged in three or more dimensions have a still higher density of information... (I’m thinking of the structure of proteins).
I think that local context is an important concept in both space and time, and that its active application to the use of symbols, in all areas of human cognition (philosophy, literature, painting, architecture, music, computing, mathematics, the sciences etc.), would have far reaching consequences.

The Development of Agents

This is rather a big subject, and I have a separate paper here which covers it in a little more detail. Briefly, this possibility arises because in this User Interface, the score consists of programmable symbols with clearly definable meanings. The symbols are the names of chunked concepts at lower levels, and the meanings of those names are stored in libraries. So the libraries as a whole can be thought of as the interpreter of the multi-level score. This would be very difficult to achieve using standard notation, because the symbol levels are all collapsed onto the one level of a sheet of paper. And some symbols have to mean different things in different levels. In 20th century standard notation, they are even connected conceptually to absolute time - the non-symbolic machine levels.
Within the multi-level notation paradigm, it is possible to analyse the differences between lots of different performances of the same score - also by different performers. If this software existed, it would become possible to store the multi-level score of a Beethoven Sonata on one CD, and the Agent for performing it on another. The Agent might, for example, be the result of a contextual analysis of different performances of the piece by Alfred Brendel. The “Alfred Brendel” agent could then be used to interpret a newly discovered Beethoven Sonata which the real Alfred Brendel had never seen - and the performances would vary, just as they do with real performers.
A less ambitious but more important goal than this rather futuristic scenario could be the use of trainable Agents by composers trying to develop a performance practice for the symbols they are using. Composers know best what they mean by the symbols they use, and are thus the best people to program the meanings. If composers could present performers with demonstration recordings of how the notation should be understood, that would shorten rehearsal time considerably. The use of libraries in this situation means that performance practice could begin to develop again together with the composers’ grammar.
Notice that current recording technology uses a flat, one-dimensional data stream to store performances. This proposal has interesting implications for the recording industry - and for publishers, because they could sell their heroes directly on CD. I can already see them fighting about who owns “Stravinsky”, “Glenn Gould” and “Karajan” ... :-) ...


Part 3: Footnotes after the event

3.0 Revision February 2002

This revision contains some improvements to the introduction, in particular to my current analysis of music publishers' problems. I was unhappy with the original version, and have now worked out more precisely what I think - not least as a result of the encounters at this workshop.

3.1 Thanks

First I would like to thank Nicola Bernardini and the Centro Tempo Reale for organising this workshop and inviting me to speak at it. However good communication technologies get, there will never be a substitute for talking to real people face to face. I would also like to thank the other participants - in particular Han-Wen Nienhuys, Jan Nieuwenhuizen, Heinz Stolba, and Paul Roberts - for some very lively and productive discussions.

3.2 Institutions

At this workshop I learned that I have to differentiate more carefully between institutions. I continue to be highly sceptical about large institutions, because wherever large amounts of money are organised and distributed, bureaucratic hurdles have to be set up to keep things under control. Everything gets too distracting for individuals just trying to get on with their basic work. The best strategy for individuals to adopt here is simply to ignore big institutions as far as possible - easier said than done of course.
Small, high quality institutions are something else. The rapidly changing relation of individuals to institutions is something which needs thinking about very carefully - especially taking the web into account. It seems to me that publishers may have a role in future to concentrate quality in the web, and this may also be true of research institutions like the Centro Tempo Reale. Currently, the web is pretty much a flat chaos... Maybe institutions, like symbols, should be organised in hierarchies... In the European Union, I believe they call this the “subsidiarity principle”.

3.3 The development of music

The most interesting talks I had over the weekend were in very small groups. A short question and answer session after a quick presentation isn’t really ideal - I at least need a bit of time to let things sink in. Unfortunately I was not able to talk to Nicola Bernardini about his presentation because he was simply too busy, but we had an interesting email exchange after I returned to Germany. Here is the relevant part (slightly edited):

30th November 2001

James:
You were worried (as I understand it) that composers are tending to limit their thinking to the bounds set by the software they are using, so they are all beginning to sound the same.

Nicola:
Well, let's put it this way: I hate the fact that composers have, at all times in history, been perfectly on top of the technologies they were in touch with - and now they obstinately refuse to deal with anything else than a mouse. If you push on such restrictions, computers may provide an instrumentation that would allow much more powerful compositional thinking than is ever conceived today.

James
It may be possible to enable such “powerful compositional thinking” more quickly than you think. One needs the conceptual framework of course (this is what I'm working on), but once one has that it’s just a question of doing the programming. Programming is not so hard once one knows how the software ought to be organised.
Oddly, your idea dovetails with the idea I have of providing composers with libraries defining symbols and their performance practice: Does this mean that we might get back to a situation where there would be common performance practices evolving slowly? The end of babel? Maybe it would be a *good* thing if composers began to think more about evolving high-level grammars than about low-level sensations??

Nicola:
definitely. It would be a nice thing if musicians in general would start to understand that any technology (whether that is a programming language, a musical instrument or whatever else) is not transparent to their thinking, so they should be wary of using things that *push* them to do things in a certain way.

James:
Constraints are the prerequisite for progress. Pre 20th century composers would have been aware that they were working within evolving, highly artificial languages, but that enabled them to speak. We need software (which composers can use if they want to) which re-enables this development spiral.