Structures: Performed, Perceived and Constructed

Prof. Elaine Chew, Professor of Digital Media at Queen Mary University of London’s School of Electronic Engineering and Computer Science

Conventional understanding of music structure in musicology as well as music information research limits its definition to musical form and sectional structures. But structure is more than sonata form or ABA structure. Structure refers to all manner of musical coherence as generated by surface features and deeper ones, musical entities and boundaries, movements and arrivals. Thus, long-term modulation of intensity or tension creates a form of structure (coherence), as does local weighting of notes to indicate an upbeat or downbeat, or subtle changes of color (for example, in vibrato and timbre) amidst a sustained note. The human mind is wired to perceive, use, and crave structure. Grappling with ways to construct coherence is central to the work of music making, and imagining new and convincing ways to formulate musical coherence lies at the heart of musical innovation. By considering music structures as emerging from musical sense making, we open up new ways to explore and understand the manifold forms of music structure.

With this broad definition of music structure in mind, I shall survey some of our recent work* in the scientific and computational modeling and analysis of music structure as performed, music structure as perceived, and music structure as applied to composition. Structures as perceived or communicated through prosody serve to shape the meaning of the musical text; when perceived or communicated structures serve as the given information rather than the end goal of an algorithm, this dual (reverse) approach yields interesting insights into musical sense making; and when perceived or communicated structures further serve as sources for crafting new compositions, they provide important seed material for generating coherence. In addition, the musical mind imputes structure to music information. Harking back to the medieval concept of music internal to the human body (musica humana), the presentation will conclude with applications of music structure extracted from arrhythmic heartbeats.

Prof. Elaine Chew is Professor of Digital Media at Queen Mary University of London’s School of Electronic Engineering and Computer Science, where she is affiliated with the Centre for Digital Music. Her research centers on mathematical modeling of musical prosody, structure, cognition, and interaction. She was previously Associate Professor at the University of Southern California’s Viterbi School of Engineering and Thornton School of Music, where she founded the Music Computation and Cognition research laboratory. Her work has received recognition through the NSF CAREER/PECASE awards and fellowships at the Radcliffe Institute for Advanced Study at Harvard. She earned Ph.D. and S.M. degrees in Operations Research from MIT, and a B.A.S. in Mathematical and Computational Sciences (honors) and in Music Performance (distinction) from Stanford. She holds Fellowship and Licentiate diplomas in piano performance from Trinity College London. As a pianist, she has performed internationally as soloist and chamber musician, and she frequently collaborates with composers to commission, create, present, and record new music. Her work has been featured in the Los Angeles Philharmonic’s Inside the Music series and in the Beautiful Science exhibit at the Huntington Library in California. She has served as a member of the MIT Music and Theater Arts Visiting Committee and the Georgia Institute of Technology School of Music External Review Committee. She is on the advisory/editorial boards of the Computer Music Journal, the Journal of Mathematics and Music, Music Theory Spectrum, and ACM Computers in Entertainment. This year, she is also a jury member of the Guthman Musical Instrument Competition and the Falling Walls Lab.

* The presentation includes joint work with Dorien Herremans, Isaac Schankler, Jordan Smith, Luwei Yang, Ashwin Krishna, Daniel Soberanes, and Matthew Ybarra.


From Stanford to Smule: Reflections on the Nine-Year Journey of a Music Start-up

Dr. Jeffrey C. Smith, Co-founder, Chairman, and CEO, Smule, Inc.

Smule, a music start-up based in San Francisco, began as a conversation between Jeffrey Smith, a PhD student at Stanford’s CCRMA, and Ge Wang, a newly hired professor. Nine years later, Smule has emerged as the leading platform for music discovery and collaboration. Smule’s global community of 50 million people creates 7 billion recordings each year and uploads over 36 terabytes of music to the Smule network each day. The Smule catalog of 2 million songs, which doubled in size in the past six months, represents one of the largest corpora of structured musical content on the Internet. This catalog includes musical backing tracks in MIDI and MP4 formats, lyrics, pitch data, timing, and musical structure. Smule generated $101M in sales in ‘16 and has 1.8 million paying subscribers to its service.

Tracing the nine-year arc of Smule from a student concept to a market leader, what can we learn about the potential symbiotic relationship between academic research and commercial innovation? What was the genesis of Smule? What was the role of students, research, Stanford, venture capital, and, more broadly, Silicon Valley? What were the formative challenges the company confronted as it scaled from tens of thousands of users “blowing” air through their iPhone microphones with Smule’s Ocarina (a flute-like instrument designed for the original iPhone) in ‘08 to today, when an active community sings and plays 20M songs each day on their mobile phones, often together in collaboration? How is a musical social graph different from a social graph built around other forms of media, such as photos, video, or text? Finally, what role did MIR play in the development of the Smule business model and technology stack?

Dr. Jeffrey C. Smith is the co-founder, Chairman, and CEO of Smule. Jeff has a BS in Computer Science from Stanford University and a PhD in computer-based music theory and acoustics (“Correlation Analyses of Encoded Music Performance”) from Stanford’s CCRMA. Jeff has taught introductory computer science courses at Stanford in addition to serving as a teaching assistant in music theory and computer science. More recently, Jeff periodically teaches Music 264 at Stanford, a seminar with lab that analyzes large-scale industry data sources (including Stanford DAMP) to develop insights into musical engagement. Early in his career, Jeff worked as a software engineer at Hewlett-Packard’s language lab and IBM’s Scientific Research Center in Palo Alto. For the past twenty-five years, Jeff has served as a leader in businesses he co-founded, including Envoy (acquired by Novell in ’95), Tumbleweed (NASDAQ listing in ’99), Simplify Media (acquired by Google in ’10), and, for the past nine years, Smule. Jeff is the co-author of twenty-seven patents in the fields of computer music and email security. Jeff enjoys writing and playing classical piano music – he is currently immersed in Brahms Op. 9 & Op. 10.


Does MIR Stop at Retrieval?

Prof. Roger B. Dannenberg, Professor of Computer Science, Art & Music, Carnegie Mellon University, USA

The Music Information Retrieval community that formed around this conference has moved swiftly from a narrow set of concerns and problems to a much wider exploration that I have long characterized as “Music Understanding.” Music Understanding explores methods to find pattern and structure in music, ranging from low-level features, such as pitch and onsets, to high-level properties, such as keys, transcriptions, and yes, even genre. I believe that “Music Understanding” and the MIR community must broaden their scope even further, turning to music composition, improvisation, and production, for at least three reasons. First, attempts to automate music generation have had very limited success, so music generation is a good measure of the gap between music formalisms and our human understanding of music. Second, music generation research might lead to a better understanding of creativity, learning, and the brain. Finally, music generation has practical applications, and I will discuss one: music generation for music therapy. I will also summarize the history of computer-assisted composition, describe the state of the art, and provide a critique intended to spur new research.

Prof. Roger B. Dannenberg is currently a Professor of Computer Science at Carnegie Mellon University with courtesy appointments in the Schools of Art and Music. Dr. Dannenberg studied at Rice University and Case Western Reserve University before receiving a Ph.D. in Computer Science from Carnegie Mellon University. He has also worked for Steve Jobs at NeXT as a member of the music group, with MakeMusic on the commercialization of his computer accompaniment research, and with Music Prodigy, an award-winning MIR-based music education start-up.

Dr. Dannenberg is an international leader in computer music and is well known especially for programming language design and real-time interactive systems including computer accompaniment. He and his students have introduced a number of innovations to the field: functional programming for sound synthesis and real-time interactive systems, spectral interpolation synthesis, score-following using dynamic programming, probabilistic formulations of score alignment, machine learning for style classification, score alignment using chromagrams, and bootstrap learning for onset detection. Dannenberg is also the co-creator of Audacity, an open-source audio editor used by millions.

Dr. Dannenberg is an active trumpet player and composer. His trumpet playing includes orchestral, jazz, and experimental music with electronics, and his opera, La Mare dels Peixos, co-composed with Jorge Sastre, premiered in Valencia, Spain in December 2016.