Research
In the past I have worked on music content analysis using signal processing methods. Currently, my work is related to more general audio signal processing.
The research focus of my post-graduate studies was the automatic transcription of drum tracks from ordinary polyphonic music and the analysis of the sectional form of musical pieces. On a more general level, I am interested in audio content analysis with signal processing methods (computer audition), including modelling of the auditory process, feature extraction, supervised and unsupervised classification, higher-level modelling, and more complex analysis algorithm design. Currently I am interested in applying the methods I have learned so far to new problems, so if you have an interesting problem, maybe we could collaborate to solve it.
-
Low-level signal analysis. Naturally, the transcription starts by analysing the acoustic signal and creating some sort of mid-level representation of it. Both supervised and unsupervised methods can be utilised, depending on the input signal and the desired result.
-
Musicological modelling. Speech recognition has gained considerably from the use of language models. Still, the majority of automatic music transcription systems try to cope only with low-level recognition. It would seem reasonable that modelling even simple musicological "rules" would benefit the whole transcription task. Analysis of the musical metrical structure is helpful here.
-
Music structure analysis. Popular music pieces in particular have a distinct structure defined by repetitions of different parts (e.g., verse and chorus). Being able to infer the structure from the audio enables several applications, such as easier navigation within the piece, music thumbnailing, and mash-ups.
-
Music and audio applications. Music information retrieval (including aspects other than transcription), algorithm and system implementation (something running only in Matlab has very little value for normal people), systems design (combining existing analysis components, sometimes 1+1>2).
Publications
Here is a list of my academic publications. PDF versions of some of the publications are provided here for private use only. See each publication for more detailed copyright information. All presentation slides and posters are also provided for private use only. And here is the mandatory disclaimer for IEEE copyrighted material:
- This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
For the citation information, you can check, e.g., my Google Scholar page.
Journal articles
-
Paulus, J., Klapuri, A., "Drum sound detection in polyphonic music with hidden Markov models", EURASIP Journal on Audio, Speech, and Music Processing. Volume 2009 (2009), Article ID 497292, 9 pages. (pdf). DOI:10.1155/2009/497292
-
Paulus, J., Klapuri, A., "Music Structure Analysis Using a Probabilistic Fitness Measure and a Greedy Search Algorithm", IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 6, August 2009, pp. 1159-1170. (pdf@IEEE), (full info), (pdf). DOI:10.1109/TASL.2009.2020533
-
Note: equation (25) should have |F_A| in the denominator, so that the equation is:
- Copyright 2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Book chapters
-
FitzGerald, D., Paulus, J. "Unpitched Percussion Transcription", in "Signal Processing Methods for Music Transcription", eds. Klapuri, A., Davy, M., Springer-Verlag, pp. 131-162, 2006. (pdf). DOI:10.1007/0-387-32845-9_5
-
Paulus, J., Klapuri, A. "Labelling the structural parts of a music piece with Markov models", in Computer Music Modeling and Retrieval: Genesis of Meaning in Sound and Music - 5th International Symposium, CMMR 2008 Copenhagen, Denmark, May 19-23, 2008, Revised Papers, Lecture Notes in Computer Science 5493, eds. Ystad, S., Kronland-Martinet, R., and Jensen, K., Springer Berlin / Heidelberg, 2009, pp. 166-176. (pdf @ Springer) (pdf). DOI:10.1007/978-3-642-02518-1_11
Theses
-
Paulus, J., "Signal Processing Methods for Drum Transcription and Music Structure Analysis". D.Sc. thesis, Tampere University of Technology, Tampere, Finland, December 2009. (pdf) (pdf @ TUT library). Lectio praecursoria slides and text (in Finnish).
-
Paulus, J., "Sähkömittarin kulutustietojen välittäminen Bluetooth-tekniikan avulla (Relaying Electricity Consumption Information Using Bluetooth-Technology)", M.Sc. thesis, Tampere University of Technology, Tampere, Finland, November 2001. In Finnish.
Conference (convention and workshop) publications
-
Jokinen, E., Paulus, J., Bäckström, T., "Intelligibility enhancement of foreground speech in stereo material for audio-visual applications", in Interspeech (Interspeech2016), San Francisco, CA, USA, September 8 - 12, 2016, to appear.
-
Dittmar, C., Driedger, J., Müller, M., Paulus, J., "An Experimental Approach to Generalized Wiener Filtering in Music Source Separation", in Proc. of European Signal Processing Conference (EUSIPCO2016), Budapest, Hungary, August 29 - September 2, 2016, to appear.
-
Paulus, J., Uhle, C., Herre, J., Höpfel, M., "A study on the preferred level of late reverberation in speech and music", in Proc. of the AES 60th Conference on Dereverberation and Reverberation of Audio, Music, and Speech (DREAMS), Leuven, Belgium, February 3-5, 2016, (pdf), (presentation).
-
Murtaza, A., Herre, J., Paulus, J., Terentiv, L., Fuchs, H., Disch, S., "ISO/MPEG-H 3D Audio: SAOC 3D decoding and rendering", in Proc. of the 139th Audio Engineering Society Convention (AES 139th), New York, USA, October 29 - November 4, 2015, (pdf).
-
Paulus, J., "Perceptual loudness compensation in interactive object-based audio coding systems", in Proc. of European Signal Processing Conference (EUSIPCO2015), Nice, France, August 31 - September 4, 2015, pp. 579-583, (pdf), (pdf), (poster).
- © 2015 European Association for Signal Processing. First published in the Proceedings of the 23rd European Signal Processing Conference (EUSIPCO-2015) in 2015, published by EURASIP.
-
Paulus, J., Herre, J., Murtaza, A., Terentiv, L., Fuchs, H., Disch, S., Ridderbusch, F., "MPEG-D Spatial Audio Object Coding for Dialogue Enhancement (SAOC-DE)", in Proc. of the 138th Audio Engineering Society Convention (AES 138th), Warsaw, Poland, May 7-10, 2015, (pdf), (errata), (presentation).
-
Uhle, C., Paulus, J., Herre, J., "Predicting the perceived level of late reverberation using computational models of loudness", in Proc. of the 17th International Conference on Digital Signal Processing (DSP2011), Corfu, Greece, July 6-8, 2011, (pdf), DOI:10.1109/ICDSP.2011.6004990.
-
Paulus, J., Uhle, C., Herre, J., "Perceived level of late reverberation in speech and music", in Proc. of the 130th Audio Engineering Society Convention (AES 130th), London, UK, May 13-16, 2011, (pdf), (pdf), (poster).
-
Paulus, J., Müller, M., Klapuri, A., "Audio-based music structure analysis", in Proc. of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010), Utrecht, The Netherlands, August 8-13, 2010, pp. 625-636, (pdf), (pdf), (presentation).
-
Paulus, J., "Improving Markov model-based music piece structure labelling with acoustic information", in Proc. of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010), Utrecht, The Netherlands, August 8-13, 2010, pp. 303-308, (pdf), (pdf), (poster).
-
Alves, D., Paulus, J., Fonseca, J., "Drum transcription from multichannel recordings with non-negative matrix factorization", in Proc. of the 17th European Signal Processing Conference (EUSIPCO2009), Glasgow, Scotland, UK, August 24-28, 2009, pp. 894-898, (pdf), (pdf), (poster).
- © 2009 European Association for Signal Processing. First published in the Proceedings of the 17th European Signal Processing Conference (EUSIPCO-2009) in 2009, published by EURASIP.
-
Paulus, J., Klapuri, A., "Music Structure Analysis Using a Probabilistic Fitness Measure And an Integrated Musicological Model", in Proc. of the 9th International Conference on Music Information Retrieval (ISMIR 2008), Philadelphia, Pennsylvania, USA, September 14-18, 2008, pp. 369-374, (pdf), (pdf), (poster).
-
Paulus, J., Klapuri, A., "Acoustic Features for Music Piece Structure Analysis", in Proc. of the 11th International Conference on Digital Audio Effects (DAFx-08), Espoo, Finland, September 1-4, 2008, pp. 309-312, (pdf), (pdf), (poster).
-
Ryynänen, M., Virtanen, T., Paulus, J., Klapuri, A., "Accompaniment separation and karaoke application based on automatic melody transcription", in Proc. of the IEEE International Conference on Multimedia & Expo (ICME), Hannover, Germany, June 23-26, 2008, pp. 1417-1420, (pdf). DOI:10.1109/ICME.2008.4607710
- Copyright 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
-
Paulus, J., Klapuri, A., "Labelling the Structural Parts of a Music Piece with Markov Models", in Proc. of the 2008 Computers in Music Modeling and Retrieval Conference (CMMR 2008), Copenhagen, Denmark, May 19-23, 2008, pp. 137-147, (pdf), (presentation).
-
Paulus, J., Klapuri, A., "Combining temporal and spectral features in HMM-based drum transcription", in Proc. of the 8th International Conference on Music Information Retrieval (ISMIR 2007), Vienna, Austria, September 23-27, 2007, pp. 225-228, (pdf), (pdf), (presentation).
-
Paulus, J., Klapuri, A., "Music Structure Analysis by Finding Repeated Parts", in Proc. of the 1st Audio and Music Computing for Multimedia Workshop (AMCMM2006), Santa Barbara, California, USA, October 27, 2006, pp. 59-68, (pdf), (ACM portal), (presentation). © Copyright 2006 by ACM, Inc. DOI:10.1145/1178723.1178733
-
Paulus, J., "Acoustic Modelling of Drum Sounds With Hidden Markov Models for Music Transcription", in Proc. of 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP2006), Toulouse, France, May 14-19, 2006, (IEEEXplore), (pdf), (poster). DOI:10.1109/ICASSP.2006.1661257
- Copyright 2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
-
Paulus, J., Virtanen, T., "Drum Transcription with Non-negative Spectrogram Factorisation", in Proc. of 13th European Signal Processing Conference (EUSIPCO2005) Antalya, Turkey, September 4-8, 2005, (pdf), (poster)
- © 2005 European Association for Signal Processing. First published in the Proceedings of the 13th European Signal Processing Conference (EUSIPCO-2005) in 2005, published by EURASIP.
-
Paulus, J., Klapuri, A., "Model-based Event Labeling in the Transcription of Percussive Audio Signals", in Proc. of 6th International Conference on Digital Audio Effects (DAFx-03), London, UK, September 8-11, 2003, pp. 73-77, (pdf), (pdf), (presentation).
-
Paulus, J., Klapuri, A., "Conventional and Periodic N-grams in the Transcription of Drum Sequences", in Proc. of IEEE International Conference on Multimedia and Expo (ICME03), Baltimore, USA, July 6-9 2003, pp. 737-740. (IEEEXplore), (pdf), (poster), (errata). DOI:10.1109/ICME.2003.1221722
- Copyright 2003 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE.
-
Paulus, J., Klapuri, A., "Measuring the Similarity of Rhythmic Patterns", in Proc. of Third International Conference on Music Information Retrieval (ISMIR2002), Paris, France, October 13-17, 2002, pp. 150-156, (pdf), (pdf@ismir.net), (presentation).
Other
-
Paulus, J., Disch, S., Fuchs, H., Grill, B., Hellmuth, O., Murtaza, A., Ridderbusch, F., Terentiv, L. "Decoder, encoder and method for informed loudness estimation in object-based audio coding systems", patent application PCT/EP2014/075787, WIPO publication WO/2015/078956, pub. date 4.6.2015.
-
Paulus, J., Disch, S., Fuchs, H., Grill, B., Hellmuth, O., Murtaza, A., Ridderbusch, F., Terentiv, L. "Decoder, encoder and method for informed loudness estimation employing by-pass audio object signals in object-based audio coding systems", patent application PCT/EP2014/075801, WIPO publication WO/2015/078964, pub. date 4.6.2015.
-
Paulus, J., Fuchs, H., Hellmuth, O., Murtaza, A., Ridderbusch, F., Terentiv, L. "Apparatus and method for decoding an encoded audio signal to obtain modified output signals", patent application PCT/EP2014/065533, WIPO publication WO/2015/011054, pub. date 29.1.2015.
-
Disch, S., Fuchs, H., Hellmuth, O., Herre, J., Murtaza, A., Ridderbusch, F., Paulus, J., Terentiv, L. "Apparatus and method for realizing a SAOC downmix of 3D audio content", patent application PCT/EP2014/065290, WIPO publication WO/2015/010999, pub. date 29.1.2015.
-
Disch, S., Fuchs, H., Hellmuth, O., Herre, J., Murtaza, A., Ridderbusch, F., Paulus, J., Terentiv, L. "Apparatus and method for enhanced spatial audio object coding", patent application PCT/EP2014/065427, WIPO publication WO/2015/011024, pub. date 29.1.2015.
-
Disch, S., Fuchs, H., Hellmuth, O., Herre, J., Murtaza, A., Paulus, J., Terentiv, L., Ridderbusch, F. "Multi-channel decorrelator, multi-channel audio decoder, multi-channel audio encoder, methods and computer program using a premix of decorrelator input signals", patent application PCT/EP2014/065395, WIPO publication WO/2015/011014, pub. date 29.1.2015.
-
Disch, S., Fuchs, H., Hellmuth, O., Herre, J., Murtaza, A., Paulus, J., Terentiv, L., Ridderbusch, F. "Multi-channel audio decoder, multi-channel audio encoder, methods, computer program and encoded audio representation using a decorrelation of rendered audio signals", patent application PCT/EP2014/065397, WIPO publication WO/2015/01101, pub. date 29.1.2015.
-
Disch, S., Paulus, J., Kastner, T. "Audio object separation from mixture signal using object-specific time/frequency resolutions", patent application PCT/EP2014/059570, WIPO publication WO/2014/184115, pub. date 20.11.2014.
-
Disch, S., Fuchs, H., Paulus, J., Terentiv, L., Hellmuth, O., Herre, J., "Encoder, decoder and methods for backward compatible multi-resolution spatial-audio-object-coding", patent application PCT/EP2013/070533, WIPO publication WO/2014/053537, pub. date 10.4.2014.
-
Disch, S., Paulus, J., Edler, B., Hellmuth, O., Herre, J., Kastner, T., "Encoder, decoder and methods for signal-dependent zoom-transform in spatial audio object coding", patent application PCT/EP2013/070550, WIPO publication WO/2014/053547, pub. date 10.4.2014.
-
Disch, S., Paulus, J., Edler, B., Hellmuth, O., Herre, J., Kastner, T., "Encoder, decoder and methods for backward compatible dynamic adaptation of time/frequency resolution in spatial-audio-object-coding", patent application PCT/EP2013/070551, WIPO publication WO/2014/053548, pub. date 10.4.2014.
-
Kastner, T., Herre, J., Paulus, J., Terentiv, L., Hellmuth, O., Fuchs, H. "Encoder, decoder, system and method employing a residual concept for parametric audio object coding", patent application PCT/EP2013/057932, WIPO publication WO/2014/023443, pub. date 13.2.2014.
-
Kastner, T., Herre, J., Terentiv, L., Hellmuth, O., Paulus, J., Ridderbusch, F., "Apparatus and methods for adapting audio information in spatial audio object coding", patent application PCT/EP2013/063703, WIPO publication WO/2014/023477, pub. date 13.2.2014.
-
Uhle, C., Paulus, J., Herre, J., Prokein, P., Hellmuth, O., "Apparatus and method for determining a measure for a perceived level of reverberation, audio processor and method for processing signal", patent application PCT/EP2012/053193, WIPO publication WO/2012/116934, pub. date 7.9.2012.
-
Paulus, J., Klapuri, A., "Music structure analysis with a probabilistic fitness function in MIREX2009", extended abstract describing the method submitted for the Structural segmentation task at the Fifth Music Information Retrieval Evaluation eXchange (MIREX2009), Kobe, Japan, October 26-30, 2009. (pdf), (poster).
-
Ryynänen, M., Virtanen, T., Paulus, J., Klapuri, A., "Method for processing melody", Finnish patent application no. 20075737, filed October 2007.
-
Paulus, J., "Drum Transcription from Polyphonic Music with Instrument-wise Hidden Markov Models", extended abstract describing the algorithm submitted to the contest, in online Proc. of 1st Annual Music Information Retrieval Evaluation eXchange (MIREX'05), London, UK, September 11-15, 2005 (pdf)
BibTeX of my publications.
Something else
Hacks
These are some notes I've made mainly for myself when tinkering with small pieces of software and/or hardware for my own amusement. Keep this in mind when reading the provided information. (Also, should you try to follow the instructions or utilise any of the provided info in any way, you are doing so at your own risk.)
-
A few collected tips on LaTeX use.
-
Matlab mex playsnd() using PortAudio and allowing user interrupts.
-
Interpretation of ASID protocol used in SidStation. This can be used to play the tunes on the open MIDIbox SID platform.
-
How to compile VST SDK plug-ins with DevC++
-
Internals of an FRWD F-series GPS (If you have any idea how to improve the reception of the module, contact me.)
-
Python script polling the presence of known Bluetooth devices. The poll uses SDP, which also reveals the presence of devices in hidden mode. Simpler version.
-
Quite simple Python script to fetch disc info from CDDB and insert the info into a MySQL db (emulating the db insert feature of cddb.pl)
The full updated version of the CDDB.py package for Win32, including a Pythonized MCI wrapper and a slightly modified method for disc id calculation (in my experiments, this gives the correct id more often).
-
How to use a USB PCSC smart card reader with NSLU2.
-
How to install Debian Etch to an ARM system running in qemu.
-
suomi.locale for DBox2 Yadi 2.2.0.5
-
Story of Iiro, a sculpture reacting to audio input with visual responses. A gift to a colleague.
-
Finnish Multilingual keyboard layout for Windows OS
-
Laughing Man with pyOpenCV and PIL
Contact
You should be able to contact me via email at jouni<dot>paulus<at>iki<dot>fi.
-- paulus - 1.6.2016