Duplicate detection in facsimile scans of early printed music

Rhodes, Christophe; Crawford, Tim and d'Inverno, Mark. 2014. 'Duplicate detection in facsimile scans of early printed music'. In: European Conference on Data Analysis. Bremen, Germany 2 - 4 July 2014. [Conference or Workshop Item]

COM-Rhodes2014a.pdf - Submitted Version (24kB)
deduplication.pdf - Accepted Version (17MB)

Abstract or Description

There is a growing number of collections of readily available scanned musical documents, whether generated and managed by libraries, research projects or volunteer efforts. They are typically digital images; for computational musicology we also need the musical data in machine-readable form. Optical Music Recognition (OMR) can be used on printed music, but it is prone to error, depending on the condition of the document and the quality of intermediate stages in the digitization process, such as archival photographs. In performing OMR on the British Library’s Early Music Online collection (Pugin and Crawford, 2013) of 16th-century volumes, we must deal with the problem of images which are rescans of the same pages. These images are not precise digital duplicates of each other, and so must be detected by approximate means. As well as duplicate scans, there are other forms of similarity present in the collection, such as musical relatedness and reuse of movable type. We present our work on developing and combining image-based near-duplicate detection, based on SIFT features (Lowe, 1999), with OMR-based near-duplicate detection of musical content. We evaluate an order-statistic-based method for finding duplicate scans of pages, and additionally identify a number of distinct kinds of approximate similarity from our distance measures: substantial reuse of graphical material; musical quotation; and title page detection.
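The abstract does not spell out the order-statistic method, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes a precomputed symmetric matrix of pairwise page distances (for example, derived from SIFT feature matching) and flags a pair as a likely rescan when a page's nearest-neighbour distance is small relative to its median distance to all other pages. The function name `flag_duplicates` and the `ratio` threshold are hypothetical.

```python
from statistics import median


def flag_duplicates(dist, ratio=0.5):
    """Flag likely duplicate page pairs from a square distance matrix.

    Assumption (not from the paper): a page i has a duplicate when its
    smallest distance to another page is below `ratio` times the median
    of its distances to all other pages.
    """
    n = len(dist)
    pairs = set()
    for i in range(n):
        # Order statistics of row i: distances to every other page, ascending.
        ranked = sorted((dist[i][j], j) for j in range(n) if j != i)
        nearest, j = ranked[0]
        typical = median([d for d, _ in ranked])
        if nearest < ratio * typical:
            # Store each pair once, smaller index first.
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)
```

Comparing each page against its own distribution of distances, rather than against one global cutoff, is one way to stay robust when some pages (such as sparse title pages) sit unusually close to, or far from, everything else in the collection.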

Item Type:

Conference or Workshop Item (Paper)

Identification Number (DOI):



Keywords:

music; optical music recognition; clustering; similarity measures

Related URLs:

Departments, Centres and Research Units:

Computing > Intelligent Sound and Music Systems (ISMS)


July 2014: Accepted
4 August 2014: Published

Event Location:

Bremen, Germany

Date range:

2 - 4 July 2014

Item ID:


Date Deposited:

04 Apr 2014 10:10

Last Modified:

29 Apr 2020 15:58


