Squashed merge of develop for Release 4.1.
Squashed commit of the following:

commit fd962ee
Author: Greg Chapman <[email protected]>
Date:   Thu Jan 30 11:16:17 2025 -0800

    Update READMEs, setup.py, and LICENSE for Release 4.1.

commit 9f0cf12
Author: Greg Chapman <[email protected]>
Date:   Thu Jan 30 10:51:07 2025 -0800

    Update tests to know about new symbol counting changes.

commit 08e6185
Author: Greg Chapman <[email protected]>
Date:   Thu Jan 30 10:39:39 2025 -0800

    Tests: add SER output to command line tests.

commit 2188808
Author: Greg Chapman <[email protected]>
Date:   Thu Jan 30 10:14:47 2025 -0800

    AnnExtra symbol count has len(content) now and AnnExtra symbol error count includes Levenshtein distance of content.  AnnStaffGroups are sorted now, and instead of comparing all the part indices, we compare lowest and highest.

commit 7dc9ef0
Author: Greg Chapman <[email protected]>
Date:   Tue Jan 28 16:45:47 2025 -0800

    Make sure metadata item value ends up being a string.

commit 3f78626
Author: Greg Chapman <[email protected]>
Date:   Tue Jan 28 16:37:47 2025 -0800

    More symbol count (notation_size) and symbol error count (cost) changes. Trying to make them match each other better, and make more sense.

commit a58acf3
Author: Greg Chapman <[email protected]>
Date:   Tue Jan 28 12:33:35 2025 -0800

    Release Notes again.

commit a757ad7
Author: Greg Chapman <[email protected]>
Date:   Tue Jan 28 12:32:04 2025 -0800

    Stop assuming that the two different extras are both either a Spanner or not.  They could be one of each.

commit 3e08d22
Author: Greg Chapman <[email protected]>
Date:   Mon Jan 27 15:57:25 2025 -0800

    ReleaseNotes update.

commit 1904528
Author: Greg Chapman <[email protected]>
Date:   Sun Jan 26 12:10:21 2025 -0800

    Update ReleaseNotes 4.1

commit b060304
Author: Greg Chapman <[email protected]>
Date:   Sat Jan 25 14:01:25 2025 -0800

    musicdiff text output expected results have changed a little due to symbol counting (notation_size and cost) changes.

commit b618d74
Author: Greg Chapman <[email protected]>
Date:   Fri Jan 24 14:31:06 2025 -0800

    AnnLyric.notation_size: identifiers are only worth 1, not len(identifier).
    Print SER even if cost == 0.  Handle numSymbolsInGroundTruth being 0 without dividing by 0.

commit a6f76e9
Author: Greg Chapman <[email protected]>
Date:   Fri Jan 24 14:27:30 2025 -0800

    New release notes for v4.1.0

commit d05187d
Author: Greg Chapman <[email protected]>
Date:   Thu Jan 23 15:18:44 2025 -0800

    Another lyrics and extras adjustment (lower the costs).

commit 09e92e1
Author: Greg Chapman <[email protected]>
Date:   Thu Jan 23 15:03:42 2025 -0800

    For extras and lyrics, notation_size does not include offset/duration, and diff cost is incremented by only 1 for differences in each of those fields.

commit 80a630a
Author: Greg Chapman <[email protected]>
Date:   Thu Jan 23 12:33:15 2025 -0800

    Better notation_size and comparison cost for extras and lyrics.

commit 45b54e6
Author: Greg Chapman <[email protected]>
Date:   Tue Jan 21 14:14:03 2025 -0800

    Ignore SenzaMisuraTimeSignature (since it is displayed as no timesig at all).

commit 1d08dd9
Author: Greg Chapman <[email protected]>
Date:   Tue Jan 21 11:37:49 2025 -0800

    Refactor SER output into Visualization, and return a dict[str, str].  To print it as text, we convert to JSON and print that.

commit a176105
Author: Greg Chapman <[email protected]>
Date:   Thu Jan 16 09:03:42 2025 -0800

    Compute SER = symbolic errors/num symbols in ground truth (i.e. file2).

commit 067a96d
Author: Greg Chapman <[email protected]>
Date:   Thu Jan 16 08:56:17 2025 -0800

    Add to cost any syntax errors fixed by converter21 parse code. Some lint, too.

commit bfdcf32
Author: Greg Chapman <[email protected]>
Date:   Mon Dec 2 17:19:53 2024 -0800

    New output format "ser" that prints num errors/max num syms of the two scores.

commit 16d2603
Author: Greg Chapman <[email protected]>
Date:   Mon Dec 2 12:23:37 2024 -0800

    Always return cost in symbol errors from diff() and from musicdiff command.

commit 31b31e7
Author: Greg Chapman <[email protected]>
Date:   Sun Dec 1 21:46:53 2024 -0800

    First cut at fixing Humdrum syntax errors.

commit e54c259
Author: Greg Chapman <[email protected]>
Date:   Sun Dec 1 19:49:01 2024 -0800

    Back out that AnnStaffGroup cost change; I don't like the results, and I wasn't convinced to begin with.

commit eafb018
Author: Greg Chapman <[email protected]>
Date:   Sun Dec 1 19:40:38 2024 -0800

    More notation_size tweaks: AnnMeasure should include lyric sizes, and AnnStaffGroup should add 1 for each enclosed part/staff.

commit 8f903d9
Author: Greg Chapman <[email protected]>
Date:   Sun Dec 1 19:29:09 2024 -0800

    Fix comment typo.

commit e6922a7
Author: Greg Chapman <[email protected]>
Date:   Sun Dec 1 19:22:43 2024 -0800

    Don't precompute notation_size, cache it if it is ever computed.  Many objects never are asked their notation size, especially if the scores are very similar, so don't pay the price unless you have to (but only pay it once).

commit 352bc95
Author: Greg Chapman <[email protected]>
Date:   Wed Nov 27 17:49:58 2024 -0800

    First cut at comparing different number of parts.
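The lazy caching described in commit e6922a7 (compute notation_size only when asked, and only once) can be sketched as below. This is a minimal illustration, not musicdiff's code: the class `AnnotatedItem`, its fields, and the use of `functools.cached_property` are assumptions made for the example, and the real `notation_size()` is a method rather than a property.

```python
from __future__ import annotations

from functools import cached_property


class AnnotatedItem:
    """Hypothetical stand-in for an annotated score object (not a musicdiff class)."""

    def __init__(self, children: list[AnnotatedItem] | None = None, own_symbols: int = 1):
        self.children = children or []
        self.own_symbols = own_symbols
        self.compute_calls = 0  # instrumentation only, to show the cache works

    @cached_property
    def notation_size(self) -> int:
        # Runs at most once per instance; later accesses reuse the stored value.
        self.compute_calls += 1
        return self.own_symbols + sum(c.notation_size for c in self.children)


leaf_a = AnnotatedItem(own_symbols=2)
leaf_b = AnnotatedItem(own_symbols=3)
measure = AnnotatedItem(children=[leaf_a, leaf_b], own_symbols=0)
```

Nothing is computed when the objects are built. The first access to `measure.notation_size` walks the children, and repeated accesses are free.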
gregchapman-dev committed Jan 30, 2025
1 parent 0ba675f commit f7eefb8
Showing 27 changed files with 567 additions and 195 deletions.
1 change: 1 addition & 0 deletions .pylintrc
@@ -325,6 +325,7 @@ exclude-protected=_asdict,_fields,_replace,_source,_make

# Maximum number of arguments for function / method
max-args=5
max-positional-arguments=10

# maximum boolean expressions in a line (too-many-boolean-expressions)
max-bool-expr=10
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@

The MIT License (MIT)
Copyright (c) 2022-2024 Francesco Foscarin, Greg Chapman
Copyright (c) 2022-2025 Francesco Foscarin, Greg Chapman

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

9 changes: 5 additions & 4 deletions README.md
@@ -7,7 +7,7 @@ musicdiff is derived from: [music-score-diff](https://github.com/fosfrancesco/mu
by [Francesco Foscarin](https://github.com/fosfrancesco).

## Setup
Depends on [music21](https://pypi.org/project/music21) (version 9.1+), [numpy](https://pypi.org/project/numpy), and [converter21](https://pypi.org/project/converter21) (version 3.2+). You also will need to configure music21 (instructions [here](https://web.mit.edu/music21/doc/usersGuide/usersGuide_01_installing.html)) to display a musical score (e.g. with MuseScore). Requires Python 3.10+.
Depends on [music21](https://pypi.org/project/music21) (version 9.1+), [numpy](https://pypi.org/project/numpy), and [converter21](https://pypi.org/project/converter21) (version 3.3+). You also will need to configure music21 (instructions [here](https://web.mit.edu/music21/doc/usersGuide/usersGuide_01_installing.html)) to display a musical score (e.g. with MuseScore). Requires Python 3.10+.

## Usage
On the command line:
@@ -26,9 +26,10 @@ On the command line:
default this is ignored).
-x/--exclude one or more named details to exclude from comparison. Can be any of the
named details accepted by -i/--include.
-o/--output one or both of two output formats: text (or t) or visual (or v); the default
is visual). visual (or v) requests production of marked-up score PDFs; text
(or t) requests production of diff-like text output.
-o/--output one or more of three output formats: text (or t), visual (or v), or ser (or s)
(the default is visual). visual (or v) requests production of marked-up score
PDFs; text (or t) requests production of diff-like text output; ser (or s)
requests a JSON text output containing Symbolic Error Ratio information.

file1 first music score file to compare (any format music21 or converter21 can parse)
file2 second music score file to compare (any format music21 or converter21 can parse)
34 changes: 34 additions & 0 deletions ReleaseNotes_4.1.0.txt
@@ -0,0 +1,34 @@
Changes since 4.0.0:
Add new output option that prints JSON containing the symbolic error rate (SER =
numSymbolErrors / numSymbolsInGroundTruth) to stdout (the JSON actually
contains all three numbers). Ground truth is assumed to be the second file.
If numSymbolsInGroundTruth == 0, SER will be numSymbolErrors, to avoid divide
by zero.
Add new API Visualization.get_ser_output() that returns a dict containing the
symbolic error rate.
In support of SER, notation_sizes (a.k.a. symbol counts) and diff costs (a.k.a.
symbolic error counts) have been reviewed and updated:
AnnNote.notation_size(): add 1 symbol for slash on grace note
AnnExtra.notation_size(): 1 symbol for the text, add 1 symbol if there is any
style specified
AnnExtra diff error count: text diff is 1 symbol error, offset diff is 1 symbol
error, duration diff is 1 symbol error, style diff is 1 symbol error
AnnLyric.notation_size(): use len(text) as symbol count instead of 1;
add 1 symbol if there's a verse number;
add 1 symbol if there's a verse identifier different from the number;
add 1 symbol if styled
AnnLyric diff cost: text diff symbol error count is the Levenshtein distance,
verse number diff is 1 symbol error, verse identifier diff is 1 symbol
error, offset diff is 1 symbol error, style diff is 1 symbol error
AnnMeasure.notation_size(): not just notes' symbols and extras' symbols, add in
the lyrics' symbols
AnnScore.notation_size(): not just parts' symbols, add in staff_groups' symbols
and metadata_items' symbols
Add support for comparing scores that have different number of parts (this previously
caused a failure). The existing parts are assumed to line up by index (as before,
score1 part 0 is compared with score2 part 0), and then we generate edits that
either delete the extra parts in score1, or add the extra parts in score2. The
number of symbol errors for those edits is simply the notation_size of (the
number of symbols in) the added or deleted parts.
Several smallish bugfixes.
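The AnnLyric rules in these release notes can be sketched as follows. This is an illustration under stated assumptions, not the musicdiff implementation: `LyricSketch`, its fields, and `lyric_diff_cost` are hypothetical names, style is reduced to a single boolean, and a verse number of 0 or an empty identifier is taken to mean "absent".

```python
from dataclasses import dataclass


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


@dataclass
class LyricSketch:
    """Hypothetical simplified lyric; not musicdiff's AnnLyric."""
    text: str
    number: int = 0       # verse number; 0 means "none" in this sketch
    identifier: str = ""  # verse identifier; "" means "none" in this sketch
    offset: float = 0.0
    styled: bool = False

    def notation_size(self) -> int:
        size = len(self.text)  # one symbol per character of text
        if self.number:
            size += 1          # verse number
        if self.identifier and self.identifier != str(self.number):
            size += 1          # identifier that differs from the number
        if self.styled:
            size += 1          # any style at all
        return size


def lyric_diff_cost(a: LyricSketch, b: LyricSketch) -> int:
    cost = levenshtein(a.text, b.text)      # text: Levenshtein distance
    cost += int(a.number != b.number)       # verse number: 1 symbol error
    cost += int(a.identifier != b.identifier)
    cost += int(a.offset != b.offset)
    cost += int(a.styled != b.styled)
    return cost
```

For example, comparing a lyric "glo-" in verse 1 with "glo" in verse 2 costs one symbol error for the text and one for the verse number.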

52 changes: 40 additions & 12 deletions musicdiff/__init__.py
@@ -14,6 +14,7 @@

import sys
import os
import json
import typing as t
from pathlib import Path

@@ -52,6 +53,8 @@ def diff(
force_parse: bool = True,
visualize_diffs: bool = True,
print_text_output: bool = False,
print_ser_output: bool = False,
fix_first_file_syntax: bool = False,
detail: DetailLevel | int = DetailLevel.Default
) -> int | None:
'''
@@ -77,6 +80,16 @@ def diff(
visualize_diffs (bool): Whether or not to render diffs as marked up PDFs. If False,
the only result of the call will be the return value (the number of differences).
(default is True)
print_text_output (bool): Whether or not to print diffs in diff-like text to stdout.
(default is False)
print_ser_output (bool): Whether or not to print the symbolic error rate (SER),
which is computed as the number of symbol errors divided by the number of
symbols in the ground truth (assumed to be the second score).
(default is False)
fix_first_file_syntax (bool): Whether to attempt to fix syntax errors in the first
file (and add the number of such fixes to the returned number of edits/cost in
symbol errors).
(default is False)
detail (DetailLevel | int): What level of detail to use during the diff.
Can be DecoratedNotesAndRests, OtherObjects, AllObjects, Default (currently
AllObjects), or any combination (with | or &~) of those or NotesAndRests,
@@ -85,8 +98,9 @@ def diff(
Style, Metadata, or Voicing.
Returns:
int | None: The number of differences found (0 means the scores were identical,
None means the diff failed)
int | None: The total cost of the edits, i.e. the number of individual symbols
that must be added or deleted. (0 means that the scores were identical, and
None means that one or more of the input files failed to parse.)
'''
# Use the Humdrum/MEI importers from converter21 in place of the ones in music21...
# Comment out this line to go back to music21's built-in Humdrum/MEI importers.
@@ -130,7 +144,11 @@ def diff(
if not badArg1:
# pylint: disable=broad-except
try:
sc = m21.converter.parse(score1, forceSource=force_parse)
sc = m21.converter.parse(
score1,
forceSource=force_parse,
acceptSyntaxErrors=fix_first_file_syntax
)
if t.TYPE_CHECKING:
assert isinstance(sc, m21.stream.Score)
score1 = sc
@@ -176,11 +194,10 @@ def diff(
annotated_score2: AnnScore = AnnScore(score2, detail)

diff_list: list
_cost: int
diff_list, _cost = Comparison.annotated_scores_diff(annotated_score1, annotated_score2)
cost: int
diff_list, cost = Comparison.annotated_scores_diff(annotated_score1, annotated_score2)

numDiffs: int = len(diff_list)
if numDiffs != 0:
if cost != 0:
if visualize_diffs:
# you can change these three colors as you like...
# Visualization.INSERTED_COLOR = 'red'
@@ -194,10 +211,21 @@ def diff(
# 'score1 ' and 'score2 ', respectively, so you can see which is which.
Visualization.show_diffs(score1, score2, out_path1, out_path2)

if print_text_output:
text_output: str = Visualization.get_text_output(
score1, score2, diff_list, score1Name=score1Name, score2Name=score2Name
)
if print_ser_output:
ser_output: dict = Visualization.get_ser_output(
cost, annotated_score2
)
jsonStr: str = json.dumps(ser_output, indent=4)
print(jsonStr)

if print_text_output:
text_output: str = Visualization.get_text_output(
score1, score2, diff_list, score1Name=score1Name, score2Name=score2Name
)
if text_output:
if print_ser_output and print_text_output:
# put a blank line between them
print('')
print(text_output)

return numDiffs
return cost
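The SER path in the change above (compute a dict, then print it as indented JSON) can be sketched in isolation. This is a hedged illustration: `ser_report` is a hypothetical helper, and the key names are assumptions, since the commit only says the JSON contains the three numbers. The zero guard follows the release notes (SER falls back to the error count when the ground truth has no symbols).

```python
import json


def symbolic_error_rate(num_symbol_errors: int, num_symbols_in_ground_truth: int) -> float:
    """SER = symbol errors / symbols in ground truth (the second file)."""
    if num_symbols_in_ground_truth == 0:
        # Avoid dividing by zero: report the raw error count instead.
        return float(num_symbol_errors)
    return num_symbol_errors / num_symbols_in_ground_truth


def ser_report(num_symbol_errors: int, num_symbols_in_ground_truth: int) -> dict:
    # Key names are assumptions made for this sketch.
    return {
        "numSymbolErrors": num_symbol_errors,
        "numSymbolsInGroundTruth": num_symbols_in_ground_truth,
        "SER": symbolic_error_rate(num_symbol_errors, num_symbols_in_ground_truth),
    }


report = ser_report(5, 20)
json_str = json.dumps(report, indent=4)  # same formatting call as in diff()
print(json_str)
```

Because the denominator is the ground truth alone, SER can exceed 1.0 when the first score contains many extra symbols.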
31 changes: 24 additions & 7 deletions musicdiff/__main__.py
@@ -106,10 +106,23 @@
"--output",
default=["visual"],
nargs="*",
choices=["visual", "v", "text", "t"],
choices=["visual", "v", "text", "t", "ser", "s"],
help="'visual'/'v' is marked up scores, rendered to PDFs;"
+ " 'text'/'t' is diff-like, written to stdout."
+ " Either, both, or neither can be requested."
+ " 'text'/'t' is diff-like, written to stdout;"
+ " 'ser'/'s' is the symbolic error rate (symbol errors/symbols in ground truth),"
+ " written to stdout."
+ " Any, all, or none of these can be requested."
)

parser.add_argument(
"--fix_first_file_syntax",
action='store_true',
help="If set, syntax errors in the first input file will be fixed"
+ " (if possible) so the diff can continue. Any fixes will be"
+ " added to the returned cost in symbol errors. Note that errors"
+ " in the second file (assumed to be the ground truth) are never"
+ " corrected. Note also that this currently only works for Humdrum"
+ " **kern files."
)

args = parser.parse_args()
@@ -222,16 +235,20 @@

visualize_diffs: bool = "visual" in args.output or "v" in args.output
print_text_output: bool = "text" in args.output or "t" in args.output
print_ser_output: bool = "ser" in args.output or "s" in args.output
fix_first_file_syntax: bool = args.fix_first_file_syntax is True

numDiffs: int | None = diff(
cost: int | None = diff(
args.file1,
args.file2,
detail=detail,
visualize_diffs=visualize_diffs,
print_text_output=print_text_output
print_text_output=print_text_output,
print_ser_output=print_ser_output,
fix_first_file_syntax=fix_first_file_syntax,
)

if numDiffs is None:
if cost is None:
print('musicdiff failed.', file=sys.stderr)
elif numDiffs == 0:
elif cost == 0:
print(f'Scores in {args.file1} and {args.file2} are identical.', file=sys.stderr)
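The option handling above can be exercised on its own with a cut-down parser. This sketch reproduces only the `--output` and `--fix_first_file_syntax` arguments and the flag derivation shown in the diff; the program name, the file names, and the `output_flags` helper are hypothetical, and the real parser defines more options.

```python
import argparse

parser = argparse.ArgumentParser(prog="musicdiff-sketch")  # hypothetical name
parser.add_argument("file1")
parser.add_argument("file2")
parser.add_argument(
    "-o", "--output",
    default=["visual"],
    nargs="*",
    choices=["visual", "v", "text", "t", "ser", "s"],
)
parser.add_argument("--fix_first_file_syntax", action="store_true")


def output_flags(argv: list[str]) -> dict[str, bool]:
    """Map parsed arguments to the booleans that are passed on to diff()."""
    args = parser.parse_args(argv)
    return {
        "visualize_diffs": "visual" in args.output or "v" in args.output,
        "print_text_output": "text" in args.output or "t" in args.output,
        "print_ser_output": "ser" in args.output or "s" in args.output,
        "fix_first_file_syntax": args.fix_first_file_syntax is True,
    }


# File names come first here so that nargs="*" does not try to consume them.
flags = output_flags(["a.krn", "b.krn", "-o", "ser", "t"])
defaults = output_flags(["a.krn", "b.krn"])
```

Requesting `ser` and `t` together turns off the default visual output, since an explicit `-o` list replaces the default rather than adding to it.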