I think you're right, and other posts here point out that you need context for this problem. A good dataset of the type I was referring to would take a score, strip out the chord labels, and ask the model to infer them for the whole piece.
... while inferring from the overall feel of the piece when and where the piece was composed, and what compositional frameworks the composer may be using, so that it can use the correct dialect.
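To make the dataset idea concrete, here is a minimal sketch of how one training example could be built. Everything in it is an assumption for illustration: the `Measure` class, the `make_example` function, and the per-measure note lists are hypothetical, not an existing library or dataset format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Measure:
    """Hypothetical container: the notes in one measure plus its annotation."""
    notes: List[str]          # pitch names sounding in the measure
    chord: Optional[str]      # chord symbol from the annotated score, if any


def make_example(
    measures: List[Measure],
) -> Tuple[List[List[str]], List[Optional[str]]]:
    """Split an annotated score into an unlabeled input and held-out targets.

    The input keeps every measure of the piece, so whatever does the
    inference can use the whole piece as context rather than one bar.
    """
    inputs = [m.notes for m in measures]    # score with chord labels removed
    targets = [m.chord for m in measures]   # the labels to be inferred
    return inputs, targets


# A toy ii-V-I, annotated the way a jazz lead sheet might label it.
score = [
    Measure(["D", "F", "A", "C"], "Dm7"),
    Measure(["G", "B", "D", "F"], "G7"),
    Measure(["C", "E", "G", "B"], "Cmaj7"),
]
inputs, targets = make_example(score)
```

The targets are whatever symbols the original annotator chose, so the dialect question is baked into the labels themselves.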
Good chord symbols indicate not only function, but also the composer's intention.
Debussy probably needs different chord symbols from a 1930s Broadway tune, which needs different chord symbols from a 1960s jazz lead sheet, which needs different chord symbols from a 21st-century jazz composition inspired by Dave Liebman's "A Chromatic Approach to Jazz Harmony", which is where chord symbols go to die. Seriously. Liebman's chord notation style for quartal chords is influential, if not authoritative.