The MusicCaps dataset is a collection of 5,521 music examples, each labeled with an English aspect list and a free-text caption written by musicians. The aspect list is a set of short descriptive terms for the music, such as "pop," "rock," "jazz," "classical," and "electronic." The free-text caption is a…