I use a range of tracks, usually starting with something like Paramore: Hallelujah so I can rule out muddiness, which is my most hated characteristic. Then a good recording of Schubert: Death and the Maiden, because there's so much texture in a string quartet and I like a system that can put me in the room with them. Piano and vocal stuff comes next, to catch out overly clinical systems that might otherwise have done well on the first two tracks. There are a ton of examples; Leona Lewis: Run is an easy choice, though it generally sounds good on a lot of systems, so perhaps something else would be better. Then some all-rounder stuff: Britten: War Requiem, in particular the Sanctus, which tests high ends, lows, solo and ensemble vocals, and goes right from gentle bell to full orchestra and chorus at full volume; Evanescence, which is kind of heavy metal + vocal; and, if I'm feeling bored, some dance stuff from David Guetta perhaps, though again I find that kind of thing sounds good on most systems, so it's not much of a differentiator (within the kind of setups I'm looking at/within budget!).
As the wiser ones have no doubt already said, different people have different priorities. For me, testing is mainly about ruling out characteristics that I don't like. Once I've done that, I can sit back and enjoy back-to-back comparisons of my type of music to find which I enjoy more.
Case in point (though this may result in me no longer being taken seriously on these forums): I've just bought an Arcam rCube. Perhaps I should have auditioned more stereo hi-fis + bookshelf speakers, but there wasn't a lot of availability, and the main competition from the local specialist's recommendations ended up being a Zeppelin Air. Perhaps judging me on my appearance rather than my track list, he was pretty adamant the Air would be the best one for my requirements. But it murdered the bass, and not in a good way, on the Paramore. All very cinematic maybe, but not to my liking. Some other tracks came out well, but the rCube was just so much nicer on the driving tracks, and had great texture for the string quartet. Ah-ha, I thought, it'll be too clinical for the vocals and piano... but it wasn't (that surprised me the most, actually; I did an actual double take). The big stuff and dance stuff revealed a weakness in low-end presence though (not surprisingly).
When I detailed my feedback to the assistant, his honest answer was that I wasn't going to find a system that matched my preferences until I possibly tripled my budget. So, as with everything in life, it's a compromise, and the test tracks helped me compromise in the way that maximised my enjoyment. Which is quite likely to be different from someone else's enjoyment: I know some people who are more likely to go 'wow' when they hear something like the Air pump out some loud bass than when they hear the texture of a string quartet. But that's their loss.