They tend to look at things in subjective terms and often confuse musical terminology (Pace, Rhythm, Timing, Sound Stage, Dynamics, Spaciousness, etc.) for engineering possibilities and vernacular. This is a crude attempt by an amateur, myself, to bring these two worlds together. I won't go through all the buzzwords, just a couple to make my point.
Tonality can be looked at from many different metrics, but basically a polar response and impulse response will tell you what you need to know. Some people would argue for phase as well, but the science refutes that claim.
Look here for a good explanation of what tonality is: http://en.wikipedia.org/wiki/Tonality
and you'll realize this is a music term, not a playback term. That said, wild polar responses will not have good tonality unless they somehow match the inaccuracies of the recording process--fat chance. If they do on one recording, they won't on any other. The recording process has as much to do with this as the playback. That's part of the reason why getting a polar response on a speaker is more useful than going and listening to your few favorite tracks. Rise time and decay should also play a part and can be seen by looking at the impulse graph, CSD, wavelet, etc. With the impulse(s) and a polar plot, you'll have all that info.
What most people seem to describe as soundstage, as far as I know, mostly has to do with speaker placement and polar response. If you are shooting an even sound across your room and your speakers are placed with the left on the left and the right on the right, away from the walls and toed in, I can't see where you could go wrong. A narrow directivity should give you a better image, where a wider one should give you a better sense of space.
Spaciousness is another one of those touchy definitions. To me that's mostly reflection above the modal region (search for "Haas Effect" and the "psychoacoustics" post on this blog http://dtmblabber.blogspot.com/2010/12/psychoacoustics.html) and low level detail resolution if you're talking about what's actually contained in the recording. In other words, that's what matters if you want to hear the recording environment as picked up by the microphone and diluted or enhanced through the process of production, which means anything that interferes with that can have an impact. So from the loudspeaker standpoint, impulse, cabinet accelerometer CSD, now it even looks like capacitor vibration (so there may well be credence to more tweaks like, God forbid, cables! Nothing has turned up there yet that I know of), and polar response will all play into it. In Dr. Toole's book there are studies that show wider dispersion adds to a sense of spaciousness. It seems rather intuitive. Contralateral reflections also play a role, and subsequent elevations in IACC factor in. There are many things that can affect the low level resolution. I'd bet to some degree you can trace this all the way back to the source. This may be the most expensive, difficult and time consuming part to get to the "N"th degree. The room itself is also a large part of this.
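Since IACC came up, here is a rough sketch of how I understand it is usually computed: the peak of the normalized cross-correlation between the left-ear and right-ear impulse responses, searched over lags of about +/- 1 ms. The function name, the 1 ms default window and the toy signals are my own assumptions for illustration, not anything taken from Dr. Toole's book or a real measurement. Values near 1 mean the two ears are getting nearly the same signal.

```python
import math

def iacc(left, right, fs, max_lag_ms=1.0):
    """Peak of the normalized cross-correlation of two ear signals
    within +/- max_lag_ms (assumed window, commonly quoted as 1 ms)."""
    n = min(len(left), len(right))
    left, right = left[:n], right[:n]
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if norm == 0:
        return 0.0
    max_lag = int(round(fs * max_lag_ms / 1000.0))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / norm)
    return best

# Toy signals (hypothetical, not measured): a short pulse, a delayed copy
# of it, and a version with an extra "reflection" added at one ear only.
fs = 48000
pulse = [0.0, 1.0, 0.5, 0.0] + [0.0] * 60
delayed = [0.0] * 3 + pulse[:-3]
with_reflection = list(pulse)
with_reflection[30] += 0.8

print("same signal, delayed  :", iacc(pulse, delayed, fs))
print("one-sided reflection  :", iacc(pulse, with_reflection, fs))
```

The point of the toy case is only that a pure delay between the ears leaves the number at its maximum, while energy arriving at one ear and not the other pulls it down.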
Transient response is another one that comes down to the CSD, wavelet, impulse and polar measurements. It's just rise time and decay. A CSD or wavelet doesn't tell us a whole lot without the impulse response, at least not in any case I can think of.
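To make "rise time and decay" a bit more concrete, here is a minimal sketch of pulling both numbers out of an impulse response. Everything specific in it is my assumption: the test signal is a made-up damped sinusoid rather than a real speaker measurement, rise time is taken as the 10% to 90% climb of the leading edge, and decay is taken as the time for the remaining energy to drop by 20 dB. Other definitions are just as defensible.

```python
import math

def synthetic_impulse(fs=48000, f0=1000.0, tau=0.002, n=2048):
    """Hypothetical test signal: a damped sinusoid, not a real measurement."""
    return [math.exp(-(i / fs) / tau) * math.sin(2 * math.pi * f0 * i / fs)
            for i in range(n)]

def rise_time(h, fs, lo=0.1, hi=0.9):
    """Time for |h| to climb from lo*peak to hi*peak on the leading edge."""
    mag = [abs(x) for x in h]
    peak = max(mag)
    i_lo = next(i for i, m in enumerate(mag) if m >= lo * peak)
    i_hi = next(i for i, m in enumerate(mag) if m >= hi * peak)
    return (i_hi - i_lo) / fs

def decay_time(h, fs, drop_db=20.0):
    """Time for the energy still to come (backward-integrated) to fall by drop_db."""
    energy = [x * x for x in h]
    total = sum(energy)
    threshold = total * 10 ** (-drop_db / 10)
    remaining = total
    for i, e in enumerate(energy):
        if remaining <= threshold:
            return i / fs
        remaining -= e
    return len(h) / fs

fs = 48000
h = synthetic_impulse(fs=fs)
print("rise time  (ms):", 1000 * rise_time(h, fs))
print("decay time (ms):", 1000 * decay_time(h, fs))
```

Swap in a measured impulse response as a list of samples and the same two functions apply. A driver or cabinet that rings longer shows up directly as a larger decay time.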
Dynamics is another interesting topic that depends on whether you are talking physical or psychophysical. The best way to look at this graphically would go back to everything mentioned for transients, then also thermal capacity, power compression and efficiency.
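Power compression is the easiest of these to put a number on. As a rough sketch: every tenfold increase in input power should ideally give 10 dB more output, so the shortfall between that ideal gain and what you actually measure is the compression. The function names and the wattage and SPL figures below are hypothetical, picked only to show the arithmetic.

```python
import math

def expected_gain_db(p_low_w, p_high_w):
    """Ideal SPL increase when input power goes from p_low_w to p_high_w."""
    return 10 * math.log10(p_high_w / p_low_w)

def power_compression_db(p_low_w, spl_low_db, p_high_w, spl_high_db):
    """Shortfall between the ideal SPL increase and the measured one."""
    measured_gain = spl_high_db - spl_low_db
    return expected_gain_db(p_low_w, p_high_w) - measured_gain

# Hypothetical measurement: 90 dB at 1 W, 108.5 dB at 100 W.
compression = power_compression_db(1.0, 90.0, 100.0, 108.5)
print("power compression (dB):", round(compression, 2))
```

A speaker that kept up perfectly would read 0 dB here. The more the voice coil heats up and the motor runs out of linear travel, the bigger the number gets.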
I know, no publisher of specs is giving you this information, so knowing how to use it or think of it is of little use. In the end we are all left to guess. I wonder if informed guessing is better than uninformed. I bet anyone schooled in the issues at hand could do much better than I. Every time I read something new, I learn more, and I'm betting any recording engineer, acoustician, transducer engineer, etc. could do a much better job than I just did.