As I’ve worked on rebuilding my card / number system, I’ve actively tried to prioritize objects over people. For me, they’re easier to recall, animate, and tell apart. This is definitely a personal preference, but the point below applies however you structure your system, and it also applies when you’re weighing multiple options for your “people” images.
What makes a good object (or person), in terms of efficiency and effectiveness within a system?
After building multiple 2- and 3-digit systems using various strategies, I feel like I have some insight into this question.
When I decided to change up my system, I re-evaluated all of my words and looked for opportunities to shorten them to single-syllable ones wherever possible. This stems from the idea that when going fast, fewer syllables means less sub-vocalization, which means you can go faster. This idea has been championed recently by @LociInTheSky with his new Trochee System.
The idea that less is more when it comes to sub-vocalization is true, up to a point.
Compare these two objects… both options for 170 and the equivalent card pair(s):


So we have TaXi and TuX.
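To make the "both options for 170" point concrete, here is a minimal sketch that reads the digits off the capitalized letters in each word. It assumes a Major-system-style consonant table in which X counts as "ks" and therefore contributes two digits (7 and 0); the table and the `decode` helper are my own illustration, so swap in whatever mapping your system actually uses.

```python
# Assumed digit values for the capitalized consonants (Major-system style).
# This table is an illustration, not a confirmed mapping for any one system.
LETTER_DIGITS = {
    "S": "0", "Z": "0",
    "T": "1", "D": "1",
    "N": "2",
    "M": "3",
    "R": "4",
    "L": "5",
    "J": "6",
    "K": "7", "G": "7",
    "F": "8", "V": "8",
    "P": "9", "B": "9",
    "X": "70",  # X sounds like "ks", so it contributes two digits
}


def decode(word: str) -> str:
    """Return the digits encoded by the capitalized letters of a word."""
    return "".join(LETTER_DIGITS.get(ch, "") for ch in word if ch.isupper())


for word in ("TaXi", "TuX"):
    print(word, decode(word))
```

Under that assumed table, both words decode to the same number, so the choice between them comes down to image quality and pronunciation speed rather than encoding.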
Both are pretty distinct and unique images if taken in a vacuum. For the words themselves, “TuX” has the edge when it comes to syllables, but “TaXi” has the edge when it comes to vibrance, action, and utility, and I think that is the more important attribute.
I can think of many ways to incorporate a TaXi into a mnemonic scene. It can crash into a locus, run over another representational element, pick up or drop off something, another element can be driving it or popping out of the trunk, it can burn rubber, it can swerve and jump…
A TuX… can be worn by something. Or maybe it could be brought to life as if a ghost were wearing it. That’s about all I can think of when it comes to making a TuX memorable or active.
So here we have a case where the TaXi is (for me) clearly the better choice, even though TuX is the “quicker” word. But get this… I can train myself to associate the image of the crazy TaXi with a shortened word… “TaX.” I’m not going to confuse it with an income tax bill or a tax collector… I’ll know for certain that speaking, reading, or sub-vocalizing “TaX” in the context of memorization will bring me the image of TaXi. So now, I have a great animatable image that I can also easily shorten to a one-syllable sub-vocalization if I need to. To be honest though, the amount of time it takes to sub-vocalize “TaXi” is nearly identical to the amount it takes for “TuX.” The “i” sound at the end of TaXi is almost just included in the breath after the hard X sound.
Even longer words or phrases like “TaXidermied animal” could be shortened to “TaX.”
So I’d encourage you, if you’re trying to optimize a system, to go for the best images that you can fit into a word or phrase and then look for ways to trim the sub-vocalization down to one or two syllables. You’ll have the best of both worlds: vibrant imagery and fast pronunciations.