Ben, your 3-digit system was a huge inspiration for this one, so thank you! I hope to get this one ‘finalized’ before long, so if you have any recommendations for changes, please feel free to share.
I hadn’t noticed that either. That would be really nice for dates.
That one was below the belt
Per @Mnemoriam’s suggestion, the system has been updated to simplify the vowel-consonant syllables (500-999) so that those numbers can also be read left-to-right.
Thanks to Python, I think this will be easier than I thought. So, I guess I don’t have much of an excuse, do I?
I think I got the most important part of it already:
Given a sequence of phonemes (which, in turn, will come from a simple lookup based on the number-phoneme assignments), I can already extract all matching entries of the CMU Pronouncing Dictionary and sort them first by “distance to start of word”, then by length. I decided to calculate length from the actual list of ARPABET phonemes instead of the number of letters in the word, which I think makes the most sense.
So, for a number corresponding to, say, the list ['IH0', 'K', 'S', 'P', 'L', 'AO1'] (which I know no single 4-digit number will yield, but just as a simple example), the program returns:
(0, 'explore', ['IH0', 'K', 'S', 'P', 'L', 'AO1', 'R'], 7)
(0, 'explored', ['IH0', 'K', 'S', 'P', 'L', 'AO1', 'R', 'D'], 8)
(0, 'explorer', ['IH0', 'K', 'S', 'P', 'L', 'AO1', 'R', 'ER0'], 8)
(0, 'explores', ['IH0', 'K', 'S', 'P', 'L', 'AO1', 'R', 'Z'], 8)
(0, 'explorers', ['IH0', 'K', 'S', 'P', 'L', 'AO1', 'R', 'ER0', 'Z'], 9)
(0, 'exploring', ['IH0', 'K', 'S', 'P', 'L', 'AO1', 'R', 'IH0', 'NG'], 9)
(0, 'exploratory', ['IH0', 'K', 'S', 'P', 'L', 'AO1', 'R', 'AH0', 'T', 'AO2', 'R', 'IY0'], 12)
(0, 'exploravision', ['IH0', 'K', 'S', 'P', 'L', 'AO1', 'R', 'AH0', 'V', 'IH2', 'ZH', 'AH0', 'N'], 13)
(2, 'unexplored', ['AH2', 'N', 'IH0', 'K', 'S', 'P', 'L', 'AO1', 'R', 'D'], 10)
Note the first and last numbers in each row. These are, respectively, the index (location) where the desired sub-list begins within the word’s full list of phonemes, and the length of that full list. I left them in the output just for your examination (note that list indices start at 0 in Python).
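For anyone curious how such a search might look, here is a minimal sketch of the idea described above. It is not the actual script: SAMPLE_DICT is a tiny hand-typed stand-in for the CMU Pronouncing Dictionary, and the names find_sublist and search are made up for illustration.

```python
# Tiny stand-in for the CMU Pronouncing Dictionary (sample entries only).
SAMPLE_DICT = {
    "unexplored": ["AH2", "N", "IH0", "K", "S", "P", "L", "AO1", "R", "D"],
    "explored": ["IH0", "K", "S", "P", "L", "AO1", "R", "D"],
    "explore": ["IH0", "K", "S", "P", "L", "AO1", "R"],
    "cat": ["K", "AE1", "T"],
}


def find_sublist(target, phonemes):
    """Return the index where target first occurs inside phonemes, or -1."""
    n = len(target)
    for start in range(len(phonemes) - n + 1):
        if phonemes[start:start + n] == target:
            return start
    return -1


def search(target, dictionary):
    """Return (index, word, phonemes, length) rows for every matching word.

    Rows are sorted by distance to the start of the word, then by the
    number of phonemes, then alphabetically to keep the order stable.
    """
    rows = []
    for word, phonemes in dictionary.items():
        index = find_sublist(target, phonemes)
        if index >= 0:
            rows.append((index, word, phonemes, len(phonemes)))
    rows.sort(key=lambda row: (row[0], row[3], row[1]))
    return rows


results = search(["IH0", "K", "S", "P", "L", "AO1"], SAMPLE_DICT)
for row in results:
    print(row)
```

With the real dictionary loaded in place of SAMPLE_DICT, the same two-level sort should put whole-word matches like “explore” ahead of words where the sounds appear later, like “unexplored”.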
Based on my understanding of the problem, what we need now is to:
1 - Hard-code your system-key table;
2 - Create a lookup function to allow retrieval based on numbers instead of list of phonemes;
3 - Make any necessary consolidation of sounds such as the “cot-caught merger (AA and AO)”;
4 - Disregard the lexical stress markers (0, 1, 2);
5 - Maybe add more dictionaries.
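Steps 1 to 4 could be sketched roughly as follows. This is only an assumption about how the pieces might fit together: SYSTEM_KEY holds just a handful of assignments mentioned elsewhere in this thread (not the full table), and MERGERS, strip_stress, normalize and lookup are illustrative names.

```python
import re

# Step 1: a small illustrative fragment of a hard-coded system key
# (number -> ARPABET phonemes). NOT the full table.
SYSTEM_KEY = {
    "20": ["Y"],
    "21": ["DH"],
    "22": ["B", "L"],
    "23": ["B", "R"],
    "37": ["S", "K", "R"],
}

# Step 3: assumed merger table; the cot-caught merger folds AO into AA.
MERGERS = {"AO": "AA"}


def strip_stress(phoneme):
    """Step 4: drop a trailing lexical stress marker (0, 1 or 2)."""
    return re.sub(r"[012]$", "", phoneme)


def normalize(phonemes):
    """Strip stress markers, then apply the merger table."""
    bare = [strip_stress(p) for p in phonemes]
    return [MERGERS.get(p, p) for p in bare]


def lookup(number):
    """Step 2: retrieve the normalized phonemes assigned to a number."""
    return normalize(SYSTEM_KEY[number])


print(lookup("37"))                  # phonemes for 37
print(normalize(["K", "AO1", "T"]))  # "caught" after merging
print(normalize(["K", "AA1", "T"]))  # "cot" after merging
```

Dictionary entries would need to pass through the same normalize function before comparison, so that both sides of the match use the merged, stress-free phonemes.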
I think that’s a good start. I’ll call it a day and wait for your thoughts.
Mnemoriam, thanks for looking into this further. This is a great start! Your sample with index numbers and phoneme counts looks right on target.
To help with the first three items on the to-do list, I’ve created a new tab in the system spreadsheet with an equivalent version of the system using the CMUPD edition of ARPABET. You’ll see that most of the mergers occur in the list of italicized vowels.
The CMUPD doesn’t use the 2-letter ARPABET version of the schwa (AX for ə), and instead uses an unstressed version of the AH (uh) vowel: AH0. This is sensible because “ə” is pretty close to “uh”. This presents a special case for the script, however, because generally lexical stress markers should be ignored, as you stated in point #4. I suggest still ignoring the stress numbers for the AH vowel everywhere else in the system, and only singling out the AH0 vowel for the 2-digit numbers.
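One possible way to express that special case in code is a flag that protects AH0 while every other stress marker is dropped. This is just a sketch of the suggestion, with a made-up function name and parameter, not something taken from the script.

```python
def normalize_phoneme(phoneme, keep_schwa=False):
    """Drop the stress marker from a CMUPD phoneme.

    When keep_schwa is True, the unstressed AH0 (standing in for the
    schwa) is left untouched so it can be matched as its own sound.
    """
    if keep_schwa and phoneme == "AH0":
        return phoneme
    return phoneme.rstrip("012")


# Everywhere else in the system: stress is ignored, AH0 becomes plain AH.
print(normalize_phoneme("AH0"))
# For the 2-digit numbers: AH0 is singled out and kept as-is.
print(normalize_phoneme("AH0", keep_schwa=True))
# Stressed AH still loses its marker even when the flag is on.
print(normalize_phoneme("AH1", keep_schwa=True))
```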
Regarding point 5, it would be great to add more dictionaries, and I would be particularly interested in adding a Spanish dictionary. But if we can get a script for English CMUPD up and running first, that would be most helpful!
Let me know if there’s more I can do to assist you.
The system has been updated with a significant reordering of consonant blends. It was easier than I expected to get approximate numbers for frequency of consonant blends in the CMU dictionary by simply using the web browser’s search function. Based on those numbers the consonant blends have been reassigned to give the more frequent blends higher priority. They are still ordered alphabetically within each section of the system key.
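As an alternative to searching in the browser, the frequency of initial blends could also be counted with a few lines of Python. The sketch below is an assumption about how one might do it: SAMPLE_DICT is a tiny hand-typed stand-in for the CMU dictionary, and the helper names are invented for illustration.

```python
from collections import Counter

# The 15 ARPABET vowels used by the CMU dictionary (stress digits removed).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}

# Tiny stand-in for the CMU Pronouncing Dictionary (sample entries only).
SAMPLE_DICT = {
    "blast": ["B", "L", "AE1", "S", "T"],
    "blue": ["B", "L", "UW1"],
    "bring": ["B", "R", "IH1", "NG"],
    "scream": ["S", "K", "R", "IY1", "M"],
    "cat": ["K", "AE1", "T"],
}


def initial_cluster(phonemes):
    """Return the consonants that come before the first vowel."""
    cluster = []
    for phoneme in phonemes:
        if phoneme.rstrip("012") in VOWELS:
            break
        cluster.append(phoneme)
    return cluster


def count_initial_blends(dictionary):
    """Count initial clusters of two or more consonants across all words."""
    counts = Counter()
    for phonemes in dictionary.values():
        cluster = initial_cluster(phonemes)
        if len(cluster) >= 2:
            counts[" ".join(cluster)] += 1
    return counts


counts = count_initial_blends(SAMPLE_DICT)
for blend, total in counts.most_common():
    print(blend, total)
```

A count like this only sees word-initial clusters, which may be closer to what matters for the initial blends than a raw text search for the phoneme sequence anywhere in a word.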
EDIT (again): Additional edits have been made to the above order, since basic searches for phoneme frequency don’t always result in the best phonemes for initial blends. As in former versions of this system, many of the -s and -z endings have been deprioritized to the “Other Finals” section because they are often associated with pluralized versions of words. In addition, care has been taken, as much as possible, to keep groups of consonant blends that share a first letter from spanning across sections of 10. This helps with intuitive pronunciation of the numbers and memorization of the system key. Additionally, all of the numbers in the 900s range now employ the special r endings, instead of a mixture of regular and r endings.
The first two assignments of the “Initial Blends” section are NOT part of the alphabetical order intentionally because “y” and “dh” (as in “this”) are not really consonant blends like the rest of the section. They fit more with the consonants of 00-19, so it’s logical for them to immediately follow that section as 20 and 21.
The ‘ng’ and ‘zh’ sounds are assigned to the ‘Other Finals’ section regardless of their frequency, because they are already part of the secondary consonant endings in the 4-digit section, and need only to be in the 2-digit section for the sake of being available as their own distinct sound.
To simplify the system, I’ve abandoned the idea of incorporating the schwa (ə) sound. The CMU dictionary considers the schwa to be an unstressed version of the ‘uh’ sound (AH0), which should be sufficient. This can be considered a merger like the cot-caught merger for the ‘ah’ sound. Three initial consonant blends of low frequency (shr, skw, spl) have also been removed from the 50-69 range so that this section can be occupied completely by final blends. In reality the 50-69 range is of little consequence to the rest of the system, and could be assigned to a myriad of sounds. In the absence of a better idea, the other final consonant blends can occupy this range.
2-digit numbers can be pronounced as the user deems appropriate. It may be easier to pronounce 00-49 with a slight vowel sound afterward because they are initial sounds, and conversely to pronounce 50-99 with a slight vowel sound beforehand because they are final sounds, but in reality they are just consonant sounds or phonemes without any vowels attached. It is also of note that the ‘st’ sound appears twice in the system. This is intentional since ‘st’ is both a very frequent initial and final consonant sound. Pronouncing them something like “əst” and “stə” can help differentiate them, and of course their images would be different.
The system is now complete! Mnemoriam has been of inestimable help by providing feedback along the way, and has now created a script for searching for any number in the CMU dictionary! This word-generating program is highly useful for finding word matches for any number, and particularly the 4-digit numbers. Anyone wishing to use the program can follow the instructions in the first post.
For my personal use I am now working to fully memorize the system. I’m using my existing system of images for 1 and 2-digit numbers based on number associations to memorize the associated phonemes for each number. For example 22 is “bl”, so I think of Louis Armstrong blasting his trumpet. 37 is “skr”, so I think of Amelia Earhart screaming over the loud sound of her plane’s engine.
I will be working on adding and reviewing images in an Anki deck every day. With 11,110 images to learn, it’s necessary to become quick at choosing images. This is where Mnemoriam’s word-generating script will be very helpful, as well as powerful SRS-type memorization tools like Anki. This Anki add-on is quite useful for pulling images from Google Image search without needing to take extra time to resize the image.
Hey Slate, this looks like a terrific idea! I’ve just skimmed through your post, but it’s honestly got me interested in memorizing your system. It seems dead useful, and I particularly like that words can be approximated.
I’m going to give it a go and try to learn a 3-digit system. Also, Mnemoriam, great work on the word generator! It’s a great program (I’ve looked at the source code, and the fact that you call it easy frankly makes me feel inferior). If you don’t mind, could I attempt to add another dictionary if I find one? I doubt I’m going to find another pronunciation dictionary that conveniently happens to be machine-readable, but I’d like your permission on the off-chance that I do.
You and Slate almost make me feel like a real programmer. Really, the program is ugly and amateur as hell, but it allowed us to “solve” the problem in the limited time I had available, so I am glad to have been useful.
I don’t know if you are talking to me or Slate here, but I am pretty sure I can speak for him too on this matter. As far as the code is concerned, please, it is YOURS (and everybody else’s), so use it, modify it, share it, do whatever you want with no attribution of any sort needed (only don’t sell it, of course).
With respect to adding another dictionary, I am sure Slate will love the idea that someone would be willing to do that. I myself think the system deserves it, for sure. Feel free to contact me if you need any help, but, as I believe I’ve hinted above, I’ll be too busy in the near future to actually get my hands dirty. But it is my pleasure to help in any way I can. I am quite positive that Python’s Natural Language Toolkit (NLTK) has a number of other dictionaries available, including different languages, so I’d start there if I were you; that’s where I got the CMUDict, anyway.
Best of luck and welcome to the site!
Thanks nkp, glad you like it. I would certainly welcome any contributions to making the system more accessible, and I’m glad Mnemoriam has already welcomed you to modify the code that he has written.
In devoting my attention now to learning and using this system with Anki, I’ve realized there’s a need for good flashcards. If you’re serious about learning the system, I would think you’d find a good set of Anki flashcards quite useful as well.
It’s simple enough to make a CSV file with the 11,110 numbers in one column and import that into Anki so that there is an Anki note (card) for every number. Adding an image to every one of those numbers within Anki is then a straightforward process of entering each number in the Word Generator to see matching words, and then performing a Google Image search of those words to choose an image. Images can be copy/pasted with resizing very quickly using the add-on I mentioned in post 15.
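For reference, here is a minimal sketch of how such a CSV could be generated with Python. The total comes from 10 one-digit, 100 two-digit, 1,000 three-digit and 10,000 four-digit numbers (leading zeros included), which adds up to 11,110. The function names and the file name in the usage note are only placeholders.

```python
import csv
import io


def all_numbers():
    """Yield every 1- to 4-digit number as a string, keeping leading zeros."""
    for width in range(1, 5):
        for value in range(10 ** width):
            yield str(value).zfill(width)


def write_numbers(stream):
    """Write one number per row to an open text stream and return the row count."""
    writer = csv.writer(stream)
    count = 0
    for number in all_numbers():
        writer.writerow([number])
        count += 1
    return count


# Written to an in-memory buffer here so the sketch leaves no file behind.
buffer = io.StringIO()
total = write_numbers(buffer)
print(total)
```

To produce an actual file, the same function can be pointed at something like open("slate_numbers.csv", "w", newline=""). The numbers are written as text so that entries such as 007 and 7 stay distinct.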
Adding TTS (text-to-speech) audio to the Anki flashcards is something I feel would greatly enhance their effectiveness. Since every number is uniquely and precisely pronounceable, becoming fluent in the Slate System is really like learning the vocabulary of a new language. In my experience of using Anki for foreign-language vocabulary, hearing the spoken audio for words greatly enhances recall. I feel now that this is a feature that should by all means be included in an Anki deck for this system for those who are serious about learning it.
There is an Anki plugin called AwesomeTTS for converting text to speech within Anki. However, in order to have accurate TTS pronunciation of each single-syllable number, I believe it will be necessary to feed the actual phonemes to a TTS engine. This page regarding Microsoft’s speech engine may be useful for understanding the required syntax for doing the TTS conversion.
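As a rough idea of what feeding phonemes to a TTS engine could look like, the sketch below wraps a number in the standard SSML phoneme element using IPA symbols. Everything here is an assumption for illustration: the ARPABET-to-IPA table covers only a few sounds, the function name is made up, and I haven’t verified how any particular engine handles isolated consonant blends.

```python
from xml.sax.saxutils import escape, quoteattr

# Partial ARPABET -> IPA table (only a few sounds, for illustration).
ARPABET_TO_IPA = {
    "B": "b",
    "K": "k",
    "L": "l",
    "R": "ɹ",
    "S": "s",
    "EY": "eɪ",
    "AY": "aɪ",
}


def to_ssml(number, phonemes):
    """Wrap a number in an SSML phoneme element spelling out its sounds in IPA."""
    ipa = "".join(ARPABET_TO_IPA[p.rstrip("012")] for p in phonemes)
    return f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(number)}</phoneme>'


print(to_ssml("22", ["B", "L"]))
print(to_ssml("6", ["EY1"]))
```

The resulting snippet would still need to be placed inside a full SSML document (a speak element, plus whatever voice settings the chosen engine requires) before it can be synthesized.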
What do you think?
AwesomeTTS sounds useful. I’ve installed it in Anki and I’m already working on assigning images to each number. I’m actually stuck in a conundrum there. Selecting an image for 3-digit and 4-digit numbers is simplicity itself. For example, 1235 conveniently becomes ‘hell’. However, I’m having difficulty choosing images for the single digits. 7 becomes ‘eye’ easily enough, but it’s difficult to choose images for the other 9 without ‘encroaching’ on other number-image assignments. 4 could become ‘arm’, but it would interfere with 945, which literally spells out arm. The lack of single-vowel words makes it hard to choose appropriate (and memorable) ones.
I don’t know if I’m missing an obvious solution here, but any insight would be appreciated. (Just to clarify, I consider arbitrary assignment a last resort.)
Yes, my solution for the issue you describe is to use another association-based system of images for 1 and 2-digit numbers. Since the assignments of this second system aren’t based on phonetics but rather historical information, there isn’t a problem of a second phonetic system interfering with the Slate phonetic system. It does, of course, require memorization of 110 images without the aid of phonetics, as well as practice in referring to them by their Slate pronunciation. For example 23 is an image of Michael Jordan, but is referred to as “br”. So I make a little mnemonic for myself connecting the word “bro” with MJ like a nickname that might be used for a basketballer. Since I use an image of Roger Federer for the number 6, and 6 is pronounced “ay”, I think of Roger Federer being a tennis ace to remember this association.
Beyond the 1 and 2-digit numbers there are still going to be a lot of numbers that don’t have a matching word in English. A full third or so of the numbers in this system don’t have a match based on the preliminary analysis that Mnemoriam did when he wrote the word-generating code. For this reason I would only recommend this system to someone who does want to learn all the 4-digit numbers in addition to 1, 2 and 3-digit numbers.
Thank you for the solution. I think I’ll reuse my 110 Major images for the 1-digit and 2-digit numbers, barring any clashes with the assignments in the other numbers.
It’s been over a year since the last message. Has anyone tried this system?
I’m a non-native English speaker and a beginner mnemonist, so I would greatly appreciate a brief tutorial with examples of usage.
Rebumping this thread… Has anyone implemented this system and are they seeing significantly better results in practice? In theory, theory and practice are the same. In practice…
I love the thought put into this system but I wonder if anyone has built out the images, memorized it for use and shown that it outperforms the current systems.
I’ve found a very helpful tool online for searching for words containing certain sounds: https://lessonpix.com/SoundFinder
A great complement to the awesome Slate script.
Kudos on a well-turned phrase! I kind of want that on a t-shirt.
An update: I’ve memorized all numbers/sounds/words/images up to 99 and am working on the rest up to 9999, but it will take a few years, I suppose. It really doesn’t matter; I’m having a good time with this.
The main challenge is to find words that convert nicely to images I can attach to Anki flashcards, especially in the 4-digit range. This is the most time-consuming activity in the process of learning this system.
One more helpful tool I’ve found lately is http://www.yougowords.com/. Its search engine is amazingly powerful.
@Numerosata Congrats on your progress with learning/using the system, and thanks for sharing updates with us!