Dong Chinese has been under development for a while now, so I thought I would tell the story behind how and why I started building it.
Why I started learning Chinese
I was born in Jinan, China. Both of my parents are American, and at the time my father was a professor at Shandong University.
My whole family moved back to the United States when I was still a 6-month-old infant. Growing up I didn’t speak any Chinese. All the Chinese I learned from my parents was my name, how to count from 1 to 100, and a children’s song about two strange tigers with various missing body parts.
I had always felt that since I was a Chinese baby, I should know more about China and be able to speak Chinese. Before my last semester of university, when I was trying to figure out what my future career would look like, I decided that I would like to return to China once I graduated. My parents got in touch with close Chinese friends they had known 20 years earlier. Their son, who was once an 11-year-old boy learning English from my father, was now a grown man with a Ph.D. working at a research and development laboratory at a large tech company.
He invited me to come to China, live in his apartment, and start working at his company. I excitedly accepted his offer. To get prepared for moving to China, I enrolled in a beginner Chinese course for my last semester of university.
How I learned Chinese
At university I studied fairly diligently. Since I was planning to move to China in a few months, I had more pressing motivation than most of the other students, and I also had more time since I only had two other classes I was required to take that semester. I did all of the assignments that the professor suggested, and I even had extra private lessons with tutors a few times a week.
Even so, there’s only so much one can learn in a semester of university classes. When I arrived in China that summer I still felt like a stupid American, unable to converse with anyone in Chinese beyond a simple introduction. Most of my communication was with people who could speak English.
Every day I rode a bus to work for 40 minutes each way. Since there was nothing else to do while sitting on the bus, I studied vocabulary flashcards on Pleco and Skritter. During lunch break I often chatted with my coworkers, mostly in English, but with a few tidbits of Chinese here and there. Over the months I gradually improved, but after 6 months of living in China I still didn’t feel particularly confident in conversation.
At the beginning of 2016, after six months in China and still not satisfied with my level, I decided that I would try not to speak any English for one month, except at work when necessary. This didn’t immediately change my level, but it did quickly change my habits and level of confidence. Doing this challenge for a month forced me to respond in Chinese even if someone spoke to me in English. When strangers spoke to me in English, I told them in Chinese that I was Russian and didn’t understand English.
Two things changed for me after this one month challenge. Firstly, instead of feeling nervous or embarrassed about speaking Chinese, I started to see it as just a usual part of my life. Secondly, it forced me into the habit of using Chinese whenever possible, instead of using English whenever possible. I don’t have any objective measure of my progress before and after this point, but subjectively it felt that my rate of progress increased significantly.
The biggest challenge in learning Chinese for most people, including myself, is learning to read characters. I could pick up spoken Chinese while living in China just from talking to people, but learning to read required a more focused approach. I studied characters using a spaced-repetition flashcard system built on the HSK vocabulary list, as many people do. Over the course of about 6-8 months I learned to recognize about 1500 characters this way. It definitely helped me, but over time it became an unpleasant chore of trying to remember a motley group of low-frequency words out of context. I eventually gave up flashcards after I came back from a two-week trip and saw that over 1000 words had accumulated in my review queue.
Writing a book
I was always looking around for resources for learning to read Chinese characters. At some point I came across the book Remembering the Hanzi by James Heisig, which presents mnemonics to remember a keyword associated with each character.
This book became fairly popular and many people have found it to be helpful, but my first thought after reading through it was “I bet I can write an even better book.” And so I started writing one. The goals I had for writing this book were all reactions to the shortcomings I felt were in Remembering the Hanzi.
Show characters in context
The flashcard system I used and the Heisig method both present characters devoid of context. This is not conducive to having a functional vocabulary, since memorizing lots of isolated characters does not guarantee that you will be able to understand or produce real sentences with them.
On top of that, there are many compound words in Chinese whose meanings are difficult to deduce from their individual characters. For example, if you know the characters 水 (water) and 母 (mother), it doesn’t mean you will understand the meaning of 水母 (jellyfish).
In my book, after I introduced each character, I showed the most common compound words containing this character, and at least 10 example sentences containing the character, so that you could understand what it actually means in context.
Give real explanations
The mnemonics in Remembering the Hanzi seemed unnecessarily complicated and far-fetched, especially for characters that had a simple real explanation. For example, this is how Heisig explains the character 汁 (zhī; juice), which is composed of 氵 (shuǐ; water) and 十 (shí; ten):
This is not just any ordinary juice, but a brew of water and needles. Its distinctively sharp taste cuts your thirst by distracting it with excruciating pain as the juice passes down your throat.
The real explanation of 汁 is that it, like most Chinese characters, is a phonosemantic compound: juice (汁) is made of water (氵), and zhī (汁) sounds similar to shí (十).
There may be some merit to wacky explanations like Heisig’s, since humorous associations can be helpful for memorization, and I have even invented a number of similar mnemonics to aid my own learning. However, mnemonics are often somewhat personal and not effective for a wide audience; associations that make sense in my own mind can be rather uninspiring for other people who don’t think exactly the way I do. Similarly, for me personally, the story above about a sharp brew of water and needle juice is unsatisfying and more difficult than memorizing the combination of 氵 and 十 by rote.
Instead of creating my own system of mnemonics, I decided that I would write a book that would explain the real origins of Chinese characters. With that goal in mind, I started researching the origins of the 1000 most common Chinese characters.
Order by frequency
Remembering the Hanzi lists characters in an order based on their graphical structure. For example, the three characters 口 日 月 are introduced, followed by characters that are composed of different combinations of these three characters, such as 朋, 明, 唱, 晶, 品, and 昌. This seems logical, and indeed such groupings may be helpful for memorization.
However, a disadvantage of this approach is that the frequency of characters is not correlated with their structure. For example, 昌 is a structurally simple but relatively uncommon character: 昌 is the 25th character in Heisig’s order, but is 1606th according to frequency in Chinese books, and even less frequent in spoken language, occurring in fewer than 2% of Chinese movies. On the flip side, the character 那, which is the 1476th character in Heisig’s order (perhaps because it contains the uncommon 冄 component on the left side), is the 38th most common word in Chinese books and occurs in over 99.9% of Chinese movies.
I assume that most learners prefer what they learn to be immediately useful and relevant, so I ordered the characters according to their frequency. The most common characters were towards the beginning, and the least common characters were towards the end. However, I made some modifications to strict frequency order:
- Some characters were placed earlier than they would appear by frequency criteria to allow more sentences to be produced. To understand the reasoning behind this, consider the seven most common words in English: ‘the’, ‘of’, ‘and’, ‘a’, ‘to’, ‘in’, and ‘is’. After learning these seven words, a learner would not be able to understand or produce any complete English sentences. We could replace some entries with less common words that carry more meaning to form a more useful list: ‘a’, ‘is’, ‘he’, ‘has’, ‘good’, ‘person’, and ‘thing’. This is still an extremely limited set of vocabulary, but at least there are quite a few complete sentences that can be constructed with it, such as “He is a good person.”, “A good person has a good thing.”, “Is a person a thing?”, and so forth.
- The number of different media a character appeared in (i.e. its contextual diversity) had more influence on the order in the book than frequency. For example, a character that appears in a phrase used one time in 100 different movies would be ranked higher than a character that is used 100 times in a single movie.
- The characters for the numbers from 1 to 10 were listed all together in their natural order, even though the character 一 (one) is more frequent than 九 (nine) by several orders of magnitude.
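The second and third adjustments above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the procedure actually used for the book: the function name `order_characters`, the tie-breaking rule, and the choice to place the numeral block where the first numeral would have ranked are all hypothetical. The first adjustment (promoting characters that unlock more sentences) is not modeled here.

```python
from collections import Counter

# The numerals 1-10, kept together in their natural order.
NUMERALS = "一二三四五六七八九十"


def order_characters(documents):
    """Rank characters by contextual diversity, then by raw frequency.

    `documents` is a list of strings, one per movie, book, etc.
    (hypothetical input format).
    """
    total = Counter()      # occurrences across all documents
    diversity = Counter()  # number of documents containing the character
    for text in documents:
        # Keep only characters in the basic CJK Unified Ideographs block.
        chars = [c for c in text if "\u4e00" <= c <= "\u9fff"]
        total.update(chars)
        diversity.update(set(chars))

    # Diversity dominates; total frequency only breaks ties.
    ranked = sorted(total, key=lambda c: (-diversity[c], -total[c], c))

    ordered = []
    numerals_placed = False
    for c in ranked:
        if c in NUMERALS:
            # Assumption: insert the whole numeral block at the position
            # of the first numeral encountered, even if some numerals
            # never occur in the corpus.
            if not numerals_placed:
                ordered.extend(NUMERALS)
                numerals_placed = True
            continue
        ordered.append(c)
    return ordered


docs = ["好好好好好", "人", "人", "九"]
print(order_characters(docs))
```

In this toy corpus 人 appears only twice but in two different documents, so it is ranked ahead of 好, which appears five times in a single document.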
So where’s this book?
I never finished it. This book is now in my big pile of unfinished projects. It turns out that writing a book takes a lot of time and hard work.
I left it untouched for a while, always meaning to come back to finishing it, but then I thought, “I’m a software developer. I can make an app. Anyways, an app would probably be more useful and popular than a book written by some no-name guy that happened to live in China for a little while.”
Creating an app
The basic ideas I had for the app were the same as the basic ideas I had for the book: show characters in context, give real explanations for their origins, and introduce them in frequency order. The advantage of making an app was that it would be more interactive than a book. The app format also naturally led to some features that would not have worked well in a book format.
Since Dong Chinese knows the answers you give to exercises, it can keep track of what you know well and what you don’t know well. It builds a model of your knowledge: which characters are brand new to you, which ones you are somewhat familiar with, and which ones you have already mastered. Given the model of what vocabulary you already know, Dong Chinese can then find real material that you can understand, including songs, videos, and podcasts, and estimate how well you will understand it.
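To make the idea concrete, here is a minimal sketch of how such a knowledge model could be used to estimate comprehension. It is not how Dong Chinese is actually implemented: the `Level` weights, the `estimate_comprehension` and `recommend` names, and the 0.8 threshold are all assumptions made up for this example.

```python
from enum import Enum


class Level(Enum):
    """Three knowledge states, with made-up weights for scoring."""
    NEW = 0.0
    FAMILIAR = 0.5
    MASTERED = 1.0


def estimate_comprehension(text, knowledge):
    """Return a rough 0-1 score for how much of `text` a learner can read.

    `knowledge` maps a character to a Level; unseen characters count as NEW.
    """
    chars = [c for c in text if "\u4e00" <= c <= "\u9fff"]
    if not chars:
        return 0.0
    score = sum(knowledge.get(c, Level.NEW).value for c in chars)
    return score / len(chars)


def recommend(materials, knowledge, minimum=0.8):
    """Return (title, score) pairs at or above `minimum`, best first."""
    scored = [
        (title, estimate_comprehension(text, knowledge))
        for title, text in materials.items()
    ]
    suitable = [pair for pair in scored if pair[1] >= minimum]
    return sorted(suitable, key=lambda pair: -pair[1])


knowledge = {"我": Level.MASTERED, "好": Level.FAMILIAR}
materials = {"easy": "我我", "harder": "我很好"}
print(recommend(materials, knowledge))
```

A real system would likely work with words rather than single characters and update the model after every exercise, but the basic shape is the same: score each piece of material against what the learner already knows, then surface the pieces that fall in a comfortable range.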