That word was not found 😞. Please try another, or change the character set in the menu.
To get started, click any hanzi or word in the diagram. You can search for hanzi, Chinese words, or English words.
You're viewing Simplified characters. You can choose from Simplified Chinese, Traditional Chinese, Cantonese, or the HSK wordlist in the menu in the upper left. Or check out the Japanese version.
Just interested in how characters are composed? Check out the components tool.
This is free and open source software. Check out the code on GitHub.
Most example sentences are human-written, but many are AI-generated. File an issue on GitHub if you see anything weird.
Percentages are hanzi seen, not words. Click a bar in the chart for details.
Click a box in the calendar for details. Brighter colors mean more studying.
Click a box in the calendar for details. Brighter colors mean more cards added.
Click a bar in the chart for details.
Green: 75% correct or better. Blue: between 50% and 75%. Orange: between 25% and 50%. Red: less than 25% correct.
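The legend above can be sketched as a small lookup. This is only an illustration, not the site's actual code: the function name `accuracy_color` is hypothetical, and the handling of the exact 50% and 25% boundaries is an assumption, since the legend only says "between".

```python
def accuracy_color(correct: int, total: int) -> str:
    """Map a correct/total count to the legend color described above."""
    if total <= 0:
        raise ValueError("total must be positive")
    ratio = correct / total
    if ratio >= 0.75:
        return "green"
    if ratio >= 0.50:  # assumption: exactly 50% counts as blue
        return "blue"
    if ratio >= 0.25:  # assumption: exactly 25% counts as orange
        return "orange"
    return "red"       # less than 25% correct
```

For example, a card set answered correctly 2 times out of 3 would land in the blue band.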
The idea is to emphasize the word-forming connections among hanzi to help learners remember them. I've found this more fun and effective than other methods, like studying stroke order, learning radicals or components, writing each character out 100 times, or doing spaced repetition on cards mapping hanzi to pinyin and English.
The site is a progressive web app. This means it uses modern browser APIs to make an installable app. Follow the directions for your platform to install it. A truly native app downloadable from the app stores may be a future work item.
Definitions and pinyin transcriptions of individual words were pulled from CEDICT, which releases data under CC BY-SA 4.0. Because of that, some of the files in data should be considered released under that same license.
That depends on which character set you choose. The simplified and traditional choices should include everything present in CEDICT. Cantonese should also include everything in the CC-Canto project. The HSK set should have all the old HSK 2.0 words and characters. File an issue on GitHub if you find anything missing. More examples and definitions will be added in the future.
When you add words to your study list, they will be presented to you as flashcards. You'll be shown the sentence and asked what it means; click "Show Answer" to see how Tatoeba translated it. When you click "I didn't know that", the card will be added back to the end of your to-study list. When you click "I knew that!", it will be shown one day later, then two days later if you get it right again, then four, and so on. It is meant to be a very, very basic spaced repetition system.
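The doubling schedule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the site's implementation: the `Card` class, its `streak` and `due` fields, and the `review` function are all hypothetical names, and resetting the streak after a wrong answer is an assumption the text does not confirm.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Card:
    sentence: str
    streak: int = 0              # consecutive "I knew that!" answers
    due: Optional[date] = None   # None: card sits in the to-study list


def review(card: Card, knew_it: bool, today: date) -> Card:
    """Reschedule a card after one review."""
    if knew_it:
        # 1 day after the first correct answer, then 2, 4, 8, ...
        card.due = today + timedelta(days=2 ** card.streak)
        card.streak += 1
    else:
        # Back to the to-study list; resetting the streak is an assumption.
        card.streak = 0
        card.due = None
    return card
```

Starting from a new card, three correct answers in a row would schedule it 1, 2, and then 4 days out from each review date.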
The export button downloads a file that can be imported into a different (better) spaced repetition system, like Anki.
All data for the site is stored in localStorage. It does not leave your browser, and clearing your browser data will clear it.
As you search, click, or tap hanzi or connections in the diagram, you are shown example sentences. Then, when you add words to your study list, the examples are converted to flashcards.
This section indicates how many times you've viewed examples for each of the characters in a given word, and how many cards contain those characters. The numbers reflect the state at the moment you viewed the examples, so if it's your first time seeing examples for a character, it will say "seen 0 times".
In most languages, there are some words that are used much more frequently than others. If you learn those words first, you'll be able to understand more of what you hear and read than if you started with less common words. With Chinese, the same is true of characters: the most common ones are used an outsize proportion of the time, and they are the best ones to start with.
HanziGraph tries to help learners know how important a word is via color-coding in the diagrams and by surfacing raw frequency stats alongside the definitions and examples. This way, learners can concentrate on the words that offer the most "bang for the buck", so to speak.
Both word and character frequency data are based on analysis of millions of lines of subtitles, Wikipedia articles, UN declarations, and website text. Particularly for words, the subtitles are given priority, since they tend to be more colloquial.
The flow diagrams are Sankey diagrams. They were generated by analyzing which words are most commonly used before and after the search term. Specifically, the top collocations of length 2 and 3 are shown. You can read the diagram itself from left to right, with taller bars meaning a word was more commonly used. The analysis was done on movie and TV subtitles, so (in theory) the diagram represents colloquial speech. You can click any of the words to see examples for it, much like the graph diagram.
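As a rough illustration of the collocation analysis described above, the sketch below counts every run of 2 or 3 consecutive words that contains the search term. It is not the site's actual pipeline: the function name `top_collocations` is hypothetical, and it assumes the subtitle lines have already been segmented into words, which is a separate step for Chinese text.

```python
from collections import Counter


def top_collocations(sentences, term, n, limit=5):
    """Return the most common n-word runs that contain `term`.

    `sentences` is a list of already-segmented sentences (lists of words).
    """
    counts = Counter()
    for tokens in sentences:
        # Slide a window of n words across the sentence.
        for i in range(len(tokens) - n + 1):
            window = tuple(tokens[i:i + n])
            if term in window:
                counts[window] += 1
    return counts.most_common(limit)


# Toy data, made up for illustration only.
sentences = [
    ["我", "喜欢", "你"],
    ["我", "喜欢", "吃", "饭"],
    ["他", "喜欢", "你"],
]
pairs = top_collocations(sentences, "喜欢", n=2)
triples = top_collocations(sentences, "喜欢", n=3)
```

In a Sankey diagram built from counts like these, each count would set the height of the corresponding bar, with words before the search term on the left and words after it on the right.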