Chinese discourse has attracted considerable attention from communication and discourse researchers over the last decade. Conspicuously absent from the existing discourse-analytic literature, however, is the systematic investigation of multimodal semiotic resources. As multimodality has become the norm in most, if not all, spheres of communication, a large part of meaning is carried by semiotic resources other than language, such as visual images, facial expressions, gestures, and intonation. Consequently, an exclusive focus on language may miss essential aspects of meaning, and subsequent interpretations of communicative effects, cultural values, and ideology may be partial or incorrect. As Kress (2003: 11) points out, ‘we cannot now hope to understand written texts by looking at the resources of writing alone. They must be looked at in the context of the choice of modes made, the modes which appear with writing, and even the context of which modes were not chosen’. Similarly, Jewitt (2009: 3) observes that ‘speech and writing no longer appear adequate in understanding representation and communication in a variety of fields’. It can therefore be argued that studying multimodal discourse is a logical and necessary next step for Chinese discourse research.