Report

Understanding Chinese Tech in AI Research

Dewey Insights - China Series

Authors

Kasey Luo, Research Intern, Plaintext Group
Frank Long, Research Lead, Plaintext Group

Individual Endorsers

All individual endorsers participated in their personal capacity. This report was prepared independently from any political or governmental entity. While the report generally reflects the observations, insights and recommendations of the endorsers, it is not the case that every endorser will agree with everything expressed herein.

Executive Summary

Chinese tech giants do not currently lead in global artificial intelligence (AI) research:
Chinese big tech companies (Baidu, Alibaba, and Tencent, collectively BAT) trail Microsoft and Google in top AI research paper count. However, with increased R&D spending, BAT could catch up in the near future.
But Chinese AI technology startups are noteworthy:
Smaller Chinese AI unicorns (SenseTime and Megvii) produce disproportionately large amounts of research, especially in computer vision, when scaled by estimated R&D expenditure.
China will try to close the R&D gap with the United States:
An orchestrated effort from China, spanning both tech giants and unicorns, to close the AI R&D gap should be expected, as there is evidence that China cares about publishing in top AI conferences at the government, company, and individual levels.

Background

What’s going on? 

  • Artificial intelligence (AI) is a top priority for China: In 2017, China announced plans to be a world leader in AI by 2030. Since then, China has been rapidly accelerating towards this goal, dedicating significant financial and infrastructure support towards promoting AI.
  • China is playing catch-up in AI research and innovation: While China has the advantage in applying and executing AI technologies, the country is still behind the US in shaping core AI technology (Source).
  • There is interest in Chinese tech’s role in AI research, as abundant AI research is a key driver for sustained leadership in AI: The authors perform a data-driven analysis of Chinese commercial AI companies and their role in driving AI R&D.
  • Paper count from top international AI conferences is used as an indicator for novel AI research development in this paper.
Top international AI conferences: AAAI, CVPR, ICLR, ICML, and NeurIPS were selected as a quality filter.
Novel: cutting-edge approaches in AI algorithms, applications, or systems (vs. applying existing technologies)   

What does China’s current commercial AI landscape look like? 

China’s commercial AI landscape can be segmented into 4 layers:

  1. Foundational: integrated circuits, sensors, and middleware; cloud computing and data platforms (e.g., Cambricon, RoboSense)
  2. Technology: foundational algorithms in computer vision, intelligent speech, natural language processing, and other core AI technologies (e.g., SenseTime, Megvii, iFlyTek)
  3. Application: applied AI for specific industries such as healthcare, drones, autonomous driving, education, finance, and intelligent robots (e.g., DJI, Meitu). This layer currently makes up the largest proportion of China’s commercial AI landscape, in terms of both the size and number of companies, indicating a top-heavy ecosystem (Source).
  4. Vertical Integrators: companies that work across all three layers (e.g., BAT = Baidu, Alibaba, Tencent) 
These enterprises are small in quantity, but huge in size and influence over the AI ecosystem. They are referred to as “tech giants” in the rest of the paper.

Insight #1: Chinese tech giants are not global leaders in novel AI research.

Google and Microsoft are the clear global leaders in AI research output, with BAT’s contribution to novel AI research being less than one-third that of either Google or Microsoft. Google and Microsoft contribute to well over 750 papers, while the combined BAT output is significantly lower.

Why is there a publishing gap?

On the surface, it appears that Chinese tech giants have a key ingredient to AI innovation: easy access to huge swaths of user data. Having larger amounts of data leads to more robust algorithms and more applicable outcomes. Chinese citizens shop, pay, communicate, and play largely through mobile applications created by Chinese tech giants, leaving behind a massive data footprint for companies to develop novel AI models and techniques on. Given this key advantage, more comparable R&D outputs between American and Chinese tech giants would be expected. Below are possible hypotheses that could explain the current R&D gap:

Hypothesis #1: Chinese big tech’s core capabilities and incentives are in application, not research. [likely]

  • Capabilities: access to a huge consumer market incentivizes companies like BAT to prioritize rapid adoption of AI technology rather than fundamental development of the technology itself. 
BAT dominates user engagement on mobile platforms: nearly 60% of the time users spend on mobile devices in China is on a BAT app (Source). Applied AI can significantly enhance user engagement and user experience, and BAT’s huge consumer market serves as a testbed for these technologies.
Less R&D spend: Tencent’s and Alibaba’s R&D expenses in 2019 made up <5% of their revenue. In contrast, American tech giants such as Google, Microsoft, and Facebook all had R&D expenses that were >10% of their annual revenue.
  • Incentives: government ties and intense competition incentivize an application-focused AI strategy.
Government-driven programs: The Chinese Science Ministry announced in 2018 that the nation’s first wave of open AI platforms will rely heavily on Baidu for autonomous driving, Tencent for AI in healthcare, and Alibaba for smart cities (Source). All of these programs necessitate applied AI.
Strategic investments: The BAT companies compete with one another across a wide range of applied-AI industries, including transportation, education, autonomous driving, and finance. Together they invest in 53% of China’s 190 major AI companies in these verticals (Source). These investments further promote an application-focused approach to AI.

Hypothesis #2: Chinese tech giants are conducting classified research. [possible]

  • Public-private partnerships: the appointment of Baidu, Alibaba, Tencent, and iFlyTek to China’s national AI team (Source) demonstrates high levels of collaboration between the public and private sectors, which could be conducive to classified development.
  • Government research centers: China’s National Innovation Institute of Defense Technology (NIIDT) has created two rapidly growing research organizations focusing on military AI: the Unmanned Systems Research Center (USRC) and the Artificial Intelligence Research Center (AIRC). If classified research is happening, the authors believe it would be happening at these research centers (Source).

Hypothesis #3: China does not care about demonstrating AI leadership through publishing in top AI conferences. [unlikely]

  • Chinese government cares: China’s 2017 national AI Development Plan explicitly references the number of published papers and patents as a metric for progress and claims that China is second in this regard.
  • Chinese AI companies care about presence at AI conferences: Chinese unicorns SenseTime and Megvii were Diamond (highest tier) sponsors of CVPR in 2019 alongside Microsoft (Source).
  • Publishing is a motivator at an individual career level: Personal websites of Chinese research scientists tout publications in top AI conferences as an accomplishment (Example).

Insight #2: Chinese AI technology unicorns are notable when research output is scaled by R&D expenditure.

As seen from the graph above, the proportion of SenseTime’s and Megvii’s R&D expenditure that leads to published AI research far exceeds that of BAT and even American tech giants. This finding remains consistent when research output is scaled by company revenue and size as well.

This characteristic can be defined as high AI research concentration.
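
As a rough illustration of how such a concentration metric could be computed, the sketch below divides paper count by estimated R&D spend and ranks companies by the result. The company names, paper counts, spend figures, and the "papers per USD 100 million" scaling are all hypothetical placeholders chosen for illustration; they are not the report's data or the authors' exact method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Company:
    name: str
    papers: int            # papers accepted at the selected top AI conferences
    rd_spend_usd_m: float  # estimated R&D expenditure, in USD millions


def research_concentration(company: Company) -> float:
    """Papers per USD 100 million of estimated R&D spend (assumed scaling)."""
    if company.rd_spend_usd_m <= 0:
        raise ValueError("R&D spend must be positive")
    return company.papers / company.rd_spend_usd_m * 100


# Hypothetical placeholder figures, NOT taken from the report.
companies = [
    Company("GiantA", papers=800, rd_spend_usd_m=20_000.0),
    Company("StartupB", papers=120, rd_spend_usd_m=300.0),
]

# Rank by concentration rather than by raw paper count.
ranked = sorted(companies, key=research_concentration, reverse=True)

for c in ranked:
    print(f"{c.name}: {research_concentration(c):.1f} papers per USD 100M")
```

With these placeholder inputs, the smaller company ranks first even though its raw paper count is far lower, which is the pattern the metric is meant to surface. The same function could be reused with revenue or headcount as the denominator.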

Why are they notable?

Notably, there are no apparent US equivalents that share all of the characteristics below.

SenseTime and Megvii are both AI technology startups 

  • Capabilities: Both SenseTime and Megvii are pure-play AI companies: their capabilities are in core computer vision technologies (face, image, object and text recognition; image and video analysis; remote sensing). They’re likely to continue generating novel AI research because advancing core AI technologies advances the underpinnings of all their products.
  • Lean Size: While exact headcount data is difficult to obtain, both SenseTime and Megvii are estimated to have between 1,000 and 3,000 employees (SenseTime, Megvii). Compared with the 50,000+ employees at BAT and US tech giants, these unicorns’ lean sizes make their high AI research concentration more formidable.

Reflection of China’s Broader Focus in Computer Vision: 

  • Out of the 5 top AI conferences analyzed, the majority of SenseTime and Megvii’s published research is in CVPR (Computer Vision and Pattern Recognition), a top international AI conference on computer vision (CV). 
  • Their disproportionately large contribution to CV speaks to China’s larger interest and focus in CV: a recent study by CSET found that China has recently surpassed the United States in total number of computer vision patent applications and granted patents (Source).

High Levels of Institutional Collaboration: 

Research Collaboration:

  • According to our analysis, SenseTime collaborated on 119 top AI papers between 2013 and 2019 (generating only 10 papers independently). Many of these collaborative papers are published with SenseTime’s joint research laboratories with academic institutions (e.g., Shenzhen Institutes of Advanced Technology-SenseTime Joint Lab, Chinese University of Hong Kong-SenseTime Joint Lab).
  • This is comparable to the 166 papers Tencent collaborated on during that same time period, even though Tencent is more than 10x SenseTime’s size.

Government Support:

  • Megvii’s 2019 IPO prospectus reveals that the Chinese government contributed substantial financial support to the company. For example, in 2017, Megvii received RMB 63.3 million in government grants; for context, this was 20% of Megvii’s revenue that year.

Abundant Cross-Vertical Applications 

SenseTime and Megvii supply their core AI technologies to a huge range of verticals. They are thus incentivized to develop highly generalizable technology. Industries they’re involved in:

  • Surveillance: help police catch criminals with facial recognition tech 
  • Social media: SenseTime-Meitu collaboration to allow users to adjust appearance (e.g., face slimming)
  • Health: Megvii’s intelligent temperature measurement systems have been adopted across China to help curb the spread of coronavirus and have been deployed in Japan and the Middle East this year (Source)
  • Self-Driving Cars: SenseTime opened a research facility in Japan

What’s next for Chinese commercial AI R&D? 

Chinese tech giants are likely to close the publishing gap with increased R&D expenditure.

  • While the current capabilities and incentives of Chinese tech giants (BAT) position them to focus on AI applications, increased R&D efforts are likely as the nation mobilizes a unified effort to increase R&D in AI.
  • When scaled by R&D expenditure, BAT’s AI research output is comparable to American big tech. With increased R&D resourcing, it’s reasonable to assume that BAT could catch up.
  • Our analysis found that BAT more than tripled the number of AI papers accepted to top AI conferences over the past 5 years. This growth will continue as China pours more resources into AI development over the next 10 years.

More Chinese AI Unicorns 

Chinese AI technology startups (SenseTime and Megvii) will continue to produce cutting-edge AI research. It’s also likely that there’ll be more SenseTime and Megvii equivalents in the future due to:

  • Continued government support: China’s AI Development Plan outlines clear ways the government will incentivize these types of organizations. They plan to “put in place financial and tax preferential policies for small and medium sized and start-up enterprises” and implement “weighted deduction for R&D” entities (Source).
  • Increasing global investment in Chinese AI startups: In 2017, China’s AI startup scene received 48% of funding going to AI startups globally, surpassing the equity funding share of US AI startups (38% of the global share) (Source).

Why does this matter?

  • Contrary to popular belief, BAT is not as strong in AI research as expected. Pure-play AI startups such as SenseTime and Megvii are notable.
  • When evaluating the formidability of Chinese AI companies in research, AI research concentration (paper count normalized by R&D expenses, revenue, or company size) should be used as well as raw paper count to identify smaller, prolific contributors.
  • China’s increased investment in AI R&D could be a double-edged sword:
Could be good for global AI innovation since research published in top AI conferences is available for all to peruse. 
Could set China further apart as a formidable leader (particularly in computer vision), especially if academic decoupling occurs. 

Acknowledgements 

Thank you to Kyle McEneaney, Chris Kirchhoff, Zoe Weinberg, Jasmine Sun, Kumar Garg, Nick Rose, Sam Ching, Jordan Blashek, and Cassie Crockett for your insights and input.