
DeepSeek technique to improve AI’s ability to ‘read’ long texts questioned by new research

AI models face a critical limitation known as the long-context bottleneck, which restricts their ability to process lengthy documents

The DeepSeek-OCR technique handles large and complex documents by using visual perception as a compression medium. Photo: Reuters
Ben Jiang in Beijing

A group of researchers from China and Japan has challenged a method unveiled several months ago by Chinese artificial intelligence start-up DeepSeek that was designed to improve AI’s ability to handle long blocks of text, marking a rare case of the company’s research being publicly questioned.

The DeepSeek-OCR (optical character recognition) method compresses text by converting it into visual representations, an approach that could potentially revolutionise how AI models handle long texts. But the method was flawed due to inconsistent performance, according to researchers from Japan’s Tohoku University and the Chinese Academy of Sciences.

In their study, titled “Visual Merit or Linguistic Crutch? A Close Look at DeepSeek-OCR”, the research team found that the start-up’s method relied heavily on language priors – the tendency of AI models to draw on patterns learned from large volumes of text – rather than the visual understanding it claimed, making performance metrics reported by the Chinese company “misleading”.


AI models faced a critical limitation known as the long-context bottleneck, which restricted their ability to process lengthy documents or extended conversations, the researchers noted.

Companies and research institutes worldwide have sought improvements in this area, which would lead to a performance leap in AI systems.

A DeepSeek display at an AI fair in Hangzhou, east China’s Zhejiang Province, May 4, 2025. Photo: Xinhua

The DeepSeek-OCR technique, published in October, was said to be able to handle large and complex documents by using visual perception as a compression medium.
