DeepSeek technique to improve AI’s ability to ‘read’ long texts questioned by new research

AI models face a critical limitation known as the long-context bottleneck, which restricts their ability to process lengthy documents


The DeepSeek-OCR technique handles large and complex documents by using visual perception as a compression medium. Photo: Reuters

Published: 11:42am, 20 Jan 2026

A group of researchers from China and Japan has challenged a method unveiled several months ago by Chinese artificial intelligence start-up DeepSeek that was designed to improve AI’s ability to handle long blocks of text, marking a rare case of the company’s research being publicly questioned.

The DeepSeek-OCR (optical character recognition) method compresses text by using visual representations, an approach that could potentially revolutionise how AI models handle long texts. But the method was flawed due to inconsistent performance, according to researchers from Japan’s Tohoku University and the Chinese Academy of Sciences.

In their study, titled “Visual Merit or Linguistic Crutch? A Close Look at DeepSeek-OCR”, the research team found that the start-up’s method relied heavily on language priors – the tendency of AI models to draw on patterns learned from large volumes of text – rather than the visual understanding it claimed, making performance metrics reported by the Chinese company “misleading”.

AI models faced a critical limitation known as the long-context bottleneck, which restricted their ability to process lengthy documents or extended conversations, the researchers noted.

Improvements in this area, which would lead to a performance leap in an AI system, have been sought by companies and research institutes worldwide.

A DeepSeek display at an AI fair in Hangzhou, east China’s Zhejiang Province, May 4, 2025. Photo: Xinhua

The DeepSeek-OCR technique, published in October, was said to be able to handle large and complex documents by using visual perception as a compression medium.
