21 April 2017

How an SF startup helped a Japanese publisher overcome its ‘text dilemma’

To explain how the solution came about, Akira Iwase of Shueisha Publishing and Stanley Chien of Kono spoke to FIPP contributor Felix Mago off stage at the recent Digital Innovators’ Summit 2017 in Berlin.

Shueisha Publishing found it nigh on impossible to convert Japanese PDF content into digitally compatible formats, explains Akira Iwase. “Most Japanese magazines’ text is printed vertically. This makes it very difficult to convert PDF text into HTML. This slowed down our digital development. For example, if we wanted to publish in HTML, we had to manually convert text to a standard photo, extract the data from the PDF and manually convert this into HTML. It took a lot of time and was just as expensive.”

A similar problem existed for translating printed PDF text, even though some of Shueisha Publishing’s titles, like the manga series ‘Naruto’ and ‘One Piece’, were sought after in the US and Europe. Likewise, several of Shueisha’s fashion magazines were in demand because Japan’s fashion is considered a market leader in Asia, and they presented a lucrative opportunity for translation and syndication.

This left Iwase, as digital head of publishing, with a major challenge in a market where the population is shrinking. Thankfully, Silicon Valley startup Kono came to his rescue. The company, founded in 2011 by Stanley Chien, began developing automated technology that extracts Japanese text from PDFs and exports it as HTML. Or, in the words of Chien: “The technology we developed … extracts around 90 per cent of Japanese content out of PDF automatically using machine-learning algorithms.

“It can also identify subtitles and learn how to solve more complex language problems. This allows us to extract text (from PDF publications) and divide it into separate articles. After we have extracted it, we can reflow the content, so it’s much easier to read on mobile devices. And it’s automatically ‘html-ed’.”

Once the text has been extracted, automated translation into languages such as English and French becomes easier.

“We can do even more interesting things with the extracted text… such as introducing artificial intelligence for recommendation engines, similar to Netflix, but for magazines. Based on what the user has read previously and their user profile we can feed them with articles they may be interested in. We can offer these recommendations in all Asian languages. So, it not only extracts the content for republishing on mobile devices, we can also provide data and analytics for personalised recommendations.”

In a world where interest in Japan and Asia is growing, this technology creates large opportunities for Asian publishers, says Chien. He references a paid-for fashion magazine app in Apple’s App Store – literally translated as ‘Japanese Magazine’ – which became extremely popular in China but was reportedly a pirated version of a Japanese magazine. According to Chien, it briefly became the best-selling app before it was identified as fake and taken down by Apple.

“This proves that there’s a large demand throughout Asia for Japanese content. So, I think there is a good opportunity for us to export the content. That’s why we’re working with Shueisha and other Japanese publishers to translate some of their content so that more people in Asia can consume their magazines.”

Chien adds that this is a “golden opportunity”, giving them the chance to work with a spectrum of Japanese publishers using Kono’s technology to extract content, to then translate that content into multiple languages. “We at Kono and other publishers across Japan will benefit from it.”
