To explain how the solution came about, Akira Iwase of Shueisha Publishing and Stanley Chien of Kono spoke to FIPP contributor Felix Mago off stage at the recent Digital Innovators’ Summit 2017 in Berlin.
Shueisha Publishing found it nigh on impossible to convert Japanese PDF content into digital-ready formats, explains Akira Iwase. “Most Japanese magazines’ text is printed vertically. This makes it very difficult to convert PDF text into HTML. This slowed down our digital development. For example, if we wanted to publish in HTML, we had to manually convert text to a standard photo, extract the data from the PDF and manually convert this into HTML. It took a lot of time and was just as expensive.”
A similar problem existed for translating printed PDF text, even though some of Shueisha Publishing’s magazines, like the manga comics ‘Naruto’ and ‘One Piece’, were sought after in the US and Europe. Likewise, several of Shueisha’s fashion magazines were in demand because Japanese fashion is considered a market leader in Asia, which presented a lucrative opportunity for translation and syndication.
This left Iwase, as digital head of publishing, with a major challenge in a market where the population is shrinking. Thankfully, Silicon Valley startup Kono came to his rescue. The company, founded in 2011 by Stanley Chien, began developing automated technology that extracts Japanese text from PDFs for export as HTML. Or in the words of Chien: “The technology we developed … extracts around 90 per cent of Japanese content out of PDF automatically using machine-learning algorithms.
“It can also identify subtitles and learn how to solve more complex language problems. This allows us to extract text (from PDF publications) and divide it into separate articles. After we have extracted it, we can reflow the content, so it’s much easier to read on mobile devices. And it’s automatically ‘html-ed’.”
Once the text has been extracted, automated translation into languages such as English and French becomes easier.
“We can do even more interesting things with the extracted text… such as introducing artificial intelligence for recommendation engines, similar to Netflix, but for magazines. Based on what the user has read previously and their user profile we can feed them with articles they may be interested in. We can offer these recommendations in all Asian languages. So, it not only extracts the content for republishing on mobile devices, we can also provide data and analytics for personalised recommendations.”
In a world where interest in Japan and Asia is growing, this technology creates large opportunities for Asian publishers, says Chien. He references a paid-for fashion magazine app in Apple’s App Store (literally translated as ‘Japanese Magazine’) which became extremely popular in China but was reportedly a pirated version of a Japanese magazine. According to Chien, it briefly became the best-selling app before it was identified as fake and taken down by Apple.
“This proves that there’s a large demand throughout Asia for Japanese content. So, I think there is a good opportunity for us to export the content. That’s why we’re working with Shueisha and other Japanese publishers to translate some of their content so that more people in Asia can consume their magazines.”
Chien adds that this is a “golden opportunity” to work with a spectrum of Japanese publishers, using Kono’s technology to extract content and then translate it into multiple languages. “We at Kono and other publishers across Japan will benefit from it.”