Publishing in multiple languages and on two separate websites, North Korea’s state media service, the Korean Central News Agency (KCNA), has recently increased its output to as many as 80 articles per day in Korean, Chinese and English.
The site is regularly hacked by groups such as Anonymous, who often slow down, but rarely disable, the Pyongyang-based domain. Hacking, however, can sometimes have a more productive use. A tool coded by New Zealander Frank Feinstein that archives and visualizes state propaganda has been making the otherwise daunting task of reading between the Pyongyangology lines significantly easier.
KCNA Watch, Feinstein’s NK News-hosted system, collates all of the articles, images, and videos that the KCNA publishes in one visualized, searchable feed. In short, the system brings together KCNA materials in English, Korean and Chinese, and allows users to graph trends, track issues and people, and create email alerts based on keywords featured in KCNA reporting.
Those familiar with the nuisances of reading materials on the KCNA’s Pyongyang-based .kp domain will be well aware of some of the more interesting hurdles Feinstein has to overcome to make KCNA Watch function. It’s a website that’s routinely offline, where sharing direct links to articles is almost impossible, and where accessing archived materials requires significant user experience.
By popular request, NK News interviewed Feinstein this week to learn more about the weird and wonderful world of Juche HTML and impossible-to-crack state media databases:
NK News: What does and does not make sense about KCNA’s websites, from a user perspective?
There are two websites that carry KCNA articles: kcna.kp in Pyongyang and kcna.co.jp in Japan. The Japanese site has been operating since late 1996 and carries only text-based articles, while the site in Pyongyang, now two years old, carries articles, video and images.
Both sites are out of step with what you’d expect from a state-run news agency. The sites are also at opposite ends of the technology spectrum. The newest site, in Pyongyang, is also not compliant with the accepted norms of Western agency websites.
Compared to more reputable sites like BBC or CNN, both KCNA sites are light years behind. For instance, on KCNA there is no way to even view an article and its associated images and video in one place, let alone link to it. There are basically no benefits for the user in designing a site this way. User interaction is non-existent, and one-way communication mechanisms such as RSS feeds and email alerts do not exist.
NK News: Why do you think the KCNA sites were made in this way? How do they compare from a user perspective to other websites?
Both sites rank poorly in terms of usability. The Japanese site does so because it is, and looks like, something straight out of the ’90s. It’s the way it is because the site hasn’t been redesigned to reflect changing trends.
The Pyongyang site is much more of an enigma. I would think it’s a fair judgement to conclude that it’s either been designed intentionally to be hard to archive, or it’s a casualty of inexperienced developers. It’s my view that the Pyongyang site is a serious, albeit misguided, attempt by KCNA to establish a legitimate online presence.
I can’t claim expertise in other North Korean web assets, but most are significantly better than KCNA.
NK News: Why do you think KCNA doesn’t allow sharing of articles?
There are two angles from which one can look at this. The first is that the Pyongyang website has overinvested in Web 2.0 technology. This literally is a trade-off between static content you can link to, and dynamically generated content that you can’t. This is illustrated well by a site like Gmail, which uses Web 2.0 for its intended purpose, because you’ll never need to share the link to an email in your Inbox folder with your friends. The Web went through a controversial period of overuse of Web 2.0 technology, and the inability to link to articles could simply be a casualty of this.
The other is basically control. A huge amount of the content that makes its way onto the World Wide Web is archived, forever. Google and Bing don’t even bother to index KCNA articles on the Pyongyang site because of its complexity, overuse of Web 2.0, and hidden forms. This gives KCNA more control than standard websites to retrospectively “manage” articles to fit their current propaganda model.
NK News: Do articles stay on KCNA’s .kp domain forever? If they disappear, what is disappearing and why?
It needs to be stressed that the overwhelming majority, more than 99%, stay online forever. However, I’ve witnessed this “retrospective article management” happen on the odd occasion. Since we are now getting all our content from the site as they release it, KCNA Watch will release a more detailed analysis of this phenomenon in the future. It’s going to need a custom-coded utility, and scanning their past archives for “deletes” involves fetching every old article again for comparison, so it’s not something that really lends itself to live reporting.
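As a rough illustration of the re-fetch-and-compare idea described above (a sketch only, not Feinstein’s actual utility), one could store a fingerprint of each article when it is first archived and later compare it against freshly fetched text. The function names, the article IDs and the use of SHA-256 are all assumptions made for this example; the fetching step itself is left out.

```python
import hashlib
from typing import Dict, List, Optional


def fingerprint(text: str) -> str:
    """Hash whitespace-normalised text so trivial spacing changes are ignored."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def compare_archive(
    archived: Dict[str, str],
    refetched: Dict[str, Optional[str]],
) -> Dict[str, List[str]]:
    """Compare stored fingerprints against re-fetched article text.

    archived:  hypothetical article ID -> fingerprint saved at first fetch
    refetched: hypothetical article ID -> current text, or None if the
               article could no longer be retrieved
    """
    report: Dict[str, List[str]] = {"deleted": [], "altered": []}
    for article_id in sorted(archived):
        current = refetched.get(article_id)
        if current is None:
            # Missing entirely: candidate for a retrospective delete.
            report["deleted"].append(article_id)
        elif fingerprint(current) != archived[article_id]:
            # Still online, but the text no longer matches the archive.
            report["altered"].append(article_id)
    return report
```

Because every archived article has to be fetched again before this comparison can run, a scan of this kind is a batch job rather than something that fits live reporting.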
NK News: What have been the most challenging technical hurdles in creating the KCNA Watch platform?
Monitoring news agencies is normally easy and requires no specialist skills or software. In fact, most agencies encourage it and make it easy to do. But KCNA has been a unique challenge.
The Japanese site is easy to monitor, but it runs a day or two behind the Pyongyang site, which is the only site of interest when looking for new content.
Automatically scraping Web 2.0 content is more challenging than scraping static content, as I have to generate “fake” pages on the fly that mimic what would happen if a human actually clicked on an article in a browser. It’s akin to “tricking” their server into accepting that the requested links are the result of human interaction on their website. They mix up their methodology: most pages are actually requested via hidden forms with “postdata”. It’s like making a credit card purchase, where you don’t want your card number to appear in the URL of the processing form. They also use a little Ajax, but the whole site is set up to essentially “mimic” using Ajax. It’s quite cleverly done.
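To make the “postdata” idea concrete, here is a minimal sketch of how a scraper might build the same kind of request a browser sends when a hidden form is submitted: the parameters travel in the request body rather than in the URL. The endpoint, the field names and the Referer value are hypothetical placeholders, since the interview does not describe the real ones, and the example only constructs the request without sending it.

```python
from typing import Dict
from urllib.parse import urlencode
from urllib.request import Request


def build_form_request(endpoint: str, fields: Dict[str, str], referer: str) -> Request:
    """Build a POST request shaped like a browser's hidden-form submission."""
    # Form fields go in the body, so they never appear in the URL.
    body = urlencode(fields).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        # A Referer header makes the request look like an on-site click.
        "Referer": referer,
    }
    return Request(endpoint, data=body, headers=headers, method="POST")


# Hypothetical values, for illustration only.
request = build_form_request(
    "http://example.invalid/article",
    {"article_id": "123", "lang": "eng"},
    "http://example.invalid/",
)
```

Passing the resulting object to `urllib.request.urlopen` would actually send it; that step is omitted here so the sketch stays free of network access.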
The real challenge however has been monitoring the agency at the same time it was under attack by Anonymous. As KCNA Watch requests web pages far more often than an average user, it would find itself banned, often within minutes. This has forced me to adopt a rolling proxy system, which can handle a rotating roster of high-anonymity proxy servers. It’s become the only way to effectively monitor KCNA in real time. I would guess that I’m the individual KCNA has issued the highest number of IP bans against.
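The rolling proxy idea can be sketched as a simple round-robin roster that skips any proxy already known to be banned. This is an assumed, simplified design for illustration, not a description of the KCNA Watch implementation; the class name and the proxy labels are made up.

```python
from typing import Iterable, List, Set


class ProxyRoster:
    """Round-robin roster of proxies that skips ones marked as banned."""

    def __init__(self, proxies: Iterable[str]) -> None:
        self._proxies: List[str] = list(proxies)
        if not self._proxies:
            raise ValueError("at least one proxy is required")
        self._banned: Set[str] = set()
        self._index = 0

    def mark_banned(self, proxy: str) -> None:
        """Record that the target site has blocked this proxy."""
        self._banned.add(proxy)

    def next_proxy(self) -> str:
        """Return the next usable proxy, cycling through the roster."""
        # Look at each proxy at most once per call.
        for _ in range(len(self._proxies)):
            proxy = self._proxies[self._index]
            self._index = (self._index + 1) % len(self._proxies)
            if proxy not in self._banned:
                return proxy
        raise RuntimeError("every proxy in the roster has been banned")
```

A scraper would call `next_proxy()` before each request and `mark_banned()` whenever a request through that proxy is refused, so that traffic keeps rotating across whichever addresses still work.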
NK News: What trends do you observe in KCNA’s online output?
In the last two years, KCNA has significantly increased its output. This includes the frequency of updates and reporting. They have also started reporting in a much wider range of categories. This is really evident in graphs from 2012, where the number of North Korea-related articles increased in volume but decreased as a percentage of total KCNA output. This is a direct consequence of its shift in focus to the international arena.
It’s my view that KCNA is trying to bring itself more in line with the norms of other international agencies. New phrases and terminology also do “trend”. I haven’t noticed any change in their English translations, though; they’re still quite stilted.
I have also found it personally interesting how KCNA has dealt with the public image of Kim Jong Un. They’re portraying him in a very different way to Kim Jong Il, and have incrementally stepped up reporting of his activities.
NK News: How does KCNA’s Korean language output compare to English?
KCNA outputs roughly twice the amount in Korean compared to English. Nearly all important articles are translated, though.
Translation is sometimes erratic, and Korean articles sometimes contain more information than English ones. Sometimes the differences are subtle; in other cases there are blatant omissions.
Anyone serious about watching North Korea really needs to read the Korean KCNA articles first, if possible. There are also many instances of Korean texts being available before English translations.
NK News: What is the most advanced feature of KCNA Watch?
KCNA Watch has many specialist tools which are valuable to professionals and the general public. However, the most important function of KCNA Watch is to do something that KCNA should have done in the first place: let users view new KCNA articles with their associated images, video and language translations, all on one page.
NK News: What are the plans for KCNA Watch?
The primary function of KCNA Watch is to monitor, index and catalogue KCNA output.
Once it can do this to an acceptable level, I’ll start looking at other North Korean media, such as editorial newspapers. There’s still a way to go with KCNA first, though. I’m currently focusing on images and video. I’m also working on ways to automatically group articles of similar content together.
Click here to try KCNA Watch (non-NK News Pro members restricted to ten queries)
Interview conducted by Chad O’Carroll
taken from: http://www.nknews.org/2013/07/how-to-hack-into-north-korean-state-media/