Standards


PO format

  • PO format is a de facto standard.
  • It is a very simple bilingual format: each entry pairs a source string with its target string, which makes it easy to localise.
  • It is designed for localising software.
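As an illustration of the source/target pairing described above, here is a minimal Python sketch that reads single-line `msgid`/`msgstr` entries. The sample strings and the `parse_po` helper are made up for this example; real PO files also have headers, multi-line strings and plural forms, which this sketch ignores.

```python
import re

# Hypothetical sample: two entries, the second one not yet translated.
SAMPLE_PO = '''\
#: ui/menu.c:12
msgid "Open file"
msgstr "Ouvrir le fichier"

#: ui/menu.c:18
msgid "Save"
msgstr ""
'''

def parse_po(text):
    """Return (source, target) pairs from single-line PO entries."""
    pairs = []
    source = None
    for line in text.splitlines():
        m = re.match(r'^(msgid|msgstr)\s+"(.*)"$', line.strip())
        if not m:
            continue  # comments and blank lines are skipped
        keyword, value = m.groups()
        if keyword == "msgid":
            source = value
        elif source is not None:
            pairs.append((source, value))
            source = None
    return pairs

entries = parse_po(SAMPLE_PO)
# Entries with an empty target still need translating.
untranslated = [src for src, tgt in entries if not tgt]
```

Because every entry carries both strings, finding what is left to translate is a simple filter over the pairs.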


XLIFF

  • XLIFF is emerging as an industry standard for localisation that tries to address the problems of proprietary software.
  • It has exciting potential and can also be used for translation memory.
  • It is designed to translate anything.
  • It is much better at tracking workflow information.
  • It can work with Drupal: for example, you export the content to XLIFF, edit it in an XLIFF editor such as 'Pootling' or any other XLIFF tool, then merge it back once the translation is complete.
  • A single XLIFF document can contain multiple files that you are translating, and it lets you include comments and annotations with your translations.
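To make the points above about multiple files, workflow state and annotations concrete, here is a small Python sketch that reads an XLIFF 1.2 document with the standard library. The sample content and the `read_units` helper are invented for illustration; it assumes XLIFF 1.2 and does not handle inline markup inside segments.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample: one XLIFF document carrying two source files.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="about.html" source-language="en" target-language="fr" datatype="html">
    <body>
      <trans-unit id="1">
        <source>About us</source>
        <target state="translated">À propos</target>
        <note from="translator">Short form used in the menu.</note>
      </trans-unit>
    </body>
  </file>
  <file original="contact.html" source-language="en" target-language="fr" datatype="html">
    <body>
      <trans-unit id="1">
        <source>Contact</source>
        <target state="needs-translation"></target>
      </trans-unit>
    </body>
  </file>
</xliff>
"""

def read_units(xliff_text):
    """Flatten every trans-unit into a dict, keeping file, state and notes."""
    root = ET.fromstring(xliff_text)
    rows = []
    for file_el in root.findall("x:file", NS):
        for unit in file_el.iterfind(".//x:trans-unit", NS):
            target = unit.find("x:target", NS)
            rows.append({
                "file": file_el.get("original"),
                "id": unit.get("id"),
                "source": unit.findtext("x:source", namespaces=NS),
                "target": (target.text or "") if target is not None else "",
                "state": target.get("state") if target is not None else None,
                "notes": [n.text for n in unit.findall("x:note", NS)],
            })
    return rows

rows = read_units(SAMPLE_XLIFF)
```

The `state` attribute on each target is the kind of workflow information that PO has no direct equivalent for.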

There are also some standards for converting from various formats into XLIFF.

A new workflow standard is emerging, based on web services (check with Dwayne what this is called).

XLIFF and segmentation

SRX (Segmentation Rules eXchange) – developed by LISA (Localization Industry Standards Association). It doesn't do the segmentation itself, but it lets you define the segmentation rules and share them, so that everyone knows how the text was broken into segments.

GMX-V – also developed by LISA. It only covers volumetric standards for now, but other areas such as quality will be covered in the future.
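The idea behind shareable segmentation rules can be sketched in Python as follows. This is not an SRX parser: the `RULES` list and the `segment` function are hypothetical, and they only mimic the general shape of SRX rules (a break or no-break flag plus a pattern before and after the candidate position, with the first matching rule winning).

```python
import re

# Each rule: (is_break, pattern before the position, pattern after it).
# Rules are tried in order; the first one that matches decides.
RULES = [
    (False, r"\b(?:Dr|Mr|Mrs)\.", r"\s"),  # no break after these abbreviations
    (True, r"[.?!]", r"\s+"),              # break after sentence-final punctuation
]

def segment(text, rules=RULES):
    """Split text into segments according to the ordered rule list."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in rules:
            if re.search(f"(?:{before})$", text[:pos]) and re.match(after, text[pos:]):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule wins, break or not
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments

parts = segment("Dr. Brown spoke first. Anas replied.")
```

Two people who share the same rule list will get the same segments from the same text, which is the point of exchanging the rules rather than the segmented output.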

How the localisation industry counts content:

  • 'Normal' words
  • Markup, e.g. HTML tags
  • Non-translatable things (brand names, variables)
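A rough Python sketch of counting those three categories separately is shown below. It is only an illustration and not the counting method of any standard: the regular expressions, the placeholder syntax (`%name` and `{name}`) and the `count_content` helper are all assumptions made for this example.

```python
import re

TAG = re.compile(r"<[^>]+>")               # anything that looks like an HTML tag
PLACEHOLDER = re.compile(r"%\w+|\{\w+\}")  # assumed variable syntax

def count_content(text, brand_names=()):
    """Count normal words, markup tags and non-translatable items."""
    tags = TAG.findall(text)
    plain = TAG.sub(" ", text)
    placeholders = PLACEHOLDER.findall(plain)
    plain = PLACEHOLDER.sub(" ", plain)
    words = re.findall(r"\w+", plain)
    brands = [w for w in words if w in brand_names]
    return {
        "words": len(words) - len(brands),
        "markup": len(tags),
        "non_translatable": len(placeholders) + len(brands),
    }

counts = count_content(
    "<p>Welcome to <b>Pootle</b>, {username}!</p>",
    brand_names={"Pootle"},
)
# counts == {"words": 2, "markup": 4, "non_translatable": 2}
```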


TMX

Translation Memory eXchange. Allows you to exchange translation memory. This would enable Dwayne to hand Anas a TMX file containing previous translations. It has two levels:

1. A level which says 'this is text and this is its translation'.

2. A level which also looks at the markup and refers to it, which can be very useful.
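The first level ('this is text and this is its translation') can be read with a few lines of Python. The sample below is a made-up TMX 1.4 fragment and `load_tmx` is a hypothetical helper; it handles plain-text segments only and ignores the inline markup elements that the second level adds.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save file</seg></tuv>
      <tuv xml:lang="fr"><seg>Enregistrer le fichier</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Open file</seg></tuv>
      <tuv xml:lang="fr"><seg>Ouvrir le fichier</seg></tuv>
    </tu>
  </body>
</tmx>
"""

def load_tmx(tmx_text):
    """Return one {language: text} dict per translation unit."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        units.append({tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.findall("tuv")})
    return units

units = load_tmx(SAMPLE_TMX)
```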

TBX

'TermBase eXchange' – allows you to share your terminology. It is a standard for glossaries.


Difference between TBX and translation memory

TBX is a glossary of terminology – translation memory is a set of the previous translations.

Translation memory allows you to 'leverage' or reuse your old translations.
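A minimal Python sketch of this kind of reuse is given below, using fuzzy string matching from the standard library. The `best_match` helper, the sample memory and the 0.75 threshold are assumptions for illustration; real translation memory tools use their own matching and scoring.

```python
from difflib import SequenceMatcher

def best_match(source, memory, threshold=0.75):
    """Return (score, old_source, old_target) for the closest stored source, or None."""
    best = None
    for old_source, old_target in memory.items():
        score = SequenceMatcher(None, source.lower(), old_source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, old_source, old_target)
    return best

# Hypothetical memory of previous translations (source -> target).
memory = {
    "Save the file": "Enregistrer le fichier",
    "Open the file": "Ouvrir le fichier",
}

exact = best_match("Save the file", memory)   # identical source, score 1.0
fuzzy = best_match("Save this file", memory)  # close match, offered for review
nothing = best_match("Delete account", memory)
```

A glossary lookup, by contrast, would match individual terms rather than whole previously translated segments.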


Questions and answers

Can you use these standards in dynamic online systems as opposed to working with a CMS? Dwayne – They have built a web environment, Pootle, to allow people to translate PO or XLIFF files. You would link this system with a CMS so that each can do its own job well.

Brian – the most successful standards are de facto standards which solve a specific problem; such as SOAP and RSS. Standards which come out of W3C tend to be overengineered and overspecced.

Anas – Is there any research on what the most effective segmentation and translation memory are? Dr B – A good translator can translate from scratch in the time it takes to review and fix a machine translation. Using their own translation memory would make them faster, though.

Anas - Shared translation memory would be really useful. Dr B – try and use the really good translators' translation memory as an influence.

MM – Are there any open translation memory repositories? Dr B – Someone has done this: they took a load of translations and turned them into an open access repository. It raises lots of legal issues about reusing content, but it is a great idea.

Insights from the 'big league': Dr B – The big players like IBM, Novell and Oracle have built their own systems. Brian – Apple has built its own systems.

Best practice and wish list: Worldwidelexicon has the best interface. That tool, tied in with servers that do the file interchange, would be fantastic. It would allow a combination of ad hoc translation and more formal tool-based translation, plus the ability to quickly fix problems.

CMS issues

DR – CMS. You are very tied to the way the people who designed the CMS set up the workflow within it. It's better to use two different systems: let the CMS manage and negotiate the content, and let the translation management system handle the translation.

BM – The best way to integrate your translation management system with your CMS is through RSS. In RSS, each item has an element called GUID. They put the version number and the locale into which the content was translated into it, and the CMS can deal with this.
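One way this could look is sketched below in Python. The notes do not say how the version and locale are encoded, so the `content_id:vN:locale` layout and the `make_item` and `parse_guid` helpers are purely assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def make_item(content_id, version, locale, title):
    """Build an RSS <item> whose guid carries content id, version and locale."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    # isPermaLink="false" marks the guid as an opaque identifier, not a URL.
    guid = ET.SubElement(item, "guid", isPermaLink="false")
    guid.text = f"{content_id}:v{version}:{locale}"  # assumed layout
    return item

def parse_guid(item):
    """Recover (content_id, version, locale) on the CMS side."""
    content_id, version, locale = item.findtext("guid").rsplit(":", 2)
    return content_id, int(version[1:]), locale

item = make_item("node-42", 3, "fr", "About us")
xml_text = ET.tostring(item, encoding="unicode")
parsed = parse_guid(item)
```

With the version in the identifier, the CMS can tell whether an incoming translation matches the current revision of the source content or an older one.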

Plone can work multilingually out of the box. But when CMSs are advertised as being multilingual, that often just means that they can run in different languages, rather than that they support a single site in multiple languages. E.g. when they say they are multilingual they often mean that the interface can be translated, or that the site can manage a language like Hindi, but they are never clear on whether a site can easily display and manage content that is in multiple languages at the same time.
