Dream translation tool
2007-11-30 14:50
Tomas: So wireless book was done with just a wiki and no contributions from the original authors.
Yesterday we were talking about a PledgeBank to fund translation. It's about translating documents rather than websites and UIs, but there's certainly shared infrastructure.
So what would an ideal tool look like for both volunteer and paid participants, and for getting them to cross over?
So let's introduce everyone, so we know what everyone does. Tools or translators?
I'm a developer who is sick of doing tools and now I'm a technical author.
So we'll put together a bunch of ideas, stick them up on a board and arrange them into groups. From my perspective, publishing source documents is a key feature. Also it must be able to import and export source documents.
The pledge bank idea is what got us here. What is a PledgeBank? Instead of a donation, you say you want to pledge $10 to the project, and no one pays unless the pledges together cover the budget. So you don't donate to something that doesn't happen. It's becoming very popular in various organisations, civic organisations in the UK especially. A non-profit in tech in India might want to fund a translation, but they just want the Bengali translation, not a little bit of many languages. So one ideal feature would be the integration of PledgeBank features.
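The pledge rule described above could be sketched roughly as follows. This is only an illustration of the "nobody pays unless the budget is covered" idea, not PledgeBank's actual code or API; the class name `PledgeDrive` and its methods are made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class PledgeDrive:
    """Hypothetical threshold pledge for one translation (e.g. Bengali only)."""
    language: str
    budget: float
    pledges: dict = field(default_factory=dict)  # pledger name -> amount pledged

    def pledge(self, who: str, amount: float) -> None:
        """Record a promise to pay; no money moves at this point."""
        if amount <= 0:
            raise ValueError("pledge must be positive")
        self.pledges[who] = self.pledges.get(who, 0.0) + amount

    def total(self) -> float:
        return sum(self.pledges.values())

    def is_funded(self) -> bool:
        return self.total() >= self.budget

    def amounts_due(self) -> dict:
        """Pledges become payable only once the whole budget is covered."""
        return dict(self.pledges) if self.is_funded() else {}
```

Tying each drive to a single target language is one way to cover the case where a funder wants just the Bengali translation rather than a share of many.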
so i want to show what has been done and what needs to be done.
Anas: Yes, there needs to be an incentive structure.
Fran: There needs to be a "review process" and a reputation system too.
Andy: When I manage a team of 50 people, you have a timeline of production, you break it down into deliverables, and you chunk down the work. Traditionally you ask people how long each chunk will take and build the schedule up from that. But I'd like to see tools that help people do the job of a manager through their community: "progress tracking".
Dan: I want a glossary that fits all the words that appear in the source text
Elim: Version tracking, meta-data mark up - "here I had a problem" - at any level of detail. shared (on line) discussion around that markup?
?: Estimation - objective (80% done) and subjective (this is good stuff)
?: CMS plug ins to enable translation of websites
David: Translation Memory data, but unified, same with glossaries. access to external translation memories. There are lots of memories around, but if there was one central repository that would be useful.
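As a rough illustration of what a lookup against a shared translation memory might look like, here is a minimal sketch using fuzzy string matching from Python's standard `difflib` module. The function name `best_match`, the dictionary layout and the 0.75 threshold are assumptions for the example, not how any of the tools mentioned in this session work.

```python
from difflib import SequenceMatcher


def best_match(segment, memory, threshold=0.75):
    """Return (source, translation, score) for the closest stored segment.

    memory maps previously translated source segments to their translations.
    Returns None when nothing is at least `threshold` similar.
    """
    best = None
    for source, translation in memory.items():
        # ratio() is 1.0 for identical strings and falls towards 0.0 as they differ
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, translation, score)
    return best


memory = {
    "Save the file": "Enregistrer le fichier",
    "Open the file": "Ouvrir le fichier",
}
match = best_match("Save this file", memory)
```

A central repository would amount to many projects reading from and writing to the same `memory`, which is where a unified format matters.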
?: work flow management - translating/editing/proofreading.
?: reputation system
?: bug tracking
?: segmentation of larger texts into translatable chunks
?: prioritising the segments
?: accounting - knowing who did what
?: PHP library functionality
Taco: if you have some version management system, and you change an original document, you have to have change tracking across translations.
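One possible way to track changes across translations, sketched under the assumption that documents are already split into segments with stable IDs: store a fingerprint of the source text each translation was made from, and flag any translation whose source has since changed. The names `fingerprint` and `stale_translations` and the data layout are hypothetical.

```python
import hashlib


def fingerprint(text):
    """Stable hash of a source segment's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_translations(source_segments, translations):
    """List the (segment_id, lang) pairs whose source changed after translation.

    source_segments: {segment_id: current source text}
    translations: {(segment_id, lang): (translated text, fingerprint of the
                   source text it was translated from)}
    """
    stale = []
    for (seg_id, lang), (_text, source_fp) in translations.items():
        current = source_segments.get(seg_id)
        # A deleted or edited source segment means the translation needs review
        if current is None or fingerprint(current) != source_fp:
            stale.append((seg_id, lang))
    return sorted(stale)
```

The same check would work on top of a version management system such as SVN by comparing revisions instead of hashes; the hash is used here only to keep the sketch self-contained.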
Dan: When I translate into Serbo-Croatian, I'd like to be aware of what the other Balkan language translations have been, so I can see the work other people have done in very similar languages to speed me up. So a multilingual comparison feature.
Elim: It's good to see the original source language if you are doing second-order translation, as in, you have an English original text, a French source text, and a Spanish destination text. "Alternate source text".
Tomas: keeping formatting across translations sounds very difficult
Fran: well we have stuff to do this at the moment
?: Translation of SVG graphics
Elim: License tracking is important too
?: You could have a non-free license and then put people on spam lists if they chose it ;)
Fran: Unicode support is key, and conversion to Unicode is important for dealing with existing translations.
Taco: So this shouldn't be one big tool, it should be a suite of tools that all do one thing and one thing well
Dwayne: Syntax highlighting is a dream feature, that highlights
Andy: Engineering programming tools are so good at picking out the structure of things, and maybe that could work well.
Tomas: Gobby is a real-time collaborative text editor that colour-codes users' edits. So different users could be marked with colour to help show where there might be inconsistencies of style.
?: RSS notification is key
?: Offline use is key
?: Export to PDF for printing is good
So, what are the roles?
- Original authors
- Translators
- Editors
- Proofreaders/reviewers
- Project managers
- Advocates and salesmen
- End Users
Now let's map these roles to each function we've described.
- Import/Export Source Documents: ALL
- Progress Tracking: PCT
- Extract Glossary from any text: T
- Version Tracking: ALL
- Translation Markup/Commenting: TER
- Estimation (Objective + Subjective): FPET
- Shared (on line) discussion: TER
- Memory Bank: TE
- Link into MT + Translation Memory: TE
- PledgeBank/donations: AFP
- Work flow Management (TEP work flow): ALL
- Change Tracking: TERP
- Multilingual Comparison: T
- Alternate Source Text: T
- Publishing Static Versions: P
- Role-based Accounts: PAF
- Prioritising: PE
- Bug tracking (users can report errors in translation that the proofreader missed): C
- Reputation systems (users can rate quality and thus reputation of translators): REC
- RSS feeds: ALL
Maintaining layout across translations, and doing translation of SVGs, are really other programs. Everything should be Unicode aware, offer RSS feeds to track changes, support offline use and export to PDF, and be easy to install and maintain. CMS plug-in (PHP/Ruby/Python/etc. library). Licence tracking.
So what tools have these?
- Import/Export Source Documents: Translate Toolkit
- Progress Tracking: Pootle, DamnLies, Launchpad
- Extract Glossary from any text: Lingo
- Version Tracking: SVN
- Translation Markup/Commenting: GPLv3 Markup Tool
- Estimation (Objective + Subjective): POCount
- Shared (on line) discussion: IRC, mailing lists, etc.
- Memory Bank: T.T.
- Link into MT + Translation Memory: Apertium
- PledgeBank/donations: PledgeBank
- Workflow Management (TEP workflow):
- Change Tracking:
- Multilingual Comparison:
- Alternate Source Text: Concordancer
- Publishing Static Versions: Adam's
- Maintain Layout across translations: Apertium
- SVG translations: See Andy Fitzsimon's work in Red Hat, presented at LGM07
- Role-based Accounts: Most CMS have this
- Money Accounting:
- Segmentation:
- Prioritising:
- User problem reporting: Trac, Bugzilla, Jitterbug
- Translator Reputation system: Advogato, a Brazilian one
- RSS feeds: Most CMS do this
15:45
