A Short and Sweet History of Translation at Hootsuite
Hootsuite has been a global company since 2012. As soon as we became global, localizing our products to offer more than English became a priority. We needed our dashboard to support many languages at once, efficiently. How do we keep the Hootsuite dashboard useful and personalized for all of our users across the world? Hootsuite has offices all over the world, from Bucharest to Vancouver, but our users’ global presence far exceeds our office representation. Hootsuite users come from every corner of the globe and speak dozens of different languages. It’s difficult to keep up with translating our constantly evolving product. Currently, our dashboard is available in over 15 languages, with another 15 on the way! That’s a lot of languages for our developers to manage.
When we first realized that our products needed translating yesterday, there were some ill-fated false starts. One particularly unfortunate event involved a developer translating our iOS app into German using Google Translate. While the intent was genuine, our German customers let us know (strongly) that Google Translate is not yet advanced enough to be a substitute for human translation. After those first attempts to self-solve the problem, we did what Hootsuite does best: we built a better way.
The “Old” New System
Our first attempt at automating translations seemed like a great option. We integrated Hootsuite with Pootle and a professional, third-party translation provider. Pootle is an open-source, online localization server that allows community translators to contribute to an ongoing translation project. The biggest benefit of both of these puzzle pieces? All our translations were theoretically completed by humans – goodbye Google Translate! We could submit strings for professional translation, which were then automatically merged by a pair of scripts that ran nightly. So far, things were looking up. It was certainly a vast improvement over no system at all.
How exactly did our automation run? Our translation is based on .po files, a commonly used, industry-standard translation file format. They store strings as key-value pairs, with the English root string as the key and its translation as the value.
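As a minimal illustration, a .po entry pairs a `msgid` (the English root string) with a `msgstr` (its translation). The Python sketch below parses a tiny, made-up German fragment into a dictionary. The sample strings and the `parse_po` helper are hypothetical, and the parser deliberately ignores the multi-line strings, plurals, and contexts that real .po files can contain.

```python
import re

# Hypothetical sample; not taken from Hootsuite's actual translation files.
SAMPLE_PO = '''
# A comment line
msgid "Send message"
msgstr "Nachricht senden"

msgid "Schedule"
msgstr ""
'''

# Matches simple single-line entries such as: msgid "Some text"
ENTRY = re.compile(r'^(msgid|msgstr)\s+"(.*)"$')


def parse_po(text):
    """Return {msgid: msgstr} for simple single-line entries."""
    entries = {}
    current_id = None
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip comments and blank lines
        keyword, value = match.groups()
        if keyword == "msgid":
            current_id = value
        elif current_id is not None:
            entries[current_id] = value
            current_id = None
    return entries


translations = parse_po(SAMPLE_PO)
```

Here `"Schedule"` maps to an empty string, which is how an untranslated entry appears in this simplified model.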
Using these .po files as a medium, we created two nightly scripts to update our translations. The first script updated Pootle with new strings, then updated our code base with new translations by merging together the relevant .po files. I won’t get into the details, because we eventually developed a new way to automate most of this process. The only part of this script that’s still standing is the link between our translation data repository and the repository where our dashboard code lives.
The Old System
The second script compiles our translated strings into language packs that make the translations available to our frontend so they can actually be used on the dashboard. This is a shell script that is triggered by a Jenkins job, and it is still in use to this day. From the language packs, we can include the compiled translations on each page of hootsuite.com. This saves us from having to load everything all the time, and improves the performance of our website. A language pack is simply a skin for localizing software.
The new translations happen in a loop over our strings. It’s surprisingly easy to get translations into the language packs. We just translate each string into the appropriate language as we go, pulling the translations from our .po files. If no translation is found, the string defaults to staying English, so we never end up with missing strings in our dashboard. Following this, all that is left is to write the array to a file which acts as the language pack.
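The fallback logic can be sketched in a few lines. This is a Python illustration of the idea, not the actual shell script; the function names, the sample strings, and the choice of JSON as the language pack format are all assumptions made for the example.

```python
import json
import os
import tempfile


def build_language_pack(english_strings, translations):
    """Map each English string to its translation, defaulting to English."""
    pack = {}
    for source in english_strings:
        translated = translations.get(source, "")
        # A missing or empty translation falls back to the English string.
        pack[source] = translated if translated else source
    return pack


def write_language_pack(pack, path):
    """Write the pack to disk as JSON (the file format is an assumption)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(pack, handle, ensure_ascii=False, indent=2)


# Hypothetical data: one translated string and one still untranslated.
english = ["Send message", "Schedule"]
german = {"Send message": "Nachricht senden", "Schedule": ""}
pack = build_language_pack(english, german)

# Write the pack to a temporary location and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "de.json")
    write_language_pack(pack, path)
    with open(path, encoding="utf-8") as handle:
        written = json.load(handle)
```

Because every English string gets an entry, the resulting pack never has a hole in it: untranslated strings simply show up in English.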
As with any new technology integration, the system was imperfect. At first, we had no monitoring of the translation jobs. One of the major problems was how much of our translation still wasn’t automatic. And not only was it not automated, but it was tricky and technical. In order to add any new translations to our project, a developer needed to go in and manually merge in a newly translated .po file using msgmerge (a GNU gettext tool that merges two uniform .po files together). This is, to the best of my knowledge, the only command line tool for merging .po files, and it is pretty good, with some definite caveats:
Pitfall 1: If there is no matching base string in the second .po file, the translations do not get merged in. This becomes an issue when we need to merge in entirely new translations, as they are not added properly.
Pitfall 2: The first file of two to be merged in is always prioritized. This can result in outdated translations being kept over newly created ones.
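Both pitfalls can be modelled with plain dictionaries. The sketch below is a deliberately simplified Python model of the behavior described above, not msgmerge's actual algorithm or command line interface; `merge_like_described` and the sample data are hypothetical.

```python
def merge_like_described(first, second):
    """Model the merge behavior described in the two pitfalls.

    `first` and `second` map English base strings to translations.
    Only base strings present in `second` survive (Pitfall 1), and a
    non-empty translation from `first` always wins (Pitfall 2).
    """
    merged = {}
    for base, second_translation in second.items():
        first_translation = first.get(base, "")
        merged[base] = first_translation or second_translation
    return merged


# Hypothetical data: an outdated existing file and a newer translated file.
existing = {"Send message": "Nachricht schicken"}
new = {"Send message": "Nachricht senden", "Schedule": "Planen"}

# Existing file first: the outdated translation is kept (Pitfall 2).
existing_first = merge_like_described(existing, new)

# New file first: "Schedule" has no base string in `existing`,
# so its translation is dropped (Pitfall 1).
new_first = merge_like_described(new, existing)
```

In this model neither argument order gives the desired result (the newer "Send message" translation plus the new "Schedule" entry), which is the trap a developer falls into without knowing how the merge treats each file.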
Both of these issues are fairly avoidable if you are super familiar with msgmerge and how it works, but there are few people at our entire company who can claim that. These issues presented themselves when we introduced a new concept: master files. Master files seemed like a great idea. The concept was to ensure that everything in the master files was fully translated and approved by Hootsuite administrators, and would be locked from being translated by our community translators. By prioritizing those files, our dashboard would have the highest possible translation quality.
There were two main issues with this strategy: we didn’t consider msgmerge, and nobody really understood master files or how to use them. The master files were new, so they didn’t contain all the strings from the dashboard. And new strings that were added to other files weren’t properly merged in because of the limitations of msgmerge. Furthermore, our documentation never reflected the update to master files. All new (even paid, professional) translations were merged into the original .po files, and never made it into master. Over time, this led to picky translation bugs that cost devs hours to track down.
We were dealing with missing translations (caused by mismatching .po files and incorrect msgmerge use), a general lack of knowledge about how translation worked at Hootsuite, and a growing count of mysterious, undiagnosed bugs.
Fast forward two years to the creation of the Internal Tools team, and the realization that our translation system that had once worked was limping, and needed surgery.
The “New” New Way
Enter Smartling – our new external translation partner. Smartling seemed to fill the gaps we had. They provide a user-friendly dashboard for submitting new translations, have a repository connector that automatically syncs their dashboard with a git repo, and the work is all done by them. Rather than painstakingly writing our own automation scripts, we got automation that was mostly complete out of the box. We could simplify our existing scripts, and streamline the process. This was particularly welcome to our mobile teams, who had previously dealt with an extensive and headachey process when submitting new translations.
Of course, it wasn’t quite that simple. We had some bumps along the road, such as the repository connector throwing us unexpected curve balls. It seemed like a great piece of software (and it is, now that we’ve figured it out), and the setup looked easy and clean. Documentation was only one page, which is awesome... until something goes wrong. Because the repository connector is still a new piece of tech, not a lot of humans have been using and debugging it yet.
We began our repository journey by setting up our mobile repositories. We wrote our config file according to Smartling’s template file, and pressed go. And… nothing happened. Worse, it failed silently. So we had no idea why our logs simply read: “Starting Repository Connector” and then, three milliseconds later, “Stopping Repository Connector”. After days of communicating with Smartling, checking for typos, and staring at the log files on our server, Smartling came back to us with an answer: it was a bug on their end. This was our first encounter with the Repository Connector being new technology. Fortunately, Smartling worked closely with us throughout our integration process, and they fixed the bug within a day!
We began to integrate our Web Ops translation repository with the repository connector. With the blocking bug out of the way, we were optimistic. I added an entry to the config file, and stopped and started the connector. I blew shit up, and not in a good way: errors on errors. It turned out I had mistyped the file path to the smartling-config.json file in the repository. To work correctly, the repository connector looks for a configuration file in the repository so it knows which languages to sync, and the file types to look for. With an incorrect path, the connector could not find the file, and failed. I fixed the path and restarted the connector again. And everything kept blowing up. For the same reason.
Cue me looking for typos, staring at the logs, and generally repeating the process of bashing my head against a wall to figure out why. Eventually we decided to wipe our server and re-run our Ansible playbook to provision it. With a clean wipe and the new config file, the repository connector made a commit. We now suspect that somewhere there was a reference to the old repository that was configured incorrectly, and changing the config didn’t fix it. This is an ongoing problem we experience when changing and updating files. The repository connector can be very finicky, and we occasionally have to re-provision the server to solve problems. This is not something that should happen as regularly as it does, and we’re still looking for ways to solve it permanently.
Trouble was still afoot: it somehow looked like the one commit we managed to get was a fluke. Our dashboard repository (the main connection repository for hootsuite.com) wasn’t working, either. At this point, though, we figured out something exciting: putting the repository connector logs into DEBUG mode, rather than the standard INFO.
DEBUG mode finally gave us information. A lot of information. So much information that it was hard to parse through it all to get to anything related to our problem. It is not an exaggeration to say that it took three developers a full work day to even narrow down the logs to get to what we needed. Eventually, my teammate stumbled upon a mysterious error with the commit. However, while the error logs did finally tell us there was something blocking the commit, they said nothing about what was doing the blocking. It was time to outsource the problem. Smartling had been there for us in the past, so we sent them our error logs to get another set of eyes on the problem.
When Smartling got back to us about our problem, my entire team was out of the office except me. The theory was that having strings marked as “In Progress” was stopping anything from being committed. I tested a small file by marking all strings as complete, and we finally got a commit. It was bittersweet. On the one hand, we finally had an answer to why nothing was working. On the other hand, our dashboard has thousands of “In Progress” strings.
A string is marked as such when it is poised to be translated, but does not yet have a translation. Much of our translation is done through our wonderful community of volunteer translators. “In Progress” strings are the only ones that they can translate. We requested a change in the configuration of the repository connector to allow it to work even if all the strings weren’t fully translated. Smartling responded quickly and willingly, but the change would take several weeks to complete. The only issue was that our rollout was scheduled before then, so Smartling pointed us to the specific APIs that we could use to build a workaround while the change was still pending.
We built a workaround. I won’t go into too much detail about the workaround, since it only had a short lifespan at our company, but we used direct calls to Smartling’s API to do what the repository connector wouldn’t do, and commit the “Completed” strings regardless of how many were not yet finished. We successfully transitioned our translations over to Smartling with the workaround in place. The new repository connector is now in, and translations are once again running relatively smoothly.
Thanks to our teamwork on this project, we had it working. We would have been unable to achieve the smoothness of our translation now without our dedication to communication and debugging, no matter how painful that debugging sometimes felt. Tiny iterations on the project made it possible to achieve something a little bit at a time, and our partnership with Smartling gave us the support we needed to complete our migration.
An Ongoing Project
As with any large, global web project, we will never be able to sit down in a meeting and announce that translation is finished. Localization is an ongoing struggle, especially as Hootsuite continues to grow as a company. We’re still dealing with the ripple effect from the switch and have kinks to iron out.
We’re also still ironing out our workflow for translations. We love our community of translators, and we’re happy that people want to help translate Hootsuite for the world. However, there are issues to be worked out: what is the best way to monitor community translations? We know that paid, professional translations can be public immediately. We are considering a system of “up-voting” community translations so that a submitted translation must be up-voted by at least some number of other translators before it can go live.
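To make the up-voting idea concrete, here is a minimal Python sketch of one way such a rule could work. Everything in it is hypothetical: the threshold of three votes, the `Submission` class, and the rule that authors cannot up-vote their own submissions are assumptions for illustration, not a description of a system we have built.

```python
from dataclasses import dataclass, field

# Hypothetical threshold; no specific number has been decided.
REQUIRED_UPVOTES = 3


@dataclass
class Submission:
    text: str
    author: str
    professional: bool = False
    upvoters: set = field(default_factory=set)

    def upvote(self, translator):
        # Assumption: authors cannot approve their own work.
        # Repeat votes from the same translator are absorbed by the set.
        if translator != self.author:
            self.upvoters.add(translator)

    def can_go_live(self):
        # Professional translations go live immediately; community
        # translations need enough distinct up-votes first.
        return self.professional or len(self.upvoters) >= REQUIRED_UPVOTES


community = Submission("Nachricht senden", author="ana")
for name in ["ana", "ben", "ben", "cleo"]:
    community.upvote(name)
ready_before = community.can_go_live()  # only two distinct voters so far

community.upvote("dev")
ready_after = community.can_go_live()  # third distinct voter arrives

paid = Submission("Planen", author="agency", professional=True)
```

Tracking voters in a set rather than a counter keeps a single enthusiastic translator from pushing a string live alone.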
What lessons did we learn from this process?
- It’s absolutely necessary to share information amongst developers to avoid single points of failure. Therefore, our new translation system is extensively documented. At least 5 people worked on it, with more every day. Our translation process is much more transparent, which is the way it should be.
- It’s not always necessary to throw away everything when upgrading. We still have pieces of the old translation system that were working well: The Geckoboard – a notification system so we can instantly see when the translation job has failed. (Image Above)
- Monitoring and visibility into a process is key. We even added extra monitoring to what we had before. The translation job will now email the Internal Tools team if it fails, so it’s extra impossible for us to miss. It’s not perfect monitoring, since we aren’t directly monitoring the repository connector, but it’s always been helpful for us to know first thing if something went wrong with translation, so we can address the issue for our customers before it affects their experience.
- Working closely with our external provider is a huge benefit. Smartling was happy to work with us to fine-tune the repository connector as we discovered the new edge cases that can sometimes get missed in the development pipeline. Integrating the new repository connector benefited both them and us.
New software is hard. It hasn’t been lived in yet. Problems are bound to come up, and we need to be prepared for that.
We’re still working on our translations, and we’re making them a little bit better every day. Moving to the repository connector was a good choice. Despite the bumpy road, it streamlined our process, and it does seem a bit like the magic bullet. But that’s only where we are today. Who knows what translation will look like for us two years from now? The landscape of translation and localization is continuously changing. All we’re doing is keeping up.
About the Author
Alice is a Software Developer Co-Op on the Internal Tools team at Hootsuite. When she’s not coding, she loves singing, hiking, and long walks on the beach. Connect with her on Twitter @msalicefredine.