WeChat, China’s biggest messaging app, has apologised for a gaffe in which it translated the phrase “black foreigner” as the N-word.
The mistake was first spotted by Ann James, an American living in Shanghai. She translated an incoming Chinese message into English, which produced the text: “The n****r’s still late.”
The original Chinese message used a more neutral term, hei laowai, or “black foreigner.”
(Editor’s note: The language in the screenshot below has been obscured due to its offensive nature)
WeChat rectified the error within 24 hours, and the company told Chinese outlet Sixth Tone that the translation was produced by its neural machine learning engine, which picked up the term from broader usage.
WeChat, which has a huge base of over 900 million users, has been able to translate messages in-app since 2014. It relies on a combination of translation sources, including its own AI engine, and third parties like Microsoft Translator.
The error is reminiscent of mishaps with other AI systems that have tried to learn from analysing big data. In August, two Chinese chatbots, one of them created by Microsoft, were taken down after they started posting unpatriotic content about the government.
Last year, another Microsoft chatbot, Tay, was pulled after it started tweeting racist and crude messages.
At its core, ROSS is a platform that helps legal teams sort through case law to find details relevant to new cases. This process takes days or even weeks with standard keyword search, so ROSS is augmenting keyword search with machine learning to both speed up the research process and improve the relevancy of the items found.
“Bluehill benchmarks Lexis’s tech and they are finding 30 percent more relevant info with ROSS in less time,” Andrew Arruda, co-founder and CEO of ROSS, explained to me in an interview.
ROSS is using a combination of off-the-shelf and proprietary deep learning algorithms for its AI stack. The startup is using IBM Watson for at least some of its natural language processing capabilities, but the team shied away from elaborating.
Building a complete machine learning stack is expensive, so it makes sense for startups to lean on off-the-shelf tech early on, so long as the decisions being made preserve the scalability of the business. Much of the value wrapped up in ROSS is related to its corpus of training data. The startup is working with 20 law firms to simulate workflow examples and test results with human feedback.
“We really spent time looking at the value ROSS was delivering back to law firms,” noted Kai Bond, an investor in ROSS through Comcast Ventures. “What took a week now takes two to four hours.”
The company’s initial plan to get to market was to sell software designed for specific domains of law to large firms like Latham & Watkins and Sidley Austin. Today ROSS offers products in both bankruptcy and intellectual property law. It is looking to expand into other types of law, like labor and employment, while also moving downmarket to serve smaller firms.
LexisNexis and Thomson Reuters are frequently on the receiving end of claims made by machine learning-powered data analytics startups emerging in a potpourri of industries. A strategy favored by many of these businesses is pushing products to interns and college students for free so that they, in turn, push their advanced tools into the arms of future employers.
“The work ROSS is doing with law schools and law students is interesting,” Karam Nijjar, a partner at iNovia Capital and investor in ROSS, asserted. “As these students enter the workforce, you’re taking someone using an iPhone and handing them a BlackBerry their first day on the job.”
Prior to today’s Series A, ROSS had secured a $4.3 million seed round also led by iNovia Capital. As ROSS moves to scale it will be navigating a heavy field of mergers and acquisitions and attempts by legacy players to ensure legal tech services remain consolidated.
Deepgram, a startup applying machine learning to audio data, is releasing its machine transcription platform this morning for free. No more will you have to pay for other services like Trint to get the dirty work of automated transcription done. Why give it away? Hint: it has something to do with data.
Machine transcription isn’t solved. In fact, machine anything isn’t solved. And it seems like everyone these days is making haste to build their own Fort Knox of data to solve machine everything. Deepgram’s approach is to make its transcription service free for anyone to upload their audio content and receive searchable text in return.
This approach isn’t particularly unique; as I said, everyone needs data. Don’t forget that image CAPTCHAs are basically a means of forcing plebeians to label image data sets for training machine learning models.
Deepgram is using deep learning for its transcription tool (surprise!) — good old convolutional and recurrent neural networks. Everything is generalized in the free version, but paid offerings might include custom training on company and product names as well as terms of art in a given industry.
I uploaded an hour-long interview I did about a week ago to the service to test it out. The file was recorded in a noisy restaurant and consisted of two people in conversation. The transcription quality was far from perfect, but it wasn’t meaningfully worse than anything else on the market.
I was able to search for a specific quote I remembered and after three attempts, I found the segment of dialog. I wouldn’t be able to copy and paste it without angering the interviewee, but it would have given me the context I needed to tell my story. The search process took about five minutes and, to Deepgram’s credit, it was obvious that searches were using the sounds of words to find more matches. The thing to remember is that the service costs considerably less than more accurate human transcription and will improve with time.
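Deepgram has not said how its search works, and it presumably operates on the audio itself rather than on spelling. As a loose illustration of what matching by sound can mean, here is a minimal sketch using Soundex, a classic phonetic code for English words. The function names are hypothetical, and this is an assumption-laden stand-in, not Deepgram's method.

```python
# Illustrative only: Soundex is a stand-in for "search by sound",
# not a description of how Deepgram's search actually works.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character American Soundex code for a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        # Emit a digit only when it differs from the previous coded letter.
        if code and code != prev:
            encoded += code
        # 'h' and 'w' do not separate letters that share a code; vowels do.
        if ch not in "hw":
            prev = code
    return (encoded + "000")[:4]


def find_matches(query, transcript):
    """Return (index, word) pairs whose Soundex code matches the query's."""
    target = soundex(query)
    if not target:
        return []
    return [(i, w) for i, w in enumerate(transcript.split())
            if soundex(w) == target]


print(find_matches("their", "I left there with them"))
```

A search for "their" finds "there" even though the spellings differ, which is the kind of forgiving match that helps when a transcript has mangled the exact word.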
“ASR is not solved,” Scott Stephenson, co-founder and CEO of Deepgram, explained to me in an interview. “It’s solved for specific data sets but with noisy accented call data, any service will do a poor job with it.”
In addition to the platform, Deepgram is also offering a mostly free API for machine transcription. If you use over a million minutes you will be charged — computation is expensive so it wouldn’t make sense to allow someone to troll the company with a 50 terabyte audio file.
While humans still reign supreme in the transcription world, it’s possible that synthesized audio could tilt the odds in favor of the machines in the near future. Projects like WaveNet and Lyrebird, which generate speech from text, could help augment systems with training data for the uncommon words most likely to trip up machine transcription systems like Deepgram and those made by the tech giants.
With a shortage of machine learning developers bearing down on the industry, startups and big tech companies alike are moving to democratize the tools necessary to commercialize artificial intelligence. The latest startup, Petuum, is announcing a $93 million Series B this morning from Softbank and Advantech Capital.
Founded last year by Dr. Eric Xing, a Carnegie Mellon machine learning professor, Dr. Qirong Ho and Dr. Ning Li, Petuum is building software to facilitate two components of machine learning development. First, the team is automating aspects of data preparation and machine learning model selection. This is useful for novices who might otherwise struggle to even make use of common machine learning frameworks like TensorFlow and Caffe.
Once models have been selected, Petuum can also assist developers in optimizing for specific hardware constraints. This means virtualizing hardware to remove barriers — taking out the extra step of managing a distributed GPU cluster.
“The way we treat AI is not as an artisanal craft,” Dr. Xing explained to me in an interview. “We are trying to create very standardized building blocks that can be assembled and reassembled like legos.”
Petuum founder Dr. Eric Xing inside the startup’s offices in Pittsburgh
The point here isn’t to solve every problem in machine learning, but rather to automate enough of the process that industry can move from 0 to 1. That said, Petuum is attempting to build for both the expert and the novice — a tough balance to strike.
“Everyone knows how to use Excel,” asserted Dr. Xing. “A layman can use Excel to create a table. A highly skilled statistician modeling certain phenomenons can still use Excel.”
The other challenge facing Petuum is one of market strategy. As the tech industry grapples with its dumb-money-in-AI problem, many investors have turned to heuristics to manage uncertainty, the most popular of which is that horizontal platform AI plays don’t work.
The concern is that it’s difficult to outgun Google and Amazon in the machine learning-as-a-service space as a startup that needs to balance feature development and spending. Dr. Xing deferred to the skill of his team and, while he didn’t directly mention it, the goldmine from Softbank won’t hurt. This is something that others like H2O.ai and Algorithmia can’t claim to date.
To the company’s credit, it is starting by going after healthcare and fintech customers. Though in the long run, Petuum doesn’t intend to cover every vertical. Petuum is working with beta testers in different industries so that in the future, outsiders can develop and deploy solutions on top of the platform.
Today’s investment comes from Softbank proper rather than the $93 billion Softbank Vision Fund. It’s unclear whether Softbank intends to shift the investment into the fund in the future. Petuum currently claims 70 employees and says that it will be expanding simultaneously in product, sales and marketing.
Last year, Google showed off WaveNet, a new way of generating speech that didn’t rely on a bulky library of word bits or cheap shortcuts that result in stilted speech. WaveNet used machine learning to build a voice sample by sample, and the results were, as I put it then, “eerily convincing.” Previously bound to the lab, the tech has now been deployed in the latest version of Google Assistant.
The general idea behind the tech was to recreate words and sentences not by coding grammatical and tonal rules manually, but by allowing a machine learning system to see those patterns in speech and generate them sample by sample. A sample, in this case, being the tone generated every 1/16,000th of a second.
At the time of its first release, WaveNet was extremely computationally expensive, taking a full second to generate 0.02 seconds of sound — so a two-second clip like “turn right at Cedar street” would take nearly two minutes to generate. As such, it was poorly suited to actual use (you’d have missed your turn by then) — which is why Google engineers set about improving it.
The new, improved WaveNet generates sound at 20x real time — generating the same two-second clip in a tenth of a second. And it even creates sound at a higher sample rate: 24,000 samples per second, and at 16 versus 8 bits. Not that high-fidelity sound can really be appreciated in a smartphone speaker, but given today’s announcements, we can expect Assistant to appear in many more places soon.
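The figures above are easy to check with a little arithmetic. The sketch below uses only the numbers quoted in the article; the function and constant names are made up for illustration.

```python
# Figures quoted in the article; all names here are illustrative.
OLD_SPEED = 0.02   # seconds of audio produced per second of compute (first release)
NEW_SPEED = 20.0   # seconds of audio produced per second of compute (new version)
OLD_RATE_HZ, OLD_BITS = 16_000, 8
NEW_RATE_HZ, NEW_BITS = 24_000, 16


def generation_time(clip_seconds, speed):
    """Seconds of compute needed to synthesize a clip of the given length."""
    return clip_seconds / speed


def sample_count(clip_seconds, rate_hz):
    """Number of samples the model must generate for the clip."""
    return int(round(clip_seconds * rate_hz))


def bits_per_second(rate_hz, bits):
    """Raw (uncompressed) audio data rate."""
    return rate_hz * bits


clip = 2.0  # length in seconds of a clip like "turn right at Cedar street"
old_time = generation_time(clip, OLD_SPEED)
new_time = generation_time(clip, NEW_SPEED)
speedup = NEW_SPEED / OLD_SPEED

print(f"old: {old_time:.0f} s, new: {new_time:.1f} s, speedup: {speedup:.0f}x")
print(f"samples per clip: {sample_count(clip, OLD_RATE_HZ)} -> "
      f"{sample_count(clip, NEW_RATE_HZ)}")
print(f"raw data rate: {bits_per_second(OLD_RATE_HZ, OLD_BITS)} -> "
      f"{bits_per_second(NEW_RATE_HZ, NEW_BITS)} bits/s")
```

The two-second clip drops from 100 seconds of compute (the "nearly two minutes" above) to a tenth of a second, a 1,000x speedup, even though the new model produces 48,000 samples for that clip instead of 32,000 and three times the raw data per second.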
The voices generated by WaveNet sound considerably better than the state of the art concatenative systems used previously:
(Audio samples from the original post: “Old and busted,” the concatenative voice, and “New and hot,” the WaveNet voice.)
(More samples are available at the DeepMind blog post, though presumably the Assistant will also sound like this soon.)
WaveNet also has the admirable quality of being extremely easy to scale to other languages and accents. If you want it to speak with a Welsh accent, there’s no need to go in and fiddle with the vowel sounds yourself. Just give it a couple dozen hours of a Welsh person speaking and it’ll pick up the nuances itself. That said, the new voice is only available for U.S. English and Japanese right now, with no word on other languages yet.
In keeping with the trend of “big tech companies doing what the other big tech companies are doing,” Apple, too, recently revamped its assistant (Siri, don’t you know) with a machine learning-powered speech model. That one’s different, though: it didn’t go so deep into the sound as to recreate it at the sample level, but stopped at the (still quite low) level of half-phones, or fractions of a phoneme.
The team behind WaveNet plans to publish its work soon, but for now you’ll have to be satisfied with their promises that it works and performs much better than before.
October 4, 2017 / Google’s WaveNet machine learning-based speech synthesis comes to Assistant