There aren't a lot of companies that can credibly claim to be world-class tech infrastructure builders on par with the cloud giants. But Bloomberg, at the heart of the world's financial system, wouldn't have it any other way.
The bulk of the services offered by Bloomberg's financial empire run inside company-managed data centers filled with "highly tuned" Linux servers refined over the years to deliver a huge amount of real-time data to its customers, said Shawn Edwards, Bloomberg's chief technology officer, in a recent interview with Protocol. It built its own private network to serve those customers (institutional investors, hedge funds, large banks) long before the cloud providers were operating at their current scale, because "some things don't lend themselves to the public web," he said.
But that doesn't mean Bloomberg is ignoring modern enterprise computing trends. It has worked extensively with AWS to offer services through the cloud giant to customers that want to work on the cloud, and has embraced new ideas such as containers and Kubernetes alongside systems that have been running for over a decade.
"We're big believers in evolution, instead of always having to write something brand new and have to take it over," Edwards said. The result is a sprawling array of enterprise tech that gives over 6,000 Bloomberg engineers the infrastructure to manage 200 billion messages a day containing market-moving information.
A cloud before the cloud
Bloomberg rolled its own technology from the very beginning.
"Bloomberg always had a modern architecture," Edwards said. "Chuck Zegar and Tom Secunda, two of the founders along with Mike Bloomberg, they had built kind of a web model before the web."
First released in 1982 for Merrill Lynch, Bloomberg's centralized computing model in New York powered what we now know as "the Terminal," the source of the lion's share of Bloomberg's revenue and profit. The Terminal is the popular shorthand for Bloomberg Professional Services, the arm of the company that sources real-time trading data from stock exchanges around the world and feeds it to traders and investors desperate for an edge.
That original system was written in Fortran, a programming language that predates COBOL, but it was reworked in C++ around the time Edwards joined the company 17 years ago, he said.
Around that time, Bloomberg made several other changes to its underlying infrastructure that would eventually become blueprints for future distributed computing applications. It built a new user interface in JavaScript that ran server-side with "a lightweight toolkit" running on the client side, years before Node.js simplified the process of running JavaScript (the most popular programming language for the last eight years running, according to Stack Overflow) on both the server side and client side of an application, Edwards said.
It also began to organize its applications around a service-oriented architecture, building its own middleware that was similar to an open-source project called gRPC, released by Google in 2015, that brought the concept of microservices to a larger audience.
And on the hardware side, Bloomberg made a big bet on OpenStack, which over the years has become an example of how not to run an open standards organization. OpenStack was a response to some of the cloud computing concepts pioneered by AWS, but designed for and by tech vendors catering to companies that thought they still wanted to manage their own data centers.
Over time, it became clear to a lot of those end users and vendors that nothing was going to stop AWS, and support for OpenStack fizzled. But it still provided a solid blueprint for companies like Bloomberg that had already invested a ton of money in data center infrastructure, and Bloomberg continues to run much of its operation on that combination of hardware and software design principles.
Teaching the machines
Today, Bloomberg is still operating much of the same technology but with a few modern flourishes here and there.
OpenStack can't solve all its needs, and Bloomberg does run "purpose-built dedicated hardware for things that require it," Edwards said, such as "real-time data that doesn't belong on [virtual machines]."
It doesn't build its own servers like the major cloud companies do, but it is very picky about the hardware it puts into its environment and tunes the Linux kernel running on those servers to its unique needs. The company has "experimented" with special-purpose chips like FPGAs (field-programmable gate arrays) and hardware accelerators, Edwards said, but for the most part relies on off-the-shelf hardware customized by its engineering team.
Bloomberg does use public cloud services for what Edwards called the "dot-com" parts of its business, such as the media properties like Bloomberg News. It also meets customers where they are; if Bloomberg customers are running their own servers in public clouds, the company has worked with AWS and Google to link its own infrastructure with the public cloud servers used by its customers, he said.
Edwards is currently focused on expanding Bloomberg's use of machine-learning techniques to improve its services.
The company thinks it has one of the best optical character recognition systems in the world, which allows it to pull text and objects like tables out of company filings and financial statements and present them in a readable format. It's using machine learning to help predict the price of a bond, which is more complicated than it might sound because bonds trade far less frequently than stocks, and it's also using these techniques to help traders prioritize incoming messages from clients and capitalize on trade opportunities.
"We're a big data company, but we don't have nearly the amount of bytes in storage that Instagram has. But we have an embarrassingly large amount of heterogeneous data sets," Edwards said. "So when we say we understand documents, it's because of the 30 years of putting this together."