Financial big data: The traces of trillions in economic activity
Day one of Newsweek's AI and data science in capital markets event examined a range of new data sources.
The financial world has always been about extracting value from information. The difference today is the sheer proliferation of data which can be modelled by scientists armed with lots of computing power. This ranges from numerical price information (which might now be called traditional data) to more exotic datasets, such as satellite imagery or location data from mobile phone apps.
The opening day of Newsweek's AI and data science in capital markets conference in New York began with a look at some new, "alternative" data sources and how these can be employed in a firm's trading strategy. Wall Street has been doing this for some time, and firms that beat benchmarks using a mosaic of data sources guard their methods closely; conventional wisdom holds that once everybody on the Street knows what you're doing, all the alpha gets sucked out of your strategy.
Firms need to embrace this technology and the brave new world of data science to avoid being left behind; getting in early is a prudent move, said Greg Skibiski, the CEO of Thasos, which specialises in dealing with noisy but rewarding location data. "There are firms that have made a lot of money by getting in early and have just kept quiet about their new data source," he said, "cleaning it like crazy and they have got 20 years of alpha out of it."
Observing carparks using satellite imagery is a way of gauging retail activity, and one which the hedge fund world is well aware of. However, new dimensions are being added all the time and today you can include drones, high altitude balloons, weather data, car telemetry systems and so on.
Some firms even offer synthetic-aperture radar (SAR) which can relay images of the Earth through clouds. A.J. DeRosa, executive vice president of Orbital Insight, a company which monitors some 260,000 carparks, said: "We can even pinpoint ships at sea that have their beacons switched off, they are called 'dark ships'; so we can see there's a ship when there's not been a ping. So this can also help with things like illegal fishing."
Coming from another direction, Hicham Oudghiri, CEO of Enigma, which examines thousands of public data sets, said rather poetically: "Public data provides traces of trillions of dollars in economic activity."
Oudghiri pointed out that publicly available federal radio licensing records can be used to track the spread of fast food joints. How? "Every drive-through installation has to have a Federal radio licence," he said.
There are four essential areas that investment firms need to know about in order to compete, said Tammer Kamel, CEO of Quandl. These are: statistics 101; data science (the icing on the cake is machine learning and AI); some non-financial domain knowledge; and good old capital markets know-how.
He illustrated each area with a dataset that Quandl works with. First, for non-financial expertise, he pointed to iron ore production levels. This involves geospatial mapping of ports, overlaid with another dataset, the automatic identification system (AIS), which is used to position all vessels on the ocean. "AIS is radio waves," said Kamel, "it's messy and noisy and sometimes deliberately obfuscated by those who collect it."
Illustrating the power of data science and AI, Kamel said B2B payment supplies and implied risk can offer a look into the financial health of all firms. He said the application of AI and random forest methods can yield valuable insights from this data, citing a Sharpe ratio of over 1.3.
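For readers unfamiliar with the measure, the Sharpe ratio divides a strategy's average excess return by the volatility of that return, usually scaled to a yearly figure. The sketch below is a minimal illustration only: the function name, the 252-trading-day convention and the sample returns are assumptions made here, not details given by Kamel or Quandl.

```python
import math
import statistics


def annualised_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualised Sharpe ratio from a list of daily returns.

    Assumes 252 trading days per year and a constant daily risk-free
    rate; both are illustrative conventions, not taken from the article.
    """
    # Excess return over the risk-free rate for each day
    excess = [r - risk_free_daily for r in daily_returns]
    mean_excess = statistics.fmean(excess)
    # Sample standard deviation (needs at least two observations)
    volatility = statistics.stdev(excess)
    if volatility == 0:
        raise ValueError("returns have zero volatility; Sharpe ratio is undefined")
    # Scale the per-day ratio to a yearly figure
    return mean_excess / volatility * math.sqrt(periods_per_year)


# Hypothetical daily returns, for illustration only
sample_returns = [0.01, -0.01, 0.02, 0.0]
sharpe = annualised_sharpe(sample_returns)
```

On this scale, a figure above 1 means the strategy's average excess return exceeds its volatility over the year, which is why a ratio of over 1.3 is presented as a strong result.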
For statistics 101, Kamel said estimating Amazon sales reduces to a problem of extrapolation involving millions of Amazon customers. Lastly, capital markets know-how was demonstrated in the work Quandl has done with FX clearing and settlement utility CLS. This takes real-time FX volumes and prices and employs methods like mean reversion and trend following.
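Mean reversion and trend following are opposite bets: the first assumes a price that has strayed far from its recent average will drift back, the second assumes a recent move will continue. The sketch below shows one textbook form of each, a z-score rule and a moving-average comparison. It is an assumption-laden illustration: the function names, window lengths, threshold and prices are all hypothetical, and nothing here describes how Quandl or CLS actually process FX data.

```python
import statistics


def mean_reversion_signal(prices, window=5, threshold=1.0):
    """Return -1 (sell), +1 (buy) or 0 (no trade) from the latest z-score.

    The z-score measures how many standard deviations the latest price
    sits from the mean of the last `window` prices.
    """
    if len(prices) < window:
        raise ValueError("not enough prices for the chosen window")
    recent = prices[-window:]
    mean = statistics.fmean(recent)
    stdev = statistics.stdev(recent)
    if stdev == 0:
        return 0  # flat prices: nothing to revert to
    z = (prices[-1] - mean) / stdev
    if z > threshold:
        return -1  # price stretched above its average: bet on a fall
    if z < -threshold:
        return 1   # price stretched below its average: bet on a rise
    return 0


def trend_signal(prices, fast=3, slow=5):
    """Return +1 if the fast moving average is above the slow one, -1 if below."""
    if len(prices) < slow:
        raise ValueError("not enough prices for the slow window")
    fast_ma = statistics.fmean(prices[-fast:])
    slow_ma = statistics.fmean(prices[-slow:])
    if fast_ma > slow_ma:
        return 1
    if fast_ma < slow_ma:
        return -1
    return 0


# Hypothetical exchange-rate series ending in a sharp jump
rates = [1.10, 1.10, 1.10, 1.10, 1.20]
reversion = mean_reversion_signal(rates)  # bets the jump will fade
trend = trend_signal(rates)               # bets the jump will continue
```

The two rules disagree on the same series by design, which is why choosing between them, or blending them, depends on the market know-how Kamel describes.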
This all sounds great. But could we hit a point where secret, sophisticated methods of probing firms' performance become a concern for the regulator? Jonathan Streeter, partner at Dechert LLP, is often asked this question by big hedge fund clients. He thinks it could become a problem.
Streeter, who was the lead prosecutor in the Raj Rajaratnam Galleon Group case, said the new data markets now springing up are right at the intersection of data privacy concerns and potential insider trading. When considering the latter, there are three important questions to bear in mind: is the information material (if a fund is paying for data then it has kind of answered this question already); is it private or public; and was there any kind of fiduciary breach in obtaining it?
Data from people's inboxes, for example, is clearly not public, while counting cars or scraping websites doesn't pose that problem. When it comes to a breach of duty, Streeter said typically a lawyer's response is "it depends".
"We then have to look at all the contracts and agreements along the chain. Did I agree my data could be bundled up and sold; were the boxes ticked to opt in? The same for vendors; did they get permission to sell this," he said.
"I can tell you, if these boxes have not been ticked, the SEC would love to bring a case in this area, and put their name on this space."