The next episode of the Data Cloud Podcast features an interview with David Woodhead, Head of Engineering for Aladdin Studio at BlackRock, hosted by Dana Gardner, Principal Analyst at Interarbor Solutions.

We explore BlackRock's comprehensive data strategies, highlighting the evolution of the Aladdin Platform from an analytics factory to an intelligence factory – all built on modern data infrastructure.

The conversation also touches on AI integration, the complexities of managing large scale data distribution, and the cultural and technological shifts necessary to support these advancements.

[Listen to the discussion or watch it.]

Dana Gardner: Welcome to the Data Cloud Podcast, Dave. We're very happy to have you with us.

David Woodhead: It's good to be here.

Dana Gardner: Putting an organization's data resources into top form for analytics and business intelligence (BI) goes a long way toward ensuring quality processes and enhanced business outcomes. But for BlackRock, a leading asset manager and technology provider, modern data infrastructure is also the path to productizing financial services.

Tell us how BlackRock developed the Aladdin Platform internally as a way to share data programmatically, and then how that led to an external commercial product, and then also Aladdin Studio.

David Woodhead: Sure, Dana, I appreciate the question. Aladdin as a platform has been around for just over 20 years. I have the benefit of working at a company that has always understood the value of data.

When you hear us speak about the language of the portfolio, what we mean is we're stitching together all of the constituent pieces that make the investment process work. From the beginning, Aladdin has been driven by data.

We think about data as a utility -- the raw ingredients to bring into a factory -- the market data, the economic data that we bring into the platform, and that then produces the analytics. What's changed today is that data is now at the core of the entire investment process.

We are differentiated. The decisions we make about what to invest in are, more and more, driven by the data that we have. So, we've transitioned: While we've always had data at the core of what we do, now data is at the center of the investment decision-making process.

Aladdin has evolved from being an analytics factory to becoming an intelligence factory built on top of a rich set of data.

Dana Gardner: Help me understand the differentiation between the rich, high-quality nature of the data and the modern data fabric and cloud assets behind it. Is it an equal share, or is one more important than the other? And how do you keep both in line?

David Woodhead: First of all, our clients trust us to get their data right. So, the most important thing is that the market data that we're bringing in -- the trading, transactional information that we have from our clients -- is pristine. That revolves around data governance, data quality, checking that everything is right. That regulated investment data is critical -- the most central part of our business.

Then there’s the data that's powering the decision-making. Now that might be as varied as airline flight prices, macroeconomic data, even things like satellite imagery that we might bring in, which obviously has a very different shape.

In our single, cohesive data platform, we are on the one hand dealing with very high-quality, regulatory-standard data. And on the same platform, we're able to bring in a very wide variety of data at scale. We can quickly bring new types of data into the platform that then also power the decision-making process.

Dana Gardner: BlackRock is not only a top asset manager, you've also become a technology products company. Tell us how that productization took place and what that means for you as an organization. It seems like you have a dual mandate.

David Woodhead: BlackRock was founded on the notion that technology is a differentiator. Like I said, underpinning that, at its heart, is a data platform. As BlackRock grew, our clients recognized that our technology was differentiated and asked if they could use the same software.

BlackRock pretty quickly entered the FinTech business, and we started offering Aladdin as a service. Fast forward 20 years to where we are today. Now we think of Aladdin as a platform. Our clients are some of the world's biggest financial institutions. They are coming into Aladdin, building on top of it, customizing and extending it, bringing in their own data and their own perspectives across a very varied set of public and private market assets -- and an entire universe of data that backs those up.

But it fundamentally changes the mindset when you are building an internal software product versus building a product that you know is going to some of these very large financial institutions. The way that you think about the quality of the data product and the quality of the software that you're delivering is fundamentally different.

Dana Gardner: Tell us about your role there as Head of Engineering for Aladdin Studio, your background, and how you got to where you are.

David Woodhead: Sure. I'm a data nerd at heart. I started my career at MicroStrategy and my role now at BlackRock is to do exactly what I just laid out.

We think about Aladdin today as a platform, and Aladdin Studio is essentially the user experience that allows people to treat Aladdin as a platform. It exposes everything Aladdin knows how to do as a set of APIs. The Aladdin Data Cloud, built on top of Snowflake, takes all of the information that's generated through our analytics factory and inside of Aladdin and pushes it into Snowflake.

And increasingly we're building a set of developer tools, understanding that we power innovation when we decentralize it. The world where the investment process is built by a small set of engineers on a central team is long gone. We see business engineering more and more across all of our clients.

The third pillar of Aladdin Studio is how we bring those people in. How do I create research environments where they can start coding, build applications, and automate their workflows? They can automate the investment process through algorithms. Then we take the artifacts that people build on top of Aladdin and natively integrate them into the investment process. That's my role: engineering for that platform, Aladdin Studio.

Dana Gardner: And a few moments ago you mentioned that you have transitioned from an analytics engine to an intelligence engine. It's just a few small words, but I think there's an awful lot behind that. Tell us how that transition takes place. How do you get from analytics to intelligence?

David Woodhead: At our roots, our differentiation started with our ability to understand the risk of the market -- that's what we call our analytics factory. The raw ingredients for the analytics factory are that macroeconomic and third-party reference data used to assess risk. We call its output the green package.

It’s called the green package because these reports were once printed on green paper, so it stood out on a portfolio manager’s desk. It was the most important thing in front of them. So, every day they had this thing called the green package. That was their portfolio risk for the day. It gave them insight into where their portfolios were and what the risk was.

But the decision-making process was largely theirs. What we see increasingly is that the decision-making process is also powered by data -- and the volume of data that they're consuming as they go through that process keeps growing.

Increasingly, you need machines to help you either suggest, refine, or curate that decision based on the data that you have -- or maybe even propose ideas that the portfolio manager can react to.

So that's the transition. We're still producing that daily risk. It's still the heart of our data ecosystem. It's the most valuable product that comes out of our analytics factory.

But once I've taken that start-of-day risk, there's an entire universe of other data sets that are now being fed either into traditional Python environments, into BI tools and reporting tools, or increasingly into AI, which then creates suggestions or drives additional insights that our portfolio managers can have at their fingertips.

Dana Gardner: And in doing this, does the opportunity to have more intelligence brought to bear on a 20-year-old process, does that bring about what we think of as democratization? That is to say, can more people avail themselves of this? Have you been able to, with the intelligence, make it more user friendly across a wider group of people who might want to be programmatic in how they develop their own analysis?

David Woodhead: It's a great question. You know, Dana, I'm a data nerd. I've been doing this for a long time and one of the things that's really exciting is the group of us that work in the data space have somehow found ourselves in the center of the universe technically.

I'm sure some people planned for that, and their careers got them exactly into the right place. I think some of us just happened to be in the data space and wound up here. But the thing that's really exciting about where we are today is that our consumer base -- the people that are directly consuming our product -- has grown and grown. And the potential now and for the future is for it to grow exponentially.

You talk about democratization of data. That began with end user tools, such as Excel and Power BI. So, we've been there for a while. But several years ago, we crossed the threshold where more of our users were consuming data programmatically through Python.
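
[To picture that programmatic consumption, here is a minimal sketch using the Snowflake Python connector. The account, table, and column names are illustrative, not Aladdin's actual schema.]

```python
import os
import snowflake.connector  # pip install snowflake-connector-python

# Connection details are hypothetical; a real deployment would use
# key-pair authentication or SSO rather than a password.
conn = snowflake.connector.connect(
    account="my_org-my_account",
    user="analyst",
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="ANALYTICS_WH",
    database="PORTFOLIO_DB",
    schema="PUBLIC",
)

try:
    cur = conn.cursor()
    # Illustrative query: total exposure by asset class.
    cur.execute(
        "SELECT asset_class, SUM(market_value) AS exposure "
        "FROM positions GROUP BY asset_class ORDER BY exposure DESC"
    )
    for asset_class, exposure in cur.fetchall():
        print(f"{asset_class}: {exposure:,.2f}")
finally:
    conn.close()
```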

 

You know, pick your favorite toolkit -- that Python user base has overtaken Excel, not in terms of megabytes, but in terms of the number of queries from a bigger number of users on the data platform. So we have this entire new user base, this Python user base. And now something that would have required a PhD data scientist to answer is going to be at your fingertips with a simple prompt.

We think about the potential of AI, and of course it is really exciting to think about where we're heading. The thing that I find most exciting and gratifying is that the universe of customers we have is basically everybody that can ask a question, right? So, on that democratization of data, I think we just hit the peak of the Python, programmatic-interaction curve.

Inside of BlackRock, for example, we have 3,000 citizen developers and business engineers that are interacting with Aladdin, using Python. I think that number has probably peaked. Now we're on the other side of that where increasingly sophisticated workloads are going to be presented through natural language.

That means any user can ask complicated questions of a whole variety of data sets, and the AI can take those away and -- in an explainable way, and in a governed way, because that's important -- answer those questions and provide insight that historically required a very special skill set.

So, yes, we have made huge progress democratizing data, but I also think we're just getting started.

Dana Gardner: It's very interesting that even as an early adopter of bleeding-edge technology, you are rapidly expanding your addressable market. That relationship hasn't always held. With technology, a lot of times the more technical and complex it gets, unfortunately, the smaller the addressable market becomes.

You have reversed a major trend in technology adoption.

David Woodhead: A hundred percent agree.

Dana Gardner: For those of us who don't fully understand what an asset manager does, how are people internally at BlackRock availing themselves of this analysis? And also, what are your outside consumers of the productization of your technology doing now? Give us a sense of the use cases that are most prominent.

David Woodhead: When our technology works at its best, most of our users will never think, “Oh, this data's coming from Snowflake.” Or, “We have a new enterprise data platform that's powering this great insight.” The reaction that we want from our users is, “Wow, I'm getting the information that I need, and it's embedded in all of the tools.”

The Aladdin application includes a suite of tools that power the entire investment lifecycle, starting from the portfolio manager who wants to understand and convey what holdings they have -- whether it's a fixed-income portfolio or an equity portfolio -- and how the assets are balanced.

Where is the risk? How do they want to change the structure of the portfolio? So that means bringing massive amounts of data in, working on it, looking at it in the context of macroeconomics and forward projections, and then presenting it to the user in a meaningful way. And the portfolio manager is just the tip of the iceberg.

Behind them, you have an entire raft of data and trading operators who make sure that the investment lifecycle works as it's supposed to. So, when I'm successful, that means the data our team provides is making the investment process work seamlessly. One of the key metrics that we think about when we talk about the efficiency of the investment process is our straight-through processing rate.

Our first-time-right rate on analytics is all based on our ability to do shift-left quality control. We want our technical data quality and business data quality checks as far left in the pipeline as we can get them. That lets us hit those straight-through processing rates, those first-time-right rates on our analytics. And that means our users have a great experience with the tool.
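
[As an editorial aside, here is a minimal sketch of what shift-left quality control can look like: cheap technical checks and business-rule checks run at the very start of a pipeline, before anything expensive happens downstream. The record fields and rules are illustrative, not BlackRock's actual checks.]

```python
from dataclasses import dataclass

@dataclass
class Position:
    portfolio_id: str
    asset_id: str
    quantity: float
    price: float

def technical_checks(p: Position) -> list[str]:
    """Cheap structural checks -- run first, fail fast."""
    errors = []
    if not p.portfolio_id or not p.asset_id:
        errors.append("missing identifier")
    if p.quantity != p.quantity:  # NaN never equals itself
        errors.append("quantity is NaN")
    return errors

def business_checks(p: Position) -> list[str]:
    """Domain rules -- still early ('left') in the pipeline."""
    errors = []
    if p.price <= 0:
        errors.append(f"non-positive price: {p.price}")
    return errors

def validate(batch: list[Position]) -> tuple[list[Position], list[str]]:
    """Split a batch into clean records and rejects before analytics run."""
    clean, rejects = [], []
    for p in batch:
        errs = technical_checks(p) + business_checks(p)
        if errs:
            rejects.append(f"{p.asset_id}: {'; '.join(errs)}")
        else:
            clean.append(p)
    return clean, rejects
```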

Dana Gardner: To that point about quality and shifting left -- getting the best you can as early as you can -- please describe the Aladdin Data Cloud.

David Woodhead: Yes, it was around 2019 that we began understanding that our clients were building their own data ecosystems. It was just as we passed the peak of the hype curve for big data, and the conversation became more about how to build enterprise data that generates real insights.

Data mesh was becoming popular. This notion of data products was becoming popular. And our clients asked, “How do we take this really high-quality data that's coming out of Aladdin and bring it into our own data ecosystems?”

Or, “How can Aladdin provide us higher-scale data consumption tools that allow us to ask bitemporal questions of the history of our investment process?” We didn't have a data platform that could do that at the time, and that's when we began our conversation with Snowflake.
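
[A bitemporal question asks both "what was true at time T?" and "what did we know at the time?" Here is a minimal sketch of such an as-of query through the Python connector, assuming a hypothetical positions_history table with separate valid-time and knowledge-time columns; this is not the Aladdin Data Cloud's actual schema.]

```python
# Hypothetical bitemporal table: each row carries a valid-time interval
# (when the fact was true) and a knowledge-time interval (when we knew it).
AS_OF_SQL = """
SELECT portfolio_id, asset_id, quantity
FROM positions_history
WHERE valid_from    <= %(valid_at)s    AND %(valid_at)s    < valid_to
  AND recorded_from <= %(recorded_at)s AND %(recorded_at)s < recorded_to
"""

def positions_as_of(cur, valid_at, recorded_at):
    """What did the portfolio look like at valid_at,
    as we understood it at recorded_at?"""
    cur.execute(AS_OF_SQL, {"valid_at": valid_at, "recorded_at": recorded_at})
    return cur.fetchall()
```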

The purpose of the Aladdin Data Cloud is to take all of the data that lives inside of Aladdin, along with all of that third-party, macroeconomic data. The key part is that we can then unify all of it -- we call that the language of the portfolio -- and deliver it to our clients.

The real enabler -- and the thing it has generated, I think -- is real data joy in our clients. We can deliver that to them where they are. They're sitting in their cloud provider of choice and their technology of choice, through Snowflake and through data sharing.

That in turn allowed us to bring this rich context into their data ecosystem, and they could start accessing it. But the big differentiator is it's not about taking the raw data and moving it to them. We understand that they can do that themselves.
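
[The mechanism behind delivering data "where the client is" is Snowflake secure data sharing: the consumer account mounts the share as a read-only database, and no data is physically copied. A minimal sketch, with hypothetical object and account names:]

```python
# Statements a provider account might run to publish a share.
# All object and account names are hypothetical.
SHARE_STATEMENTS = [
    "CREATE SHARE IF NOT EXISTS adc_share",
    "GRANT USAGE ON DATABASE aladdin_db TO SHARE adc_share",
    "GRANT USAGE ON SCHEMA aladdin_db.portfolio TO SHARE adc_share",
    "GRANT SELECT ON TABLE aladdin_db.portfolio.positions TO SHARE adc_share",
    # The consumer mounts the share as a read-only database; nothing moves.
    "ALTER SHARE adc_share ADD ACCOUNTS = client_org.client_account",
]

def publish_share(cur):
    for stmt in SHARE_STATEMENTS:
        cur.execute(stmt)
```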

If you take an example of something like ESG, so environmental, social, governance, how does that play into the makeup of my portfolio? Getting that data is very easy. Mapping that data and understanding that in the context of the investments that you've made is very, very hard. It requires you to do a lot of symbology processing.

How do I bring the data into a common language? We call it the language of the portfolio and what's happened in the last 10 to 15 years is that the complexity of the things that you can buy as an asset manager has massively increased. We have private markets investments, we have public market investments.

The benefit of Aladdin, and a lot of data clouds, is we bring all of that together into a single symbology, a single language, the language of the portfolio. And we deliver it to our clients in a way that they get value from.

And again, that very much is rooted in this notion of data products: understanding that a product is something that is designed, thoughtful, governed, and documented. It has metrics around it -- when the data is going to be available, what the quality control metrics are, what the warranties are. And then we deliver it with a specific business purpose in mind so that our clients can get value from it.
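
[One way to make that notion concrete is a small contract object that records an owner, a documented purpose, availability, and quality checks for each data product. This is an illustrative sketch, not Aladdin's actual metadata model:]

```python
from dataclasses import dataclass, field

@dataclass
class DataProductContract:
    name: str
    owner: str                     # accountable data product owner
    purpose: str                   # documented business purpose
    available_by_utc: str          # when the data will be available each day
    freshness_sla_minutes: int     # maximum acceptable staleness
    quality_checks: list[str] = field(default_factory=list)

# Hypothetical example of a published contract.
whole_portfolio = DataProductContract(
    name="whole_portfolio_view",
    owner="data-products@example.com",
    purpose="Unified public and private positions in a common symbology.",
    available_by_utc="06:00",
    freshness_sla_minutes=60,
    quality_checks=["row_count_within_bounds", "no_unmapped_symbols"],
)
```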

Dana Gardner: We talk a lot in our conversations about the challenges around ingestion and getting the data in good shape for analysis, but you've now taken the analysis and are driving it back out. So, a two-way street. And I'm thinking that other organizations are going to be attempting the same thing.

I want to know what challenges you had in becoming a two-way street in terms of distribution of the data and analytics in a way that your prospective addressable market can consume it.

David Woodhead: I think there are two things. The first thing -- and we realized this early on -- is that we had to invest in our governance framework: understanding who was governing the data, who the stewards were, and who the product managers and data product owners were that were going to shape this data.

That was a muscle that we didn't have organizationally, even before you start thinking about federating it out across the organization. But we'll park that one for now. So, the first thing is making a legitimate investment, organizationally and in terms of talent, in building out the data governance muscle.

The second thing is recognizing that the platform isn't just one technology. We love the partnership with Snowflake. It's a key part of our platform, but it's one part of a platform. The platform is very, very complex. We actually have a slide that we show our internal stakeholders with all the technologies that we use in our enterprise data platform.

And the purpose behind that is to demonstrate that it is a fairly sophisticated machine that you have to put together to deliver a data product. It requires observability, transformation, and governance. A data catalog obviously is going to be at the heart of it.

So, understanding that complexity is much more than simply solving for the data platform. It's building that technology muscle, but also building the governance muscle, so that you can deliver and scale high-quality, operable data products out to your clients.

Dana Gardner: I'm curious, Dave, what cultural shift did you have to make in going from being an internal supplier of technology and solutions to BlackRock to that external and commercial approach?

Do you have to think differently? Do you have to flex different muscles, if you will, in order to not just satisfy an internal constituency but productize that? Because, again, I think other people are going to be following your lead.

David Woodhead: To be clear, we have always distributed data to our clients. So that was something we had done before; we'd been sending files. We had less modern ways of sending data to our clients, but it wasn't new to us. Our clients very much depend on us to deliver the highest quality data. So that relationship -- that mindset that our product is data, and the data has to be right -- already existed.

I think what we underestimated was the complexity and the scale of the operations engine that you have to have. One of our biggest challenges is fleet management. You know, we operate hundreds of Snowflake accounts. We share data to hundreds of Snowflake accounts, and we're managing hundreds of vendors and tens of thousands of pipelines.

The scale at which we're bringing data in and sending data out has massively grown. That's something that we certainly have to focus on. And again, the complexity of that data has also grown. I think those are the areas where we've been really focused in terms of delivering data as a product to our clients.

Dana Gardner: It sounds like you brought in Snowflake, particularly for the Aladdin Data Cloud, but that it's expanded the model to manage the complexity and scale out to now touch more parts of your organization. Is that fair?

David Woodhead: A hundred percent. You know you're winning with a technology when you bring it into a large organization and you see that organic growth. People see they have a tool in the toolkit, and as long as we understand where it works really, really well, we're getting massive value from it across the estate.

We began our relationship with Snowflake as a vehicle for historicizing all of the data that lives inside of our Aladdin ecosystem and then delivering that to our clients. That was the product. It's now being used across our organization for a variety of use cases, internal and external, and that continues to grow.

And one of the really gratifying parts of being an early partner of Snowflake is that we have had the opportunity to go on the journey with them and see the newest technology. One of the things we're doing now is moving away from batch-oriented pipelines toward more real-time pipelines.

Technologies like dynamic tables that didn't exist, you know, 12 months ago are changing the way we think about building our pipelines. As the platform evolves, the opportunity for us to use it in more parts of our ecosystem is also presenting itself. So absolutely, it's become a core part of many different parts of our organizational process.
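
[A dynamic table declares a query plus a target lag, and Snowflake keeps the result incrementally refreshed instead of relying on a scheduled batch job. A minimal sketch, with hypothetical table, column, and warehouse names:]

```python
# Snowflake keeps this derived table refreshed within the target lag,
# replacing a hand-rolled batch pipeline. Names are hypothetical.
DYNAMIC_TABLE_SQL = """
CREATE OR REPLACE DYNAMIC TABLE portfolio_market_value
  TARGET_LAG = '5 minutes'   -- how stale the result is allowed to get
  WAREHOUSE = transform_wh
  AS
    SELECT p.portfolio_id,
           SUM(p.quantity * m.price) AS market_value
    FROM positions p
    JOIN market_prices m ON p.asset_id = m.asset_id
    GROUP BY p.portfolio_id
"""

def create_realtime_pipeline(cur):
    cur.execute(DYNAMIC_TABLE_SQL)
```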

Dana Gardner: What comes next for your organization? What are your priorities in terms of the types of projects in order to manage that addressable market? And then once you've captured it, keep adding more sophistication and service levels on top.

David Woodhead: AI is a focus for us, and of course building the data platform that's going to power differentiated AI outcomes for BlackRock and our clients down the road. Those are very much the key focus for us.

That means building on top of a foundation of higher data quality, and more data, along with the ability to bring in different types of data. So we are very much on a journey to what we call a whole portfolio view, and bringing in public and private markets data together.

We are on the journey, but we're not done with that journey. So that's very much a focus for us. And again, that means supporting different perspectives of data.

For those not familiar with private markets data, it's very diverse. A private market investment could be in a wind farm, a piece of infrastructure, or a data center -- a very different type of asset versus what I'm purchasing as an equity or something from the fixed-income universe.

Bringing these very different types of data together requires us to build a different kind of technology. And again, this is where some of Snowflake's newer technologies are really enabling us: the ability to support unstructured data sets, and the ability to store data in a vectorized database.

These are things that are powering the transformation into a whole portfolio view. I'd say it's those two things. AI is going to be at the heart of what we're focused on in the next five years, like a lot of our competitors. But that has to be driven off of a very varied, very wide data universe, which provides a whole portfolio view of public and private market data. We don't think anybody else will be able to provide that.

Dana Gardner: Interesting. It seems that for quite some time you've been productizing data and then you productized analytics and then you productized a sophisticated way of leveraging those analytics. And now you're teasing the idea, I think, of productizing AI itself. Is that fair?

David Woodhead: Yes, our AI strategy in my mind is really threefold. The first part is every part of our product has to have AI integrated in it from the start.

When a user comes to Aladdin, increasingly that interaction is going to be more refined, and it will be more natural language driven. Like with a lot of tools you're seeing today, AI will be woven into everything we do.

The second thing is that we have AI-as-a-platform. For anybody who's building on top of the Aladdin ecosystem, what are the AI capabilities that Aladdin will provide that will give them differentiated insight into their portfolio? How do I think as a builder and integrator on the platform?

And the third is, how do I power the builder? When I'm building on top of Aladdin and I need a copilot to help me build and solve the problem that I have, how does AI inside of the platform help me as an engineer, business engineer, or portfolio manager who's building or customizing or generating an artifact on top of the system?

Dana Gardner: A lot of AI-enabling of the generation and advancement of more AI?

David Woodhead: For sure. And also recognizing our role as the owners of the data platform. We know down the road -- and this is what I mean when I talk about growing our customer base -- we want to live in a world where any user can, with natural language, ask a question of any sophistication against our entire data corpus.

And we can, with explainability, answer that through AI. Now we're leveraging a lot of the AI capabilities that live inside of Snowflake and across our partner ecosystem.

And we're not there yet, but that's the direction and that's where we need to get to. Everything we're doing is underpinning AI outcomes that are going to be used by our clients.

And going back to the core concept, if we get this right, the outcome will be a very simple, natural interaction for our clients, and they'll never think about the massive complexity that sits underneath it.

Dana Gardner: The invisible AI enablement. It's very exciting and I hope I can talk with you in a couple years and see how that panned out.

Last question. Because you're the Head of Engineering, I have to ask: as you've been developing AI in association with this sophisticated data fabric and corpus, as you refer to it, are there skill sets in the engineering department -- among those you work with -- that you've found you'll need as an engineering organization to accomplish this?

David Woodhead: It's a great question. The answer is that it isn't so much a different skillset. It's a different way to think about problem solving.

We've reached the point where AI isn't just correcting our bad code anymore, or just generating test cases for us. We are at a place where we can think about AI as an engineer that sits next to me, that I can pair with on code. And I can actually outsource some tasks to that engineer.

So, what we are very serious about is how do we reskill and upskill our engineers so that we are leveraging the best of the AI capabilities that are going to help us move faster. And of course, what we're finding is that the people joining the organization from the grassroots, the grads that are coming in from college, they have those skills built in.

They're teaching us in some ways. I think, fundamentally, the skills that make you successful as an engineer haven't changed. The mindset shift is in how you work with AI -- how you make the AI more valuable by being a subject matter expert, by being a horizontal connector, by working through the process to generate a differentiated outcome.

That's the human superpower, and I don't see that going away. We will be upskilling ourselves so that we're leveraging the best of the tools and we can all move faster, while we ourselves provide that horizontal glue and push toward that high-value outcome for our clients. That's the mindset change.

Dana Gardner: Well, I’m afraid we'll have to leave it there. Thank you so much to our latest Data Cloud Podcast guest, Dave Woodhead, Managing Director and Head of Engineering for Aladdin Studio at BlackRock. We so much appreciate you sharing your thoughts, experience, and expertise. Thank you so much.

David Woodhead: Thanks, Dana. It's been a pleasure.

(Snowflake supported the creation of this discussion.)

[Listen to the discussion or watch it.]