At VentureBeat’s AI Impact Tour, Microsoft explores the risks and rewards of gen AI

Presented by Microsoft

VentureBeat’s AI Impact Tour just wound up its stop in New York City, welcoming enterprise AI leaders in an intimate, invitation-only cocktail salon hosted by Microsoft at the company’s Flatiron office. The topic: How organizations can balance the risks and rewards of AI applications, as well as the ethics and transparency required.

VentureBeat CEO Matt Marshall and senior writer Sharon Goldman welcomed Sarah Bird, global lead for responsible AI engineering at Microsoft; Dr. Ashley Beecy, medical director, AI operations at New York Presbyterian Hospital; and Dr. Promiti Dutta, head of analytics, technology and innovation, U.S. Personal Bank at Citi, to share insights into how generative AI has changed the way their organizations approach industry challenges.

On choosing impactful, sophisticated use cases

What’s really changed since generative AI exploded is “just how much more sophisticated people have become and their understanding of it,” Bird said. “Organizations have really demonstrated some of the best practices around the risk or reward trade-off for a particular use case.”

At New York Presbyterian, for instance, Beecy and her team are focused on weighing the risks against the rewards of generative AI, identifying the most crucial use cases and most urgent problems rather than applying AI for AI’s sake.

“I think about where there’s value and where there’s feasibility and risk, and where the use cases fall on that graph,” Beecy explained.

Patterns emerge, she said, and applications can be aimed at reducing provider burnout, improving clinical outcomes and patient experience, making back-end operations more efficient and reducing the administrative burden across the board.

At Citi, data has always been part of the enterprise’s strategy. Now far more data is readily available, along with orders of magnitude more compute, and that shift has coincided with the explosion of gen AI, Dutta said.

“The advent of gen AI was a big paradigm shift for us,” she said. “It actually put data and analytics in the forefront of everything. All of a sudden, everyone wanted to solve everything with gen AI. Not everything needs gen AI to be solved, but we could at least start having conversations around what data could do, and really instilling that culture of curiosity with data.”

It’s especially critical to ensure use cases align with internal policy, particularly in highly regulated industries like finance and healthcare, Bird said. That’s why Bird and her team review everything they ship to ensure that it follows best practices, has been adequately tested and adheres to the basic tenet of choosing the right applications of generative AI for the right problems.

“We partner with customers and world-class organizations to figure out the right use cases because we’re experts in the technology and what it can do and potential limitations, but they’re actually the experts in those domains,” she explained. “And so it’s really important for us to learn from each other in this.”

She pointed to the mixed portfolios that both New York Presbyterian and Citi have, which combine immediate-win applications that make an organization more productive with use cases that leverage proprietary data in a way that makes a real difference, both inside the organizations and for the people they directly affect, whether those are patients or consumers worried about their finances. For example, another Microsoft customer, H&R Block, just released an AI-powered application that helps users manage the complexity of income tax reporting and filing.

“It’s good to be going for that really big impact where it’s worth using this technology, but also getting your feet wet with things that are really going to make your organization more productive, your employees more successful,” Bird said. “This technology is about assisting people, so you want to co-design the technology with the user — make this particular role better, happier, more productive, have more information.”

On the challenges and limitations of generative AI

Hallucinations are a well-known drawback of generative AI, but the term itself is at odds with a responsible AI directive, Bird said, in part because “hallucination” can be defined in a variety of ways.

First of all, she explained, the term personifies AI, which can affect how developers and end users approach the technology from an ethical standpoint. In practical terms, it is often used to imply that gen AI invents misinformation, when what the model actually does is alter the information that was provided to it. Most gen AI applications are built with some form of retrieval-augmented generation, which supplies the AI with the relevant information to answer a question in real time. But even when the model is given a source of truth to work from, it can still make mistakes by adding information that doesn’t actually fit the context of the current query.

Microsoft has been actively working to eliminate these kinds of grounding errors, Bird added. A number of techniques can greatly improve how effective AI is, and the company hopes to see continued progress in what’s possible over the next year.
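To make the idea of a grounding error concrete, here is a minimal sketch, not Microsoft’s method. It assumes a crude word-overlap heuristic: any sentence in an answer whose content words mostly do not appear in the retrieved source is flagged for review. The function names, the stop-word list, the 0.6 threshold and the sample strings are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical stop-word list, kept tiny for this illustration.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "of", "to",
              "in", "and", "on", "for", "it", "by"}


def content_words(text: str) -> set[str]:
    """Lowercase alphanumeric tokens with common stop words removed."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def ungrounded_sentences(answer: str, source: str,
                         min_overlap: float = 0.6) -> list[str]:
    """Return answer sentences poorly covered by the source text.

    This is a lexical heuristic only (an assumption of this sketch);
    it cannot judge meaning, paraphrase or factual accuracy.
    """
    source_words = content_words(source)
    flagged = []
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = content_words(sentence)
        if not words:
            continue
        overlap = len(words & source_words) / len(words)
        if overlap < min_overlap:
            flagged.append(sentence)
    return flagged


# Made-up example strings, not drawn from any real application.
source = ("The filing deadline is April 15. "
          "Extensions must be requested by the deadline.")
answer = ("The filing deadline is April 15. "
          "Late filers always receive a full refund.")
print(ungrounded_sentences(answer, source))
# ['Late filers always receive a full refund.']
```

The second sentence is flagged because none of its content words appear in the source, which mirrors the case Bird describes: the model was handed the right material but added something that was not in it. A production system would need a far more capable check than word overlap.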

On the future of generative AI applications

It’s impossible to correctly predict the timeline for AI innovation, but iteration is what will keep driving use cases and applications forward, Bird said. For instance, Microsoft’s initial experimentation when partnering with OpenAI was all about testing the limitations of GPT-4, trying to nail down the right way to use the new technology in practice.

What the team discovered is that the technology can be used effectively for scoring or labeling data with close to human capability. That’s particularly important for responsible AI because one of the major challenges is reviewing interactions between AI assistants and humans in order to train the chatbots to respond appropriately. In the past, human reviewers rated those conversations; now Microsoft is able to use GPT-4.

This means Microsoft can continuously test for the most important aspects of a successful conversation — and also unlock a good amount of trust in the technology.
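One way to picture “close to human capability” is to measure how often an automated rater agrees with human reviewers on the same conversations. The sketch below is an illustration under stated assumptions, not a description of Microsoft’s pipeline: `agreement_rate`, the two labels and the sample conversations are hypothetical, and `stub_rater` is a keyword rule standing in for a real model call, so no actual model API is used.

```python
from typing import Callable, Sequence


def agreement_rate(conversations: Sequence[str],
                   human_labels: Sequence[str],
                   rater: Callable[[str], str]) -> float:
    """Fraction of conversations where the rater matches the human label."""
    if len(conversations) != len(human_labels):
        raise ValueError("conversations and human_labels differ in length")
    if not conversations:
        raise ValueError("at least one conversation is required")
    matches = sum(rater(c) == h for c, h in zip(conversations, human_labels))
    return matches / len(conversations)


def stub_rater(conversation: str) -> str:
    """Placeholder for a model-based rater (assumption of this sketch).

    A real rater would send the conversation and a scoring rubric to a
    language model; this stub only looks for one risky keyword.
    """
    if "guaranteed" in conversation.lower():
        return "inappropriate"
    return "appropriate"


# Made-up conversations and human labels, for illustration only.
conversations = [
    "User: Summarize my statement. Assistant: Here is a short summary.",
    "User: Should I buy this stock? Assistant: Yes, profit is guaranteed.",
    "User: What is my balance? Assistant: I cannot access accounts here.",
    "User: Is this fund safe? Assistant: Nothing can go wrong with it.",
]
human_labels = ["appropriate", "inappropriate",
                "appropriate", "inappropriate"]

print(agreement_rate(conversations, human_labels, stub_rater))  # 0.75
```

The stub misses the last conversation, so it agrees with the human labels on three of four. Tracking a number like this over time is one simple way to decide how much weight an automated rater deserves.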

“As we see this technology progress, we don’t know where we’re going to hit those breakthroughs that are meaningful and unlock the next wave,” Bird said. “So iteration is really important. Let’s try things. Let’s see what’s really working. Let’s try the next thing.”

The VentureBeat AI Impact Tour continues with the next two stops hosted by Microsoft in Boston and Atlanta.

VB Lab Insights content is created in collaboration with a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact [email protected].