The age of artificial intelligence (AI) is upon us. While one of its goals is to optimize businesses for better gains, organizations are increasingly faced with the challenge of managing the growing computational demands associated with AI workloads.
This has led to a shift in cloud computing strategies and decisions, with many organizations exploring hybrid cloud environments as a viable solution. A cloud trends report from Radix shows that 56% of large companies had adopted a hybrid cloud strategy as of 2024.
The hybrid cloud lets organizations use public clouds for bursting and resource-intensive tasks while keeping sensitive data and critical applications under their control in private clouds. But can it also help optimize growing AI deployments and ensure data security and compliance?
In this exclusive interview, we sat down with Shawn D’Souza, Global Hybrid Cloud Transformation Leader for IBM Consulting at IBM, to explore the intricate relationship between hybrid cloud and generative AI and how businesses can use this relationship to reap the full value of their AI investments while ensuring robust data security and regulatory compliance.
About Shawn D’Souza
Shawn D’Souza is the Global Hybrid Cloud Transformation Leader for IBM Consulting, responsible for leading teams globally that serve clients across the cloud lifecycle spanning Advisory, Migration, Application Modernization, Cloud Native Development, and Digital Product Engineering.
Prior to this role, Shawn served as Americas Hybrid Cloud Transformation Leader and was also the Global Chief Technology Officer for Hybrid Cloud Services in IBM Consulting. He has also held roles leading cloud practices for Financial Services and North America.
Shawn started his career at CSC developing software products for Insurance and Banking and holds a degree in Electronics Engineering from the National Institute of Technology, Surat, India.
Prior to IBM, Shawn served as the Chief Architect / Director of Enterprise Architecture and Software Engineering at Penn Mutual Life Insurance Company.
Key Takeaways
- The era of AI brings cloud challenges, with hybrid cloud adoption rising to balance gains with data security.
- IBM’s Shawn D’Souza highlights the opportunities and challenges of hybrid cloud and AI and emphasizes the importance of using intentional design to achieve the best ROI.
- Challenges include cloud sprawl, skill gaps, and security risks, and enterprises need to consider use cases, model constraints, reliability, costs, and scalability.
- Designing a hybrid cloud for AI means building a tailored infrastructure that is flexible enough to adapt to future trends, including Gen AI advancements.
Designing a Hybrid Cloud Environment for AI Projects
Q: There are different cloud computing arrangements, but it seems the hybrid cloud is the one that’s gained more traction in recent years. How has the hybrid cloud evolved in the age of AI?
A: In general, hybrid cloud has proven to be the right strategic choice for businesses, especially as they digitally transform. Compared to a single cloud alternative, a hybrid approach brings various benefits, including cost, performance, data security, and regulatory compliance.
As businesses proceed on their AI journeys and deploy multiple use cases in production across complex business workflows, they will need to leverage multiple AI models across various heterogeneous environments.
A hybrid cloud approach is the only viable way to deliver these solutions cost-effectively. It optimizes model tuning and testing as well as data integration and movement, and it supports the right governance and AI guardrails, such as explainability, transparency, and ethics.
However, realizing all these benefits requires a significant shift in how enterprises build their hybrid cloud environments — moving on from “hybrid by default” to a more intentional “hybrid by design” strategy.
Running a Hybrid Cloud by Default Can Ruin Your ROI
Q: You mentioned the concept of a “hybrid cloud by design” and how it differs from a “hybrid cloud by default.” Could you explain this in more detail, along with how it affects businesses using hybrid cloud infrastructure?
A: While the hybrid cloud has become the obvious choice today, we’re seeing that enterprises are not actually experiencing positive outcomes or financial benefits from their cloud investments.
Up until now, we’ve seen many organizations fall victim to a “hybrid by default” approach, where they’ve adopted cloud in pockets as they pursued “quick wins,” resulting in inconsistencies and siloed environments that drive up complexity and costs. A lack of intentionality in hybrid architecture can lead to low ROI in cloud programs and difficulty aligning technology decisions to business priorities. This is further exacerbated by increasing data volumes, skilling challenges, cybersecurity risks, and the adoption of Gen AI.
Hybrid by design means adopting an intentional approach to structuring your hybrid, multi-cloud IT estate to achieve key business priorities and maximize ROI. By shifting an organizational mindset and adopting a hybrid-by-design approach, IT leaders can recalibrate transformation programs to support business imperatives and better understand how generative AI can amplify the value of the cloud.
At the core, it’s all about purposefully building an architecture that’s ready for what’s to come — new business, changing customer demands, new processes, and skilling requirements.
In our experience, this intentional hybrid-by-design approach can help enterprises achieve up to 3.3x higher ROI compared to other options. This is typically achieved through:
- Drastic improvement in Business Acceleration, moving from siloed / misaligned business and IT roadmaps to executing programs with product-centricity.
- Substantially increased developer productivity, enabled through a consistent, Gen AI-enabled development experience regardless of cloud platform choices.
- Improved infrastructure cost efficiency and overall TCO [total cost of ownership] benefits through optimal workload placement and streamlined GitOps and Day 2 processes.
- Strengthened security posture through standardized “by design” practices.
- Tremendous acceleration in Gen AI adoption across both business and IT.
How to Determine Which AI Workloads Should Go On-Premises or in the Public Cloud
Q: What are the critical factors to consider when deciding which aspects of an AI project belong on-premises vs. in the public cloud?
A: It’s not a one-size-fits-all approach. There are several factors to consider when determining which AI project or workload belongs where. Here’s what we typically see in terms of critical factors:
- Enterprise Use Cases — Identifying the use case for your AI project is key. The most common enterprise use cases leveraging AI currently are Customer Service Workflows, HR, Sales and Marketing, Operations, Finance, Supply Chain Transformation, Code Conversions, and IT automation.
- Model Selection, Tuning, and Optimization — Depending on the use case, the appropriate models (language, image, voice, or action) might be constrained in where and in which environments they can run.
- Model Usage, Tuning, and Optimization: The landing zone choices will vary substantially depending on the use cases and type of training data needed.
- Reliability — This is a top consideration for AI applications, as they often involve critical tasks that require high accuracy and availability.
- Costs — Infrastructure spend can skyrocket through underutilized or unoptimized GPUs and models or through high data movement between public clouds.
- Performance, scalability, and replicability of use cases — These are typically limited by the heterogeneous nature of the environments and IT stacks these models run in.
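As a rough illustration of how factors like these might feed a placement decision, here is a minimal Python sketch. It is not IBM's methodology: the `Workload` fields, the `suggest_placement` function, and the rule order are all hypothetical simplifications that cover only data sensitivity, bursty compute, and data movement. A real assessment would also weigh model constraints, reliability, and cost.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an AI workload (illustrative fields only)."""
    name: str
    sensitive_data: bool      # regulated or confidential data is involved
    bursty_compute: bool      # short, compute-heavy spikes such as tuning runs
    data_on_premises: bool    # where the bulk of the data already lives


def suggest_placement(w: Workload) -> str:
    """Return 'on-premises' or 'public cloud' using simple, assumed rules."""
    # Assumption: sensitive data stays in the private environment.
    if w.sensitive_data:
        return "on-premises"
    # Assumption: bursty work benefits from elastic public cloud capacity.
    if w.bursty_compute:
        return "public cloud"
    # Otherwise keep compute close to the data to limit data movement.
    return "on-premises" if w.data_on_premises else "public cloud"


if __name__ == "__main__":
    examples = [
        Workload("claims-model-tuning", True, True, True),
        Workload("marketing-copy-generation", False, True, True),
    ]
    for w in examples:
        print(w.name, "->", suggest_placement(w))
```

The rule order encodes one possible priority (security first, then elasticity, then data locality). An organization with different priorities would reorder or replace these checks.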
How to Configure a Hybrid Cloud for AI Workloads
Q: Can you walk us through the process of designing a hybrid cloud infrastructure specifically tailored to the demands of AI workloads?
A: First, let me start by explaining the Gen AI adoption workflow for any typical enterprise use case realization:
- Most enterprise clients start with pilots with pre-defined use cases and pre-tuned Language, Image, Voice, or Action Models, to assess viability.
- Subsequently, these models get fine-tuned and trained using client-specific data sets.
- The IT stacks, infrastructure choices, and model optimization required vary substantially depending on the use cases, models selected, and type of training data needed.
Then, to deploy generative AI with enterprise-class guardrails, you need to design your environment with the following considerations in mind:
- Invest in infrastructure that’s tailored for training and inferencing. GPUs and TPUs can handle the large, complex models and high compute demands of AI.
- Focus on data placement and data integration for training and tuning.
- Embrace a hybrid cloud platform such as Red Hat OpenShift to deliver effective model replication across different environments and workflows and scale inferencing performance.
- Implement robust security protocols, conduct thorough compliance and risk assessments, and ensure alignment with all necessary regulations.
Hybrid Cloud Design Could Offer More Control for Data Security
Q: Data security is paramount for AI projects. How can a hybrid cloud environment be designed to ensure robust data protection?
A: Adopting a hybrid-by-design approach enables organizations to simplify and standardize IT security tools and processes, therefore creating a higher standard for cloud security measures.
By avoiding unnecessary data movement and increasing the speed of detecting and addressing cyberattacks, no matter where a workload runs, this intentional architecture ensures greater consistency of data and processes and ultimately makes organizations better equipped to manage shifting regulations.
Factors That May Affect ROI in Hybrid Cloud and How to Address Them
Q: Profit is central to business investments and AI is no exception. What are the common factors hindering ROI progress from cloud strategies, specifically when it comes to AI initiatives? And how can they be avoided?
A: Many factors contribute to businesses’ limited ROI from cloud investments.
Slow adoption, unrealized use cases, and unaddressed cloud sprawl can hinder cloud strategies. Increasing data volumes from AI, skilling challenges and gaps, and increases in cybersecurity risks make matters worse.
To put it succinctly, successful, revenue-driving AI deployments are not possible without an intentional hybrid cloud architecture.
With Gen AI specifically, many enterprises have started with pilots using publicly available models or without necessarily understanding the data or hybrid-by-design implications. As a result, we see things like:
- Infrastructure spending for these enterprises has skyrocketed due to underutilized or unoptimized GPUs and models and high data movement between public clouds. We’re seeing a 3.4x increase in generative AI spending in recent months alone.
- Scalability and replicability of use cases are limited by the heterogeneous nature of the environments and IT stacks these models run in. As a result, 90% of pilots will not move into production in the near future.
- Lastly, many enterprises are running into model and data governance challenges, hindering ROI further.
Trends That Could Shape Hybrid Cloud for AI in the Future
Q: Looking ahead, what emerging technologies or trends do you see influencing the way hybrid cloud environments are built for AI?
A: In addition to advancements in pure Gen AI capabilities, we are seeing a massive uptick in platforms geared toward increasing AI for business.
We are also seeing the emergence of industry- and domain-aligned vision-language-action models and agents to accelerate Gen AI adoption across enterprises.
Finally, we are seeing enterprises embrace open architectures, open source models, and open ecosystems (as opposed to proprietary alternatives) as the foundation of their Gen AI journey.