Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!
LLMOps is a thing—at least for now. Commentary by Luis Ceze, CEO and co-founder, OctoML
“For years, MLOps has operated under the assumption that traditional models are a bespoke thing when, in fact, models are just code + data. And throwing MLOps into the mix rather than just aligning with DevOps has only added unnecessary complexity.
But large language models (LLMs) are a different beast, hence the emergence of LLMOps. LLMs have nuances that call for a new set of best practices to build, deploy, and maintain these specialized models. LLMs are like cloud services, microservices and APIs: they need to be chained together to form an application. But how do you achieve this while making these models behave the way you want them to? This is giving rise to questions around prompt management and model quality.
We’re now having to rethink a lot of what we’ve taken for granted in software engineering. For now, LLMOps will have a role in helping navigate these new waters with prompt management, quality validation and monitoring, etc. It may have a long shelf life, or it may eventually get lumped under DevOps when we reach a future where the vast majority of applications are built with AI—and it’ll largely depend on how these models evolve in terms of behavior and reliability.”
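To make the chaining idea concrete, here is a minimal Python sketch of two model calls chained into one small application, with prompts kept in a versioned registry and a basic check on the output. Everything named here is an assumption for illustration: `fake_llm` is a hypothetical stand-in for a real model client, and the prompt texts, the `triage` function and the label set do not come from the commentary.

```python
from dataclasses import dataclass
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM client; returns canned replies."""
    if prompt.startswith("Summarize"):
        return "Customer reports login failures after the latest update."
    if prompt.startswith("Classify"):
        return "bug"
    return ""


@dataclass(frozen=True)
class PromptTemplate:
    name: str
    version: int
    text: str  # uses str.format placeholders

    def render(self, **kwargs: str) -> str:
        return self.text.format(**kwargs)


# Prompt management: templates live in one registry, each with a version number.
PROMPTS = {
    "summarize": PromptTemplate(
        "summarize", 2, "Summarize this ticket in one sentence:\n{ticket}"
    ),
    "classify": PromptTemplate(
        "classify", 1, "Classify this summary as bug, billing, or other:\n{summary}"
    ),
}

ALLOWED_LABELS = {"bug", "billing", "other"}  # assumed label set


def triage(ticket: str, llm: Callable[[str], str] = fake_llm) -> dict:
    """Chain two model calls: summarize the ticket, then classify the summary."""
    summary = llm(PROMPTS["summarize"].render(ticket=ticket)).strip()
    label = llm(PROMPTS["classify"].render(summary=summary)).strip().lower()
    # Quality validation: accept only a non-empty summary and a known label.
    valid = bool(summary) and label in ALLOWED_LABELS
    return {
        "summary": summary,
        "label": label if valid else "needs_review",
        "valid": valid,
        # Record which prompt versions produced this result, for monitoring.
        "prompt_versions": {name: p.version for name, p in PROMPTS.items()},
    }
```

Calling `triage("I can't log in since the update")` runs both steps against the stub. Swapping in a real client would mean passing a different `llm` callable, while the registry and the validation step stay the same.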
New insights from Enterprise Strategy Group’s State of DataOps. Commentary by Matt Wallace, CTO of Faction, Inc.
“New insights from Enterprise Strategy Group’s State of DataOps study underscore the complexities data practitioners grapple with, particularly around data access, accuracy, and integration. As data volume from various sources burgeons, it heightens these challenges, emphasizing the need for a data-centric approach in organizations.
Access to data emerges as a significant concern, especially in large enterprises where data practitioners are nearly twice as likely to find this challenging. This highlights the increasingly critical nature of collaborating with data to drive improved decision-making across the enterprise.
As advances in AI point to an ever-increasing reliance on data for strategic decision-making, the need for efficient data management strategies becomes paramount. By addressing access and integration challenges, organizations can unlock the full potential of their data, stimulating innovation and maintaining a competitive edge.”
Vertical, industry-specific AI solutions are the future. Commentary by Edwin Pahk, VP of Growth at Aquant
“As generative AI continues to mature, it branches into two distinct offerings: horizontal and vertical models. Horizontal models, exemplified by ChatGPT, offer a broad understanding of various industries and scenarios, while vertical models offer deep domain expertise and are tailored to specific use cases.
There are countless unique models and formulas that power these vertical solutions. But one thing’s for sure: While many perceive generative AI (in general) as the primary driver of change for organizations, it’s actually the combination of generative AI features with vertical, industry-specific AI models that yields a greater impact. These kinds of solutions are particularly important for organizations looking to leverage AI for niche, esoteric use cases.
The models that power industry-specific AI solutions are far more mature and sophisticated than any horizontal solution. For instance, the field service and contact center industries are seeing a massive improvement in key metrics by using vertical AI solutions. These promising solutions are trained with domain-specific data, including data from a company’s subject matter experts, a process that involves feeding knowledge from a company’s top technicians and contact center agents into the system’s dataset. Unlike most models that rely solely on historical data, this approach not only enhances the system’s intelligence but also democratizes the knowledge of a company’s top performers across the entire company, resulting in improved performance across the entire workforce and increased customer satisfaction.
Additionally, as data privacy concerns spike with horizontal AI models like ChatGPT, the vertical AI space is leading a path toward enhanced data protection. Vertical AI solutions maintain a narrower focus, typically sourcing data from consented channels and securely storing it within dedicated repositories. In contrast, horizontal models are encountering backlash due to their utilization of data (often without explicit consent) for model training and content generation. As demand for AI surges, vertical AI is emerging as a reliable and secure option, providing valuable insights without compromising the confidentiality of sensitive information.”
How automation will ultimately lead to better data governance – but only if used cautiously. Commentary by Davide Pelosi, Solutions Engineering Manager at Talend
“Data governance is a crucial element in any organization’s overall data strategy because it gives control over data to meet industry compliance and regional regulations while making relevant and trusted data available to the organization. This is becoming even more important in a world where AI and generative AI efforts are quickly progressing. Organizations need strong data governance to ensure they fully understand their data assets, feeding the right and trusted data to AI systems and avoiding the risk of exposing their unique IP and data to the external world.
The most complex operation of data governance is cataloging data assets, providing a good level of accessibility to the different stakeholders to answer questions like: where do we store private data? Where is this report data coming from? Can I trust this data? Automation can help create data rules and policies and apply those for more seamless regulation compliance and robust business outcomes. Automatic classification of the information is pivotal in a governance initiative. As many of the classification methods used today will become automated in the future, less technical knowledge will be required for the operational processes like stewardship activities for data cataloging, data lineage and more.
While applying automation can lessen the heavy workloads and grunt work of this process, organizations need quality data to train the algorithms to do this in the first place. As automation decreases the technical knowledge required for operational processes and AI implementations become more prominent, human intervention may need to be enforced to check, approve, and improve, reducing the bias behind the automation.
To ensure human-in-the-loop processes, organizations must implement human intervention and verification techniques that guarantee correct and thorough automated decisions. As organizations continue to adopt new technologies, such as generative AI, companies must treat automation as an addition to the human-established process rather than a necessity in the work environment.”
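As a rough illustration of automatic classification paired with human verification, the Python sketch below labels a data column by pattern matching and routes uncertain results to a reviewer. It is only a sketch under stated assumptions: the regular expressions, the 0.9 threshold and the names `classify_column` and `Classification` are hypothetical and do not describe any particular data catalog product.

```python
import re
from dataclasses import dataclass

# Assumed, deliberately simple patterns for two kinds of private data.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,}$"),
}


@dataclass
class Classification:
    column: str
    label: str
    confidence: float
    needs_review: bool  # True when a human steward should confirm the label


def classify_column(name: str, values: list[str], auto_threshold: float = 0.9) -> Classification:
    """Label a column by the pattern most of its non-empty values match."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return Classification(name, "unclassified", 0.0, False)

    best_label, best_score = "unclassified", 0.0
    for label, pattern in PATTERNS.items():
        score = sum(1 for v in non_empty if pattern.match(v)) / len(non_empty)
        if score > best_score:
            best_label, best_score = label, score

    # Human in the loop: partial matches are queued for review, not auto-applied.
    needs_review = 0.0 < best_score < auto_threshold
    return Classification(name, best_label, best_score, needs_review)
```

A column where every value looks like an email address is labeled automatically, while a column where only half the values match is flagged with `needs_review=True`, so that a steward checks and approves the label before it is applied.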
Google, OpenAI announce enterprise GenAI. Commentary by Arti Raman, CEO and founder, Titaniam
“Recent announcements from OpenAI and Google regarding enterprise GenAI offerings are a strong step towards addressing early concerns around data privacy and model training. However, one of the major questions organizations need to answer before they can design effective GenAI-enabled processes is exactly how and where employees are using GenAI in the first place.
Organizations need visibility into generative AI usage, along with a pan-organizational strategy, before they can truly benefit from additional security and privacy controls. As has been true in so many other domains, you cannot manage and secure what you cannot see.”
How is AI shaping skills development and learning? Commentary by Graham Glass, CEO of CYPHER Learning
“Two really positive trends are apparent – and remember, we’re just starting to scratch the surface.
One, AI enables personalized learning at scale without breaking the bank. A linear, broadcast-style learning program inevitably leaves some learners in the dust. Some bored, others flummoxed. An AI-enhanced modern learning platform can serve each user’s learning pace and style. It speeds or slows evaluative activities based on student performance; teachers track everyone’s progress more easily. Students like the process better and retain more. This serves the interests of neurodivergent, differently abled, or senior learners too.
Second trend: AI supports competency-based learning. It spares students irrelevant content. It connects each individual with material that instills particular skills they want or need. As learners start to enjoy upskilling rather than finding it tedious, AI can support a culture of continuous learning, which in a business setting lends productivity and competitive advantages. A business committed to competency-based learning might recruit “dream teams” from within – and launch fewer costly hunts for outside talent. And in a classroom? This approach can better align students with great job opportunities, helping to close the “skills gaps” vexing recruiters across a range of professions.”
How AI is Impacting the Job Market. Commentary by Alon Goren, CEO of AnswerRocket
“Forty-five percent of U.S. consumers express concern over AI’s ability to replace human jobs. There’s no denying that some jobs might be more impacted than others. McKinsey found that AI could automate up to 30% of hours currently worked across the U.S. by 2030. But this shouldn’t be viewed as a threat.
The rise of AI doesn’t render human expertise obsolete; rather, it amplifies it. Generative AI is a prime example of this, as it can help workers with their tedious rote tasks, like time-consuming data analysis, freeing them up to focus on more creative and strategic projects. Let’s take AI agents, for example. Humans can use them as their own personal assistants to identify data patterns and answer their critical business questions in seconds, as opposed to waiting days or weeks for analysis that is effectively stale on arrival. A brand marketer might use an AI agent to understand how a campaign is performing and exactly what they should do to optimize their metrics. A financial analyst could leverage an AI assistant to identify the core drivers behind their company’s revenue over time and forecast performance for the following year.
And as productivity and business benefits increase with generative AI, human employees become even more valuable. Their role can now evolve to serve as the supervisors and validators of AI output, as well as supplying valuable business and situational context that the AI simply doesn’t know. Keeping humans in the loop is essential for successful AI deployments, and as a result, we’re already seeing a surge in new job opportunities brought on by AI.”
How Code Llama will impact the AI market. Commentary by Adrien Treuille, Director of Product Management and Head of Streamlit at Snowflake
“Code generation models have proven to be a fundamental building block of the large language model (LLM) revolution. Meta’s release of Code Llama continues their tradition of driving groundbreaking, state-of-the-art language models, which are nearly indistinguishable from some of the most expensive proprietary models out there. This is a crucial ingredient for allowing the public, researchers, and other organizations to take advantage of this new technology in an open, safe, and extensible manner – further democratizing the power of generative AI for users.
With that democratization, Code Llama will be integral for next-generation intelligent apps capable of understanding natural language. There are three core use cases that I see today:
(i) Accelerating developer workflows. Coding is one of the most expensive and error-prone activities enterprises engage in. Because computer code, database queries, and business logic may be executed thousands of times per hour, bugs can be extremely costly and developers are in high demand. A model like Code Llama can be used to power next-gen LLM-accelerated coding experiences, such as automatic code completion (where the model guesses what the engineer or analyst will type next) and copilot experiences (where the model is used in a chatbot to help engineers or analysts translate natural language business problems into code).
(ii) Increasing the user base for powerful, code-based platforms. Automatic code completion and copilot experiences enable less technical users, such as analysts who may not speak SQL well or even at all, to leverage code-first interfaces. In turn, this democratizes access to powerful enterprise tools, so more users can derive valuable insights from their business.
(iii) Empowering more users to seamlessly build LLM-powered applications. Models like Code Llama can be embedded deep within business applications, which use natural language for any purpose. The user interface may not resemble a chatbot; it could be a customer support ticket lookup tool, for example. The code generation LLM, in this case Code Llama, would enable this application to automatically generate and execute code snippets based on natural language prompts behind the scenes.
We’re entering a world that democratizes technology previously reliant on coding knowledge, and empowers business users of all technical levels to create even more value — ultimately raising the technical capabilities across enterprises and freeing up developer time to focus on higher level projects that drive impact.”
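The third use case, an application that generates and runs code from a natural language prompt behind the scenes, can be sketched in Python as follows. This is a hedged illustration, not Code Llama's API: `fake_code_model` is a hypothetical stand-in that returns a canned query, the `tickets` table is invented for the demo, and the read-only check is one simple guard an application might place in front of generated SQL.

```python
import sqlite3
from typing import Callable


def fake_code_model(question: str) -> str:
    """Hypothetical stand-in for a code-generation model; returns canned SQL."""
    canned = {
        "how many open tickets are there?":
            "SELECT COUNT(*) FROM tickets WHERE status = 'open'",
    }
    return canned.get(question.strip().lower(), "")


def is_safe_select(sql: str) -> bool:
    """Allow only a single SELECT statement (a deliberately strict guard)."""
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped


def answer(conn: sqlite3.Connection, question: str,
           model: Callable[[str], str] = fake_code_model) -> list:
    """Generate SQL for the question, validate it, then execute it."""
    sql = model(question)
    if not is_safe_select(sql):
        raise ValueError(f"Refusing to run generated SQL: {sql!r}")
    return conn.execute(sql).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory ticket table with invented demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO tickets (status) VALUES (?)",
        [("open",), ("closed",), ("open",)],
    )
    return conn
```

With the demo data, `answer(demo_connection(), "How many open tickets are there?")` returns `[(2,)]`. The user never sees the SQL, and anything the model produces that is not a single SELECT is rejected before it reaches the database.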
Addressing the high cost and supply chain issues tied to GPUs. Commentary by Alex Babin, CEO of ZERO Systems
“There is little doubt that GPUs have become essential in the age of AI. With demand showing no signs of slowing down, high-performance computing for LLMs is becoming hard to come by and very expensive. While GPUs have become indispensable for model training, CPUs are an overlooked alternative for companies that need a GPU substitute for inferencing. It’s important to note that the choice between CPUs and GPUs is not an either-or one. In many cases, using expensive GPUs to run models trained for specific tasks is like putting a nuclear reactor on a scooter. A lot of enterprise clients are worried about runaway costs of running models and are looking for alternatives to reduce cost of ownership. The bottom line is that CPUs cannot always replace GPUs for training LLMs, but they can effectively reduce costs when running pre-trained models for specific tasks. This move will democratize the GenAI market in terms of compute costs, but it will also require CPU makers to develop new competencies in partnerships and R&D, as well as developer support, so that the new hardware is easy to use and, ideally, remains compatible with existing AI paradigms.”
How AI Could Reshape the Classroom This School Year. Commentary by Jessica Reeves, COO, Anaconda
“Generative AI (GenAI) will radically change the trajectory of education. While I believe this is ultimately a good thing, there are important considerations that educators and school administrators must weigh. In the short term, more educators will need to be trained on how best to prompt GenAI tools, both to automate tasks like creating lesson materials and to deal with thorny issues like cheating. Students will also need training on how to use AI as a self-learning tool for tapping into their innate curiosity, and on how to spot chatbot hallucinations and find reliable sources of information. However, students, educators, and school systems must understand what is at the center of these large language models: data.
Long term, school districts and administrators must contend with the thorny data challenges that are already emerging, whether that’s preventing biased data from corrupting GenAI applications used by students or protecting against data leakage and exfiltration. The emergent data challenges that the broader industry faces will only be magnified in school systems and require trusted data partners to help bring governance to these powerful tools.
While there are challenges, the promise of GenAI in the classroom can’t be overstated. Educators will be empowered to do what they’re best at – teaching and inspiring kids. Students will be able to explore their own latent interests and creativity. Now, school systems and administrators must learn to harness this change and deliver a secure, transformative experience.”