Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!
Harnessing the power of big data for online businesses. Commentary by Reshma Iyer, Director of Product Marketing and E-Commerce at Algolia
“Most online businesses realize that they need to take a more data-oriented approach to their strategy. Even if they don’t start out this way, they are realizing the limitations of existing technology, and that a backward-looking view no longer helps them deliver the personalized end-user experience online consumers want today. Real-time data and the insights gained from it, combined with advanced AI technology tested through various POCs, are proving highly effective in automating a number of areas while freeing up critical resource time for business priorities.
Whether the task is shopping for products, consuming digital media content, self-service onboarding, troubleshooting technical issues, or designing automated marketing funnel activities, data and AI are taking center stage. Needless to say, being able to apply AI models to data that’s been processed increases the value of big data and analytics incrementally.”
The dangers of data bias in AI and how data management can help. Commentary by Lewis Wynne-Jones, VP of Product at ThinkData Works
“The novelty of ChatGPT has stirred up a lot of excitement in boardrooms everywhere, and the general availability of large language models has reignited interest in general-purpose AI to support business insight. ChatGPT is not, however, a new technology; what is new, and what’s driven the excitement to a fever pitch, is that until now LLMs were highly specialized tools used by enterprises that were mature enough to dedicate time and resources to operationalizing these models effectively. With its intuitive user interface and low barrier to entry, ChatGPT has generalized this technology for anyone regardless of their technical expertise.
The problem with democratizing this technology is that LLMs are very good at whipping up content, but do so without a lot of controls in place. What this means is that it’s become very easy to “use” Generative AI, but without knowing precisely what’s going into the model we’re using, we risk opening ourselves to bias, error, and false signals. The corpus of data used to train these LLMs is not perfect. Far from it – it’s the public internet.
ChatGPT and other LLMs will be the most egregious examples of this, but any company looking to add machine learning, natural language processing, or any kind of artificial intelligence into its operations should keep in mind that its outputs will only ever be as good as the training data provided. Bias occurs when the training data isn’t properly indexed, representational, and complete. In other words, algorithms that are trained on historical data will only ever provide results synonymous with historical trends. The solution is not always more data. For example, a model that analyzes successful loan applications to automate the loan approval process sounds like a good idea, but if it’s using a historical dataset from 1900-present the data will favour a specific demographic at the exclusion of others.
Bias occurs because we use historical data to solve modern problems and the data we use to train these models is not representational. What does representational data look like? It’s diverse, weighted, and indexed in a way that lets analysts examine the model to understand how results were generated. It’s an attractive idea to think of AI as a black box into which we throw questions and receive answers, but we still need to be able to open this box to analyze the machinery. If we can’t, we should not trust the answers we’re given.”
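As an editorial illustration of the loan example above (not part of the commentary itself), the following minimal sketch shows one way an analyst might check whether a historical dataset favors one group before using it for training. The records, group labels, and function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy records of (group, approved). Invented for illustration only.
HISTORICAL_LOANS = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]


def approval_rates(records):
    """Return the share of approved applications for each group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in records:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Lowest group approval rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so no group is favored
    return min(rates.values()) / highest


rates = approval_rates(HISTORICAL_LOANS)
ratio = disparity_ratio(rates)
print(rates, round(ratio, 2))
```

A ratio well below 1.0 suggests that a model trained on these records would likely reproduce the historical skew, which is the kind of signal that examining the data up front can surface.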
Responsible AI commentary on Schumer’s AI Insight Forum. Commentary by Triveni Gandhi, Responsible AI Lead at AI leader Dataiku
“With Chuck Schumer’s nine-part AI bootcamp … with Zuckerberg, Gates and Musk, it’s vital that Congress consults a complete ecosystem of AI innovators, not just goliaths. The AI ecosystem is massive, and is made up of many different organizations of all sizes. Congress has a checkered history of favoring the incumbents with regulations and AI is too important to lock out participation in these critical conversations. For a comprehensive AI Governance strategy that still encourages healthy competition, involve all of the players – especially the middle of the chain and end users as well. Including the middle layer providers in these conversations is equally important, as the major model providers are only one part of the equation. Ultimately, it will be organizations looking to implement and enable access to new technology that will need to abide by any regulations – it would be a disservice to not include those voices and perspectives in these conversations.”
U.S. Doesn’t Need to Lead AI Regulations. Commentary by Ivan Ostojic, Chief Business Officer of Infobip
“The tortoise may win the race when it comes to AI regulation. Technology over-regulation and under-regulation both pose serious risks. That’s why it might be best that the U.S. not take the lead on this one. Instead, the U.S. should strategize and collaborate with tech leaders to devise the best path forward.”
The Future of Software Development: Balancing AI’s Innovation with Expertise. Commentary by Kevin Kirkwood, Deputy CISO at LogRhythm
“The White House recently unveiled a competition aimed at leveraging artificial intelligence (AI) to bolster cybersecurity efforts. The competition’s primary objective is to incentivize cybersecurity researchers to employ AI technology in the identification and remediation of software vulnerabilities, with a particular emphasis on open-source software. The initiative underscores the growing recognition of AI’s pivotal role in fortifying digital defenses and enhancing the security of software systems.
While AI holds tremendous potential, it is important to recognize its limitations. AI-driven solutions can expedite the development process, improve code quality and enhance security. However, developers should proceed with caution and strike a balance by leveraging AI capabilities with human oversight. One challenge with AI is the risk of false positives and false negatives in vulnerability detection, which could lead to unnecessary disruptions or missed threats. Additionally, AI systems require extensive training data, raising concerns about the quality and representativeness of such datasets. Therefore, developers must maintain active oversight, continually refine AI algorithms, and prioritize ethical considerations to ensure that AI-driven software development remains safe, reliable and resilient against emerging threats.”
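As an editorial illustration of the false positive and false negative concern raised above (not part of the commentary itself), this minimal sketch compares a detector's flags with human-reviewed findings. The flag lists and function names are hypothetical; precision and recall are standard metrics, used here as one reasonable way to quantify the two kinds of error.

```python
def confusion_counts(flagged, confirmed):
    """Count true/false positives and negatives for paired boolean lists."""
    tp = fp = fn = tn = 0
    for was_flagged, is_real in zip(flagged, confirmed):
        if was_flagged and is_real:
            tp += 1
        elif was_flagged and not is_real:
            fp += 1  # false positive: an unnecessary disruption
        elif not was_flagged and is_real:
            fn += 1  # false negative: a missed threat
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision is the share of flags that were real; recall is the share of real issues caught."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical results: what the AI tool flagged vs. what reviewers confirmed.
flagged = [True, True, False, True, False, False]
confirmed = [True, False, False, True, True, False]

tp, fp, fn, tn = confusion_counts(flagged, confirmed)
precision, recall = precision_recall(tp, fp, fn)
print(tp, fp, fn, tn, round(precision, 2), round(recall, 2))
```

Tracking both numbers over time is one way for human reviewers to keep oversight of an AI-assisted scanner, since tuning to reduce one kind of error often increases the other.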
FinOps solutions applied to the cloud with AI. Commentary by Erik Carlin, Co-Founder & CPO at ProsperOps
“AI is being applied to transform many different domains across cloud services. As cloud use increases, cost optimization tasks become difficult or impossible for teams to manage — resulting in high bills and wasted spend. One way enterprise leaders can get cloud costs under control is through FinOps solutions that apply AI to take complex cost optimization tasks and transform them into automated, optimized savings outcomes. Ultimately, cloud adopters get more savings, with less risk and less effort.”
Maximizing ROI with AI: Unlocking Time Savings and Automation in Enterprise Training. Commentary by John Peebles, CEO of Administrate
“Efficiency and effectiveness are paramount in today’s rapidly evolving business landscape. AI-powered technology will play a vital role in realizing these objectives, especially in the context of enterprise training. Beyond the immediate gains of time savings, AI introduces a multifaceted spectrum of advantages within enterprises. The following are five key ways AI transforms the training landscape to maximize ROI.
AI-powered technology automates tasks that once consumed valuable time. For example, training departments at larger companies are responsible for training thousands of people and often grapple with complex scheduling. AI tools offer a promising solution, streamlining operations and simplifying that scheduling.
AI-driven solutions provide an opportunity for optimizing operational efficiency by automating resource allocation, enabling businesses to channel their efforts toward more strategic endeavors. This is crucial in training where efficient resource management directly impacts program success.
Additionally, AI helps to elevate decision-making. It can harness data-driven insights, fostering a culture of measurement, analysis, and evidence-based decision-making in training. By making data more accessible and actionable, AI empowers trainers and organizations to make informed choices for improved training outcomes.
Companies can also use AI to enhance ROI through lean processes. Its role in unlocking time savings directly contributes to ROI optimization in training. By embracing AI-powered solutions, training departments fortify their competitive edge, delivering better results while maximizing the return on training investments.
Implementing AI allows organizations to accelerate learning content creation and overcome logistical challenges. While generative AI tools like ChatGPT can help accelerate learning content production, ensuring consistency and adaptability to varying learning styles, other AI-powered tools have the potential to solve complex logistical hurdles and streamline internal data that may otherwise be buried within an organization.”
Hallucinations are part of the ‘magic’ of GenAI. Commentary by Victor Botev, CTO and Co-founder of Iris.ai
“As AI technologies become further integrated into daily life, we are seeing growing scrutiny over models hallucinating. While hallucination allows AI tools to fill in gaps and make predictions, it can potentially lead to issues with accuracy, reinforcing biases and creating legal liability.
Despite any speculation on their value, reducing hallucinations needs to be a top priority across the AI industry. Increased transparency about system capabilities and limitations, rigorous testing protocols, and emphasis on explainability can all help minimize this problem. We also need to carefully select quality metrics that measure factuality, biases, and coherence.
Ultimately, minimizing hallucinations is crucial for building user trust and delivering accountable AI systems that provide real value. The onus is on AI developers and companies to address these challenges through research and enhanced machine-learning practices. With diligence and collaboration, the AI community can develop systems that augment human intelligence without unwanted distortions or blind spots.”
Want to build your own AI models? Consider this first. Commentary by Berk Birand, co-founder and CEO at Fero Labs
“With the majority of enterprises using AI systems in some capacity, many have considered the idea of building their own versus licensing from a vendor. Their belief is that an in-house build would better secure their data, provide a proprietary edge over competitors, and potentially lower costs. However, the DIY reality does not always match the dream.
For example, we recently partnered with a large steel manufacturer that had spent the past two years building its own in-house development team and internal solution. Although they’re a tech-savvy enterprise, it didn’t take long to realize that building their own solution was an enormous challenge and a costly long-term commitment. Enterprises must weigh the true upfront and long-term costs to properly staff, design, build, and maintain an in-house solution versus licensing from an external vendor.
Most build challenges are related to talent, time, and costs. As most manufacturing companies do not have experts in software or machine learning, hiring is complicated. Recruiting in the tech space is complex and highly competitive. Hiring the right technical talent is often the slowest and most costly hurdle to launching an in-house build. The time and skill set required of the recruitment team are very specialized in this space and fall outside a typical enterprise recruiter’s network.
The costs and time commitment of building your own solution are significant. Costs range far beyond initial coding developments, which can take months. Once it’s developed, ongoing and constant maintenance is also needed. Even with the best team, new hires will be forced to rely on open-source technology, which is generic and not designed for specific requirements.
Instead, enterprises should lean on external solutions. They can alleviate many of the aforementioned challenges. Break even will occur faster with a well-structured licensed solution. It greatly simplifies the maintenance process, too. When choosing a solution provider, be sure to seek a vendor that trains their models only on your proprietary data and doesn’t share learning from one customer to the next. This will ensure your data is safe and the solution is customized to your unique needs.”
A tiered approach to investing in LLM “copilots.” Commentary by Vaibhav Nivargi, co-founder and CTO of Moveworks
“The magic of large language model (LLM)-based “copilots” (ChatGPT, GitHub Copilot, or Midjourney, for example) is that they use language, a highly intuitive user experience, to dramatically improve productivity for businesses. These copilots are emerging every day for a variety of different use cases, like copywriting, automated IT and HR support, code writing, and a myriad of others. But the power of LLMs can feel out of reach for many who aren’t as familiar with them.
The truth is, LLMs can benefit any business of any size — the current stage of your business and the enterprise problem you’re trying to solve will ultimately determine which type, or “tier,” of copilot investment makes the most sense for you.
Are you searching for a low-cost solution that can handle actions like creating copy for a website or a sales call analysis? If so, a single API (tier 1 copilot) can deliver great results with far fewer engineering resources than other copilots. Or, do you need a more comprehensive solution that will solve massive, complex problems across your entire organization (tier 4 copilot)? This requires the highest-tier copilot strategy, which involves a combination of multiple LLMs, a handful of proprietary models, enterprise-grade security, permissions, and several other considerations to be successful.
There are numerous tiers of LLM copilots available that can simplify, streamline, and uplevel your business, but one thing is clear – using LLMs will absolutely benefit your business. Deciding which tier you’re willing and able to invest in will ultimately decide just how much it will benefit your business in the long run.”