Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!
AI Regulation. Commentary by Frederik Mennes, Director of Product Management & Business Strategy at OneSpan
“The regulation of generative AI is necessary to prevent potential harm stemming from malicious applications, such as hate speech, targeted harassment, and disinformation. Although these challenges are not new, generative AI has significantly facilitated and accelerated their execution.
Companies should actively oversee the input data used for training generative AI models. Human reviewers, for instance, can eliminate images containing graphic violence. Tech companies should also offer generative AI as an online service, such as an API, to allow for the incorporation of safeguards, such as verifying input data prior to feeding it into the engine or reviewing the output before presenting it to users.
Additionally, companies must consistently monitor and control user behavior. One way to do this is by establishing limitations on user conduct through clear Terms of Service. For instance, OpenAI explicitly states that its tools should not be employed to generate specific categories of images and text. Furthermore, generative AI companies should employ algorithmic tools that identify potential malicious or prohibited usage. Repeat offenders can then be suspended accordingly.
While these steps can help manage risks, it is crucial to acknowledge that regulation and technical controls have inherent limitations. Motivated malicious actors are likely to seek ways to circumvent these measures, so upholding the integrity and safety of generative AI will be a constant effort in 2023 and beyond.”
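The service-level safeguards Mennes describes (screening input before it reaches the engine and reviewing output before it is shown to users) can be sketched as a thin wrapper around a generation call. Everything here is a hypothetical illustration, not any vendor’s actual API: the `generate()` stub and the keyword blocklist are stand-ins for a real model endpoint and a real content policy.

```python
# Hypothetical sketch of input/output moderation around a generative model.
# The generate() stub and BLOCKED_TERMS are illustrative placeholders only.

BLOCKED_TERMS = {"hate speech", "harassment"}  # stand-in for a real policy engine

def is_allowed(text: str) -> bool:
    """Screen text against a (toy) policy, before or after generation."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

def generate(prompt: str) -> str:
    """Placeholder for the underlying generative model call."""
    return f"Model response to: {prompt}"

def moderated_generate(prompt: str) -> str:
    # Safeguard 1: verify input data before feeding it into the engine.
    if not is_allowed(prompt):
        return "Request refused: prompt violates usage policy."
    output = generate(prompt)
    # Safeguard 2: review the output before presenting it to users.
    if not is_allowed(output):
        return "Response withheld: output violates usage policy."
    return output

print(moderated_generate("Write a friendly greeting."))
print(moderated_generate("Write some hate speech."))
```

In a real deployment, offering the model only as an online service (as the commentary suggests) is what makes this choke point possible: every request and response passes through the provider’s checks rather than running unmoderated on a user’s own hardware.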
What’s next for AI regulation? Commentary by Dr. Srinivas Mukkamala, Chief Product Officer, Ivanti
“Properly designed federal regulation acts as an enabler—not an inhibitor—to unlocking the magnificent power of AI to benefit us all. However, the power of the technology is not without its potential drawbacks. Generative AI like ChatGPT is coming to the public square and gaining significant momentum, which introduces the possibility of misinformation being created and spread at machine speed. Furthermore, the wider the use of AI spreads, the more prominent the risk of perpetuating data, human, and algorithmic bias. We need to evangelize the importance of responsible AI to practitioners and work collaboratively with policymakers to construct proper guardrails for the industry.”
Navigate your data landscape with data mapping. Commentary by Rachael Ormiston, Head of Privacy at Osano
“From proprietary company and customer information to financial numbers, most organizations are drowning in data. To successfully manage and secure all that data, privacy professionals are turning to data mapping. This process of connecting one source’s data field to another source’s data field allows you to understand and contextualize your entire data landscape by identifying what data you have, why you have it, where it’s coming from and who has access to it.
A comprehensive overview of your data landscape facilitates data management and analysis, allowing you to glean insights and help with decision-making. Data mapping also makes it easier to ensure you’re complying with data privacy regulations, laws and security requirements by giving you better visibility to assess the risks associated with the data you have. As privacy professionals continue improving the consistency of how they operationalize their data privacy programs, data mapping will be invaluable for managing data across its entire lifecycle.”
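The field-to-field connection Ormiston describes can be illustrated with a minimal sketch. The schemas, field names, and catalog metadata below are hypothetical examples, not part of any specific privacy tool:

```python
# Hypothetical sketch of data mapping: connecting one source's data fields
# to another schema's fields, while recording where each field comes from,
# why it is held, and who has access. All names are illustrative.

FIELD_MAP = {
    # target field      source field
    "customer_email": "email_addr",
    "customer_name": "full_name",
}

CATALOG = {
    # per-field metadata answering: why do we have it, and who sees it?
    "customer_email": {"source": "crm", "purpose": "billing", "access": ["finance"]},
    "customer_name": {"source": "crm", "purpose": "support", "access": ["support", "finance"]},
}

def map_record(source_record: dict) -> dict:
    """Translate a source record into the target schema via FIELD_MAP."""
    return {target: source_record[source]
            for target, source in FIELD_MAP.items()
            if source in source_record}

record = {"email_addr": "jane@example.com", "full_name": "Jane Doe"}
print(map_record(record))
```

Even a toy mapping like this captures the four questions from the commentary: what data you have (`FIELD_MAP`), why you have it (`purpose`), where it comes from (`source`), and who has access (`access`).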
ChatGPT owner nears record 1bn unique users per month. Commentary by Stefan Katanic, CEO of Veza Digital
“The ChatGPT phenomenon spread like wildfire at the end of 2022, and we expect it to soon become the fastest website ever to reach 1 billion monthly active users. This is indicative of a clear public interest in AI-powered solutions, which legislators are rushing to regulate before it spirals into uncharted territory, such as artwork copyright and ethical challenges. Debates about AI are divisive, but one thing we can probably all agree on is that AI is no longer the future – it is the present.
We believe that AI will play a big role in over 50% of businesses in the next five years. As such, we are looking to embrace these technological advancements in our daily operations, as well as in the strategic geo-positioning of our company.”
Addressing the Security Implications and Concerns of ChatGPT. Commentary by Jerald Dawkins, Ph.D., Chief Technology Officer, CISO Global
“It’s true, ChatGPT comes with risks – just like all new technology does. Do we embrace the fear and shut down workplace innovation? If so, we also lose the ability to help our teams work better, faster. If we want to enable people to leverage technology to work smarter, what we need to do is understand how these tools work, think through their use cases, define risks, and put some protections in place that allow them to be used wisely. Start by understanding that ChatGPT is designed to process vast amounts of data quickly, and that it uses all the data you give it as part of its cache. Now, think about problems people might want to solve with quick, accurate search functionality. DevOps teams might want suggestions for their code (see Samsung). Your IT team might want help creating a software rollout plan that doesn’t miss steps. You get the idea. Ask yourself – is the information my teams would want to feed into this tool something that can be shared publicly? Is the information coming out trustworthy? How can I ensure we allow for cases where the answer is “yes”, and how do we mitigate the ones where the answer is “no”?
Now let’s think about the risks of using a “large language model” open AI tool. A cyber attacker halfway around the world could use chat AI to write better phishing emails. Executives give public speeches and publish articles online regularly, leaving transcripts and records of their typical wording style, tone, and more. I asked ChatGPT to write me an email request for an invoice in the tone of JFK, Sr., and the results were shockingly accurate. So, without any social engineering or language lessons, a bad actor could create a pretty convincing email that sounds like your executive, requesting that teams take an action or click a malicious link. In another use case, disinformation could be fed into the tool to train it on biased or malicious data, increasing the risk of untrustworthy outputs. My recommendation for companies evaluating the tool is not to make blanket policies that disallow ChatGPT, but to proactively review and understand the tool and your users, build security and privacy controls around sensitive corporate data, and make sure people know how to validate the answers they are getting. Then you have the benefits of AI in the workplace, but you’ve mitigated the risk.”
Humans need to be hands on every step of the way when it comes to AI. Commentary by Hanjo Kim, SVP of Global Strategy and Head of Medicinal Chemistry at Standigm
“It is vital to recognize the importance of human intervention in using AI tools like ChatGPT. We use the phrase “human-in-the-loop” to describe an automated system dependent on human input and supervision. I think this concept is even more critical for narrow AI models such as generative chemistry models.
AI tools are only as powerful as the data that feeds them and the humans that guide them, and when it comes to acquiring the best data possible in the proper contexts, humans are still very much the experts. Take our work at Standigm, for example, where we combine the expertise of science and tech professionals with powerful algorithms to help them sort through millions of pieces of data that help generate new drug compound designs. Because data for generative chemistry models will never be as plentiful as it is for language or image models, this situation will last a long time.”
From Big to Small: The Shift in Data Management for AI/ML. Commentary by Justin Cobbett, Product Marketing Manager, Akamai
“AI has the potential to revolutionize industries across the board, but to make these technologies work, they need data – and lots of it. The traditional management of Big Data through building and organizing large, complex datasets can be both challenging and resource-intensive to maintain on an ongoing basis. Technology has advanced to help organizations better handle the influx of massive sets of unstructured data, but turning those huge blocks of information into insights is easier said than done. Data has become the de facto currency for technology, but the often ridiculous volume and variety of Big Data means organizations can end up with significant resource expenditure far beyond what is required to create a practical model.
By contrast, companies are starting to turn to “Small and Wide” data management – relying on a greater variety of data sources to find correlations, rather than focusing on combing through large, consolidated mountains of data. This reduces the need for extensive computing resources, instead highlighting the importance of data variety to ensure that insights are representative. Because it is quicker and easier to manage, small and wide data management adapts faster to changing trends and behaviors – making it a winning strategy for managing the quickly evolving large language models that power modern generative AI. Additionally, training AI models with smaller, more diverse datasets improves accuracy.
While large datasets still have their uses, optimizing models to make the best use of data and focusing on data variety is becoming just as important. This transition opens up opportunities for organizations to derive meaningful insights and stay ahead in data-driven decision-making.”
Microsoft Copilot – Balancing AI Capabilities and Human Ingenuity. Commentary by Nick White, Data Strategy Director, at Kin + Carta
“Copilot’s integration of AI capabilities directly into Microsoft’s most popular applications could be a business game-changer. It’s consistent with Microsoft’s role in the digital age up until today, making work easier with technology as it did with personal computers and the Office suite – bringing those tools out of specialised environments and into people’s day-to-day lives. But like any tool, it must be used properly.
Copilot relies heavily on human ingenuity and common sense to prevent misinformation and the exposure of sensitive information. Businesses must therefore ensure they have adequate policies and education to avoid these pitfalls. And before even using the technology, ask some questions: ‘Is the platform right for the use?’ ‘What is it good at and how does it do it?’ ‘Is it ethical to use in this situation?’
Like any tool, the value you gain from using it is directly linked to how you wield it. Crudely, Copilot is like a hammer and nail; if you use it carelessly, you can hurt yourself (and indeed others), but if used sensibly, you can build something truly impressive.”
Realizing AI ROI starts with more meaningful integrations. Commentary by Daniel Fallman, Founder and CEO at Mindbreeze
“AI has opened the door for many innovative customer experience approaches. While chatbots were, in many ways, a first try aimed at enhancing the way consumers communicate with businesses, AI is doing a lot more than making them smarter. Additionally, AI is offering, and will continue to offer, human support workers tools to assist customers still seeking resolutions after chatbot interactions.
AI-powered systems allow customer service professionals to find relevant information and help the customer at a faster rate than we have seen before. With machine learning techniques like “Data Extraction” and “Natural Language Question Answering (NLQA),” personalized responses can be generated and relayed back to the customer for highly efficient troubleshooting. Cross-departmental data connection will reinvent the customer support process by allowing workers to see where their information came from. AI is the foundation of “virtual personal assistants,” which will become more widespread for quick-solution finding, less busy work, and, most importantly, paving the way for a more satisfied customer base.”
Improving data wins the generative AI race. Commentary by Gordon, SVP of Strategy at Mendix
“Generative AI, with its foundational models comprising massive amounts of data and billions of parameters, is driving a similarly massive level of interest due to its ability to combine and re-contextualize existing content or knowledge to create something startlingly compelling and possibly useful. In the right hands and with the right prompts, it can accelerate the design and development of everything from marketing newsletters to molecules, chemical compounds, and spacecraft.
But it’s important to remember that generative AI doesn’t understand, learn, or reason. It only synthesizes. Reasoning requires other forms of analytics and AI – and indeed people. Infusing human judgment, for example through the validation of AI-generated recommendations, creates the necessary feedback loop to improve AI. It’s through pairing AI with people that we can safely augment humanity to create a more brilliant future, while automating the routine and mundane.”
Addressing healthcare chatbot failure. Commentary by Ivan Ostojić, Chief Business Officer, Infobip
“Utilizing untested AI technology, in this case chatbots, is risky when it is trusted to communicate with patients and provide medical advice. There needs to be a specific approval process and clear safeguards in place for the use of this type of technology to ensure AI chatbots can’t intervene without human supervision. It is also critical to establish workflows for these chatbots in which highly sensitive topics are brought to the attention of humans for review and approval before information is shared with a patient.
AI algorithms for virtual assistants and chatbots need to be developed and trained with ethical considerations in mind. They should be unbiased, inclusive, and avoid perpetuating stereotypes or discrimination, and it’s critical they are tested to operate in this manner before being used for patient engagement. As we continue to see an increase in generative AI adoption, it’s crucial that these tools are constantly monitored for these types of sensitivity flaws, and when providing medical advice are tested to provide appropriate responses.”
Why organizations need data collaboration technologies to bring the data mesh vision to life. Commentary by Dan DeMers, co-founder and CEO of Cinchy
“When any emerging technology gets a lot of hype, it’s smart to be skeptical. But data mesh defies that stereotype: This discipline keeps the focus on the data itself, rather than the technologies used to create and store it. That’s a major advance in our collective journey to a data-centric culture. Data mesh highlights domain-based ownership, with decentralization to better meet the needs of diverse business constituencies. This represents a clear departure from obsolete best practices around data guardianship and zealous hoarding. It points toward the establishment of data as its own network, and enhances effective governance.
However, as far as data mesh goes, it arguably doesn’t go far enough; what we need is a more fundamental restructuring of the traditional data ecosystem. For that, we need new tools like data collaboration technology that decouple the data from related technologies. This will allow truly federated computational governance and ensure that wherever the data travels, the permissions, controls, policies and more are always consistent. Most importantly, this will eliminate data silos and lead to data that’s integrated without laborious and costly data integration.”
Unleashing the Power of Generative AI: Transforming Marketing in the Digital Era. Commentary by Ajay Yadav, Co-founder of Simplified
“As the field of marketing continues to evolve, one of the most transformative advancements we’ve witnessed is the harnessing of the power of generative AI. This technology has revolutionized the way businesses engage with their target audiences, enabling them to create personalized and compelling marketing content at an unprecedented scale. With generative AI, marketers can now automate the generation of dynamic and tailored campaigns, reducing the time and resources required to create content while enhancing its effectiveness. Moreover, this technology empowers marketers to tap into vast amounts of data to gain deep insights into consumer behavior, preferences, and trends. By analyzing this data, businesses can make informed decisions, optimize their marketing strategies, and deliver highly targeted messages to the right audience at the right time.
In recent years, we have seen a significant shift in the marketing landscape, driven by the increasing adoption of generative AI. One notable trend is the rise of hyper-personalization. Today’s consumers expect brands to deliver relevant and personalized experiences, and generative AI enables marketers to meet these expectations. By leveraging AI algorithms and machine learning, businesses can analyze customer data and create tailored marketing campaigns that resonate with individual preferences and needs. This level of personalization not only improves customer engagement but also fosters brand loyalty and drives conversions. Additionally, generative AI empowers marketers to explore new frontiers of creativity and experimentation. With the ability to generate content variations rapidly, businesses can test different messaging, visuals, and formats, allowing them to optimize their campaigns and deliver the most impactful marketing materials. The evolution of marketing through generative AI is truly reshaping the industry, offering endless possibilities for businesses to connect with their audiences on a deeper level.”
Machine learning can boost M&A efficiency. Commentary by Dana Pasquali, VP of Product Management, Vertafore
“I’m seeing buyers approach the mergers and acquisitions (M&A) landscape cautiously as they look for long-term potential to maximize their investment in a bumpy economy. To catch an investor’s eye, companies need to leverage their technology—and the data that comes from it—to show a crystal-clear picture of the historical and current value of a business. AI and machine learning are key tools in this process to use data-backed predictions to tell the story of where the company is headed, generate more revenue, and meet goals. With its ability to analyze vast amounts of data quickly and accurately, machine learning provides business owners and investors alike with a variety of ways to understand risk, claims, and customer behavior while providing valuable insights into market trends, company performance, and potential synergies.
According to industry reports, machine learning has significantly improved the accuracy of deal valuation models, resulting in more informed decision-making and reduced transaction risks. With the M&A landscape becoming increasingly competitive, the integration of machine learning technologies is no longer a luxury but a necessity for firms looking to stay ahead in this dynamic market. If you’re in the insurance industry, now is a good time to shore up (or create) a solid data governance plan for your agency management system to ensure accurate data is at the core of your business.”
On ChatGPT, Chatbots and Humanizing AI. Commentary by Tiago Cardoso, product manager at Hyland Software
“Affective Computing, Virtual Agents and Human-Robot Interaction are all mature fields in AI research. Providing a face to an AI agent can create empathy and emotional connection, but it is not necessary for productive socialization with something like ChatGPT. Social media and gaming have shown us that people can relate to abstract human representations. On the other hand, it would be extremely difficult, and maybe even unproductive, to give concrete human characteristics to an experience that aggregates most textual human knowledge and is a mirror of the human race, not of a particular concrete human.
Depending on the context, ChatGPT will have different personalities, which might break the human face illusion.
The level of human-like interaction and knowledge, the way these new chatbots tap into creativity and complex thought, and how fast they can reply all make them seem superhuman. This can be intimidating. When you add the generative hallucination effect, where AI chatbots present information that is false and disconnected from reality, yet do so coherently and with confidence, people get the perception that they can actually be dangerous. There is a lot of work to be done in chatbot security and safety in order to provide an experience that people can relate to and feel empathy for.
Apart from long-standing fields like Affective Computing, Virtual Agents and Human-Robot Interaction, a solution is to research new techniques that improve how chatbots learn to understand nuanced communication, perceive expectations, and generate empathy and trust. We can expect more development on this front using reinforcement learning to score these (currently very abstract) metrics. Although we can try to find ways to improve the robot-human relationship, machine learning will really be needed to get tangible results.”
Reddit, APIs and the future of public data. Commentary by Or Lenchner, CEO of Bright Data
“Public web data needs to remain in the public domain. The knowledge we gain from it improves lives and reshapes industries from healthcare to finance. If tech companies are allowed to build a walled garden around data that’s in the public domain, it will prevent AI from reaching its full potential. Moreover, stripping publicly available web data from the public will make it harder for AI to advance in a way that benefits society.
Tech companies that hold the keys to public web data must be held accountable by their users. Public web data remaining public is not only necessary for the development of AI, it is crucial for e-commerce, academic studies, and research for social good.”