Why AutoML Isn’t Enough to Democratize Data Science 

Jan 30, 2023 by iHash

You can cook food in a microwave in minutes. But we don’t say that microwaves “democratized” cooking.

Preparing a meal requires much more: selecting and preparing ingredients, optimizing the cooking method, and creating the right ambiance. The microwave just accelerates one part of the process.

Just as microwaves don’t handle the entire meal, automated machine learning (AutoML) only addresses a small portion of data scientists’ workflows. AutoML has become powerful and convenient. It’s a crucial step in the journey toward democratizing data science. However, there’s much more required to make data science accessible to all data professionals.

To truly democratize data science, we need to adopt automation across the entire data science workflow. Every step deserves to be addressed with robust, reliable automated tools that data analysts and business teams can use. Only then will we unlock the benefits of data science for all businesses.

What AutoML Does — and Why It’s Not Enough

AutoML typically handles model selection and hyperparameter tuning. A data professional using AutoML doesn’t need in-depth knowledge of algorithms and their use. Instead, an open-source AutoML library or a data science platform handles that part of the data science process. AutoML has become more accepted and trusted in recent years. 
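To make "model selection and hyperparameter tuning" concrete, here is a minimal sketch in plain Python. It is not how any particular AutoML library works. The two toy model families, the `SEARCH_SPACE` list, and the synthetic data are all assumptions made for illustration. The idea is simply to try every candidate model and setting, score each by cross-validation, and keep the best.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (one feature)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def threshold_predict(train, x, t):
    """Predict class 1 when the feature exceeds a fixed threshold t."""
    return 1 if x > t else 0

def cross_val_accuracy(predict, param, data, folds=5):
    """Mean accuracy of predict(train, x, param) across held-out folds."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        hits = [predict(train, x, param) == y for x, y in test]
        scores.append(sum(hits) / len(hits))
    return sum(scores) / len(scores)

# Hypothetical search space: (model name, predict function, settings to try).
SEARCH_SPACE = [
    ("knn", knn_predict, [1, 3, 5, 7]),
    ("threshold", threshold_predict, [0.3, 0.5, 0.7]),
]

def auto_select(data):
    """Return (score, model name, setting) for the best-scoring candidate."""
    best = None
    for name, predict, params in SEARCH_SPACE:
        for param in params:
            score = cross_val_accuracy(predict, param, data)
            if best is None or score > best[0]:
                best = (score, name, param)
    return best

# Synthetic data: one feature, label is 1 above 0.5, with about 10% label noise.
random.seed(0)
data = []
for _ in range(100):
    x = random.random()
    label = 1 if x > 0.5 else 0
    if random.random() < 0.1:
        label = 1 - label
    data.append((x, label))

best_score, best_name, best_param = auto_select(data)
print(best_name, best_param, round(best_score, 3))
```

Real AutoML tools search far larger spaces with smarter strategies than this exhaustive loop, but the user-facing contract is similar: data goes in, and a chosen model with tuned settings comes out.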

But successful data science involves more than modeling. According to Anaconda’s latest State of Data Science report, model selection and training account for just 18% of data scientists’ time. Meanwhile, they spend 47% of their time on data prep, cleansing, and deployment, all tasks outside the scope of AutoML tools.

To be sure, AutoML is crucial to making data science more accessible. But if that’s the goal, why isn’t there more effort to automate these other time-consuming, critical tasks? 

Data Science’s Obsession With Modeling

The data science field has primarily focused on innovating with models. So far, automation has had that same narrow scope, mainly addressing model selection and hyperparameter optimization. Simply put, we’re obsessed with models. 

There are a few likely reasons for this fixation. First, data scientists love the intellectual challenge of modeling, which is the mathematical heart of data science. Mastery of algorithms also creates a high bar to entering the profession that preserves data scientists’ distinctive role and elite status. But that barrier doesn’t serve businesses’ interests. 

Furthermore, data science research has focused on developing new models and refining modeling strategies. As I’ve discussed elsewhere, innovations in modeling have revolved around natural language processing and computer vision, using more accessible datasets. However, tabular data — the form of most business data — has been neglected in research. New strategies for handling tabular data in the data science workflow could make a much broader impact, especially with automation.

Finally, the modeling obsession may stem from a belief that models are the only “universal” components of data science projects. In reality, as I’ll explore next, there’s more universality within data science projects than is usually assumed. That means there’s far more room for innovative automation to accelerate work on those universal elements.

Automating the Rest of the Data Science Process

To truly democratize data science, we need to automate more than modeling. We need to explore and acknowledge other universal components of the data science workflow and then automate them wherever possible. 

As we’ve discovered at Pecan (the AI company I co-founded), different companies carry out data science in similar ways. That starts with the fundamental questions they explore. Across the board, business teams tend to ask the same kinds of questions of their data. Which customers will likely churn in the next X days — and why? Who among our new customers will become a high-value customer or VIP? How can we personalize offers by anticipating which customers will be most likely to upgrade their services or buy complementary products? With these kinds of common concerns, we can standardize many questions and answer them successfully with automated methods that achieve remarkable business impact.

Not only are many businesses’ questions similar, but we also have found that their datasets relevant to those questions contain more commonalities than you might think. Companies tend to use the same kinds of data to address comparable challenges. Those similarities mean we can systematize and automate most data preparation and feature engineering.

With the right data for those recurring business questions, innovative tools can automatically identify and fix common data problems. Then, automated techniques can generate hundreds or thousands of features, transforming data in ways relevant to the business question. This automated approach casts a much wider net than selecting a few hand-crafted features, and it reduces the influence of human bias on feature engineering and selection. Feature selection processes can then identify the most informative features and drop the less useful ones, which helps prevent overfitting and makes the model easier to explain.
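The generate-then-select pattern described above can be illustrated with a small sketch. Everything here is hypothetical: the transaction records, the churn labels, the `AGGREGATIONS` table, and the choice of absolute Pearson correlation as a simple stand-in for a real feature selection method. The code is only meant to show features being produced mechanically from raw rows and then ranked against the target.

```python
from collections import defaultdict

# Hypothetical raw data: (customer_id, days_ago, purchase_amount).
transactions = [
    ("a", 2, 40.0), ("a", 10, 35.0), ("a", 30, 50.0),
    ("b", 80, 20.0),
    ("c", 5, 15.0), ("c", 7, 25.0),
    ("d", 60, 10.0), ("d", 90, 12.0),
    ("e", 1, 60.0), ("e", 3, 55.0), ("e", 20, 70.0), ("e", 25, 65.0),
    ("f", 75, 30.0),
]
# Hypothetical target: 1 = customer churned, 0 = customer stayed.
churned = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

# Every aggregation is applied to every numeric column.
AGGREGATIONS = {
    "count": len,
    "sum": sum,
    "mean": lambda values: sum(values) / len(values),
    "max": max,
    "min": min,
}

def generate_features(rows):
    """Build one feature per (column, aggregation) pair for each customer."""
    by_customer = defaultdict(lambda: {"days_ago": [], "amount": []})
    for customer_id, days_ago, amount in rows:
        by_customer[customer_id]["days_ago"].append(days_ago)
        by_customer[customer_id]["amount"].append(amount)
    return {
        customer_id: {
            f"{column}_{agg}": fn(values)
            for column, values in columns.items()
            for agg, fn in AGGREGATIONS.items()
        }
        for customer_id, columns in by_customer.items()
    }

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def select_features(features, target, top_k=3):
    """Rank features by absolute correlation with the target; keep top_k."""
    ids = sorted(features)
    ys = [target[c] for c in ids]
    names = sorted(next(iter(features.values())))
    scored = [
        (abs(pearson([features[c][name] for c in ids], ys)), name)
        for name in names
    ]
    return sorted(scored, reverse=True)[:top_k]

features = generate_features(transactions)
top = select_features(features, churned)
for score, name in top:
    print(f"{name}: {score:.2f}")
```

With two columns and five aggregations, this produces ten features per customer. Adding a column or an aggregation grows the feature set without any hand-written feature code, which is the point of the approach.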

With fully prepared data in hand, it’s time for modeling. Typically, it’s only at this stage that automation makes an appearance with AutoML. But AutoML provides better results with thoroughly prepared data. Savvy data scientists adopting the increasingly popular data-centric approach to AI recognize that better-prepared data improves model performance more than endless tinkering with the models themselves. 

Finally, model deployment must progress beyond today’s engineering-intensive approach. It’s widely acknowledged that few models successfully move into production. Anaconda’s survey data reveals the top barriers to deployment: IT/information security concerns, data connectivity, re-coding models from Python or R into other languages, and managing packages and dependencies. 

Deployment can be made secure and as seamless as possible by building connectors that feed models’ output into other business systems and by automating model monitoring once models are in production. Model monitoring is critical, especially to watch for concept drift, which occurs when the relationship between a model’s inputs and the outcome it predicts changes over time. Models need monitoring and maintenance for ongoing high performance. When handled manually, this process can be time-consuming, and it’s often neglected as a result. But fortunately, it’s now possible to automate model monitoring. Automating model deployment and monitoring helps make data scientists’ work useful and rewarding over the long term.
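As a rough illustration of automated monitoring, the sketch below tracks a deployed model's accuracy over a sliding window of recent predictions and flags a sustained drop. It assumes true outcomes eventually become available for comparison. The `AccuracyMonitor` class, its thresholds, and the simulated rule change are all hypothetical, and production systems typically use more robust statistical tests.

```python
from collections import deque

class AccuracyMonitor:
    """Tracks accuracy over the most recent labeled predictions."""

    def __init__(self, baseline_accuracy, window=50, tolerance=0.10):
        self.baseline = baseline_accuracy   # accuracy measured at deployment
        self.tolerance = tolerance          # allowed drop before alerting
        self.outcomes = deque(maxlen=window)

    def record(self, predicted, actual):
        """Store whether one prediction matched its eventual true outcome."""
        self.outcomes.append(predicted == actual)

    def recent_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def drift_suspected(self):
        # Wait for a full window so a few early errors do not trigger alerts.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.recent_accuracy() < self.baseline - self.tolerance

def model(x):
    """Stand-in for a deployed model that learned the rule 'x > 0.5'."""
    return 1 if x > 0.5 else 0

monitor = AccuracyMonitor(baseline_accuracy=1.0, window=50, tolerance=0.10)
alerts = []
for step in range(200):
    x = (step % 100) / 100
    if step < 100:
        actual = 1 if x > 0.5 else 0   # the world matches what the model learned
    else:
        actual = 1 if x > 0.8 else 0   # simulated drift: the true rule shifts
    monitor.record(model(x), actual)
    if monitor.drift_suspected():
        alerts.append(step)

print("first alert at step:", alerts[0] if alerts else None)
print("accuracy over last window:", monitor.recent_accuracy())
```

In this simulation the model itself never changes. Only the relationship between input and outcome does, which is why monitoring has to compare predictions against real outcomes rather than inspect the model alone.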

Achieving True Data Science Democratization

AutoML is integral to automating and democratizing data science. But on its own, it addresses just one step of a more complex undertaking.

It’s tempting to celebrate the artisanship of a manual data science workflow. And with some use cases, a hand-coded approach is absolutely required. But we must acknowledge that other parts of data science work not only can but must be automated if data science’s benefits are to be realized more broadly in business. 

Even today, it’s already possible to automate the data science process as it’s applied most often to typical business challenges. The widespread nature of these challenges also means there’s incredible potential to take business outcomes to new heights with the broader adoption of automated data science. 

Embracing automation beyond AutoML will make data science truly accessible to all data professionals. Only then can all businesses realize the transformative benefits of democratized data science.

About the Author

Noam Brezis is the co-founder and CTO of Pecan AI, the leader in AI-based predictive analytics for business teams and the BI analysts who support them. Pecan enables companies to harness the full power of AI and predictive modeling without requiring any data scientists or data engineers on staff. Noam holds a PhD in computational neuroscience, an MS in cognitive psychology, and a BA in economics and psychology, all from Tel Aviv University.
