iHash

News and How to's

AutoML- The Future of Machine Learning

Dec 28, 2022 by iHash

Introduction

Automation is becoming pervasive as science and technology advance in every field. Enterprises now rely on machines rather than people for many decisions, thanks to the models created by data scientists. This inevitably raises the question of whether the tasks performed by data scientists can themselves be automated. As a result, automated machine learning (AutoML) is becoming a hot topic of discussion in the data science world.

Before diving deep into AutoML, let’s understand what AI and ML are. AI is a technology that enables a machine to simulate human behaviour, and ML is a subset of AI that allows machines to learn automatically from past data without being explicitly programmed. The goal of AI is to develop computer systems that, like humans, can solve complex problems.

Today, machine learning is one of the most popular technologies and is used in practically every field imaginable. But what about people who are not very familiar with ML? That’s where AutoML, or automated machine learning, comes in!

The goal of this article is to address the following questions: (i) which ML functionalities AutoML tools provide; (ii) what insights and conclusions can be drawn from a study of research papers in the field of AutoML; (iii) which AutoML solution providers are available at present; and (iv) how data scientists and AutoML will work together in the future.

Let’s use a small chat between two data scientists to understand the importance of AutoML in the field of machine learning.

Use cases for AutoML

Companies automate their machine learning processes for a variety of purposes. In most of these use cases, companies have already implemented ML and want to improve its performance. Mostly, they want automated insights for better data-driven decisions and predictions. The typical automated processes observed in the case studies are:

  • Fraud Detection
  • AML Detection
  • Healthcare
  • Pricing
  • Sales Management
  • Marketing Management
Table 1: Use Cases of AutoML [1]

Interpreting Google Trend Analysis

When analyzing the Search Volume Index in Google Trends, the graph that appears does NOT represent the actual search volume numbers, but rather an index ranging from 0-100. The numbers represent the search interest relative to the highest point on the chart for the selected region and time. A value of 100 is the peak popularity of the term, whilst a value of 50 means that the term is half as popular. Scores of 0 mean that a sufficient amount of data was not available for the selected term.
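The rescaling described above can be sketched in a few lines. This is a simplified illustration, not Google's actual pipeline: Google Trends also normalises by the total search volume for the region and period before scaling, and the function name `to_trends_index` and the raw numbers below are hypothetical.

```python
def to_trends_index(raw_interest):
    """Rescale raw interest values to a 0-100 index relative to the peak.

    The highest point in the series maps to 100; every other point is
    expressed as a rounded percentage of that peak.
    """
    peak = max(raw_interest)
    if peak == 0:
        # No measurable interest anywhere in the selected window.
        return [0 for _ in raw_interest]
    return [round(100 * value / peak) for value in raw_interest]


# Hypothetical raw interest values for five consecutive periods.
raw = [1200, 1800, 2400, 2160, 600]
index = to_trends_index(raw)
print(index)  # [50, 75, 100, 90, 25]
```

The period with the value 2400 becomes 100, and the period with half that interest (1200) becomes 50, which matches the "half as popular" reading given above.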

Fig 1: Google Search Score of the keyword “AutoML” in last 5 years [2]

Here, worldwide searches for the keyword ‘Automated Machine Learning’ over the last 5 years were analysed. The average score increased from 30 in 2017 to 55 in 2019, dipped slightly to 54 in 2020, and rose again to 57 in 2022. This suggests that not just the data science community but also the wider world has started exploring the topic in the last 5 years.

Causes that are driving the need for AutoML

  • Shortage of experienced technical experts
  • Lengthy development process
  • Huge expenditure in the current manual process
  • Large amount of repetitive work

What role does automated machine learning play?

  • AutoML enables companies to use ML solutions without investing extra money and time in finding all the professionals required for the end-to-end process, offering a greater return on investment
  • AutoML helps to bridge the gap between data scientists and ML problems
  • AutoML increases productivity and democratises ML tools
  • AutoML helps enterprise users swiftly adopt ML tools or solutions by automating most of the modelling process required to construct and deploy ML models, allowing the company’s data scientists to focus on more complex issues.

Benefits of using AutoML

  • Helps save time: A typical data science problem requires humans to run many models before deciding on a suitable algorithm for the given business problem. AutoML eliminates this manual labour by passing the data to the training algorithms and searching for the appropriate model. With AutoML, results are available in a few minutes instead of hours.
  • Reduces errors when using ML algorithms: AutoML improves models by minimising the likelihood of inaccuracies caused by bias or human mistakes.

Which machine learning processes can be automated?

  • Data pre-processing: This process includes improving data quality and converting unstructured, raw data to a structured format with methods like data cleaning, data integration, data transformation, and data reduction.
  • Feature engineering: AutoML can automate the tasks of: (i) Feature creation: creating new variables that will be most helpful for the model, which can involve adding or removing features; (ii) Transformations: a feature transformation is simply a function that maps features from one representation to another. The goal here is to plot and visualise the data; if something is not adding up with the new features, we can reduce the number of features used, speed up training, or increase the accuracy of a given model; (iii) Feature extraction: the process of extracting features from a data set to identify useful information. This compresses the data into quantities that algorithms can manage, without distorting the original relationships or losing significant information.
  • Algorithm selection & hyperparameter optimization: AutoML tools choose the best algorithm for the given ML problem and the optimal hyperparameters without any human intervention.
Fig 2: Status of Automation in Data Science Workflow [3]
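The three automatable stages listed above can be illustrated with a toy, standard-library-only sketch. It is an assumption-laden illustration rather than how any particular AutoML product works: the data set, the function names (`impute_and_scale`, `knn_predict`, `centroid_predict`, `search`) and the two candidate algorithms are all hypothetical, and real tools search far larger spaces with smarter strategies.

```python
import math
from collections import Counter


def impute_and_scale(rows):
    """Pre-processing: fill missing values (None) with the column mean,
    then min-max scale every column to the range [0, 1]."""
    cleaned = []
    for col in zip(*rows):
        present = [v for v in col if v is not None]
        mean = sum(present) / len(present)
        filled = [mean if v is None else v for v in col]
        low, high = min(filled), max(filled)
        span = (high - low) or 1.0  # guard against constant columns
        cleaned.append([(v - low) / span for v in filled])
    return [list(row) for row in zip(*cleaned)]


def knn_predict(train_x, train_y, point, k):
    """Majority vote among the k nearest training points."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], point))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def centroid_predict(train_x, train_y, point, _param=None):
    """Assign the label of the closest class centroid."""
    centroids = {}
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(c) / len(members) for c in zip(*members)]
    return min(centroids, key=lambda label: math.dist(centroids[label], point))


def loo_accuracy(predict, x, y, param):
    """Leave-one-out cross-validation accuracy for one candidate."""
    hits = 0
    for i in range(len(x)):
        train_x, train_y = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        hits += predict(train_x, train_y, x[i], param) == y[i]
    return hits / len(x)


def search(x, y):
    """Algorithm selection and hyperparameter search: score every
    (algorithm, parameter) candidate and return the best one."""
    candidates = [("knn", knn_predict, k) for k in (1, 3, 5)]
    candidates.append(("centroid", centroid_predict, None))
    scored = [(loo_accuracy(fn, x, y, p), name, p) for name, fn, p in candidates]
    return max(scored, key=lambda item: item[0])


# Hypothetical two-class data set with a few missing values.
raw_rows = [
    [1.0, 10.0], [1.2, None], [0.8, 12.0], [1.1, 9.0], [None, 11.0],
    [5.0, 50.0], [5.3, 48.0], [None, 52.0], [4.9, None], [5.1, 49.0],
]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Simplification: scaling is fitted on all rows before cross-validation.
features = impute_and_scale(raw_rows)
best = search(features, labels)
print("best (accuracy, algorithm, hyperparameter):", best)
```

The point of the sketch is the shape of the loop: clean the data, enumerate candidates, score each one the same way, and keep the winner without a human choosing the algorithm or its settings.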

Challenges of AutoML

  • Conformance to flexible specifications: The main challenge with AutoML is that it cannot conform to all of a user’s flexible specifications. These solutions focus mainly on performance, whereas in the real world performance is only one aspect of an ML project. They rarely take a business’s storage and computing requirements into account.
  • The 80/20 rule: AutoML automates roughly 80% of data science work, while the remaining 20%, such as understanding the client’s needs and presenting the final model to the stakeholders, will still need human intervention.
  • Explainability: Although users can see the reason codes and model blueprints of these AutoML solutions, these are sometimes too technical for people from a non-data-science background to understand. As a result, humans are still needed to handle such scenarios.

All these challenges lead us to believe that, even with AutoML approaches available, we still need data scientists to handle the other complex problems of an automation project.

Study of research papers in the field of AutoML: Bibliometric Analysis

Bibliometric analysis is a scientific computer-assisted review methodology that can identify core research or authors, as well as their relationship, by covering all the publications related to a given topic or field.

Years considered for the research

439 documents related to automated machine learning published from 2001 to 2021 were considered for this analysis.

Publication Output and Growth Trend in the Field of AutoML Research Domain

The number of documents shows an increasing trend, which could be attributed to the growing need for data scientists and to AutoML tools and services becoming more popular and helping companies extract business insights from ML in an effective and scalable manner. In general, the number of publications has increased steadily over the last decade, starting with only 3 papers in 2012 and increasing by nearly 98% in 2021 (n = 187), the highest number of articles published in any year. This shows that automated machine learning is a young but exploding field within data science.

Fig 3: Number of Publications in the field of AutoML (yearwise) [4]

The Keywords Analysis of Research Hotspots on Automated Machine Learning

Fig 4: Co-occurrence analysis word cloud [5]

To explore emerging, widely discussed and potential future topics, we conducted a co-occurrence analysis of keywords using VOSviewer. Keyword co-occurrence can effectively reflect research hotspots, providing auxiliary support for scientific research. Across all 439 publications related to automated machine learning, 3,622 keywords were obtained in total.

Here, the bigger the node and word, the larger the weight, which means that the keyword has been widely used across the publications. The distance between two nodes reflects the strength of the relation between the two topics; a shorter distance generally reveals a stronger relation. As can be inferred from the diagram, automated machine learning is a dense keyword compared to the others because it is widely used by authors. Another conclusion that can be drawn from the plot is that AutoML and genetic programming are closely associated. This is because AutoML has been widely used in genetic programming. An example is the automated machine learning-genetic algorithm framework (AutoML-GA), which has been used to solve a variety of problems in the research domain, such as rapid engine design optimization and computational fluid dynamics.

The larger distance between image analysis and AutoML indicates that they are not strongly connected. This could be because few research papers discuss the application of AutoML to image analysis. An exception is Google Cloud, whose Vision API classifies images into thousands of predefined categories and detects individual objects and faces within images.
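The counting behind such a map can be sketched as follows. This only illustrates the raw tallies (how often a keyword occurs and how often two keywords appear in the same paper); it does not reproduce VOSviewer's normalisation or layout, and the keyword lists are hypothetical rather than taken from the Scopus data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author keywords for four papers (illustrative only).
papers = [
    ["automated machine learning", "genetic programming", "optimization"],
    ["automated machine learning", "genetic programming"],
    ["automated machine learning", "neural architecture search"],
    ["image analysis", "deep learning"],
]


def cooccurrence(keyword_lists):
    """Return (occurrences, links): how often each keyword appears, and
    how often each pair of keywords appears together in one paper."""
    occurrences = Counter()
    links = Counter()
    for keywords in keyword_lists:
        unique = sorted(set(keywords))  # sorted so each pair has one canonical order
        occurrences.update(unique)
        links.update(combinations(unique, 2))
    return occurrences, links


occurrences, links = cooccurrence(papers)
aml, gp = "automated machine learning", "genetic programming"
print(occurrences[aml])            # node weight: 3
print(links[(aml, gp)])            # link strength: 2
print(links[(aml, "image analysis")])  # never co-occur: 0
```

In this toy input, the keyword with the most occurrences would be drawn as the largest node, and the pair with the highest link count would sit closest together.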

Which geographies are the research hotspots of AutoML?

Fig 5: Geographic Heat Map [6]

As we can infer from the plot shown above, the US and China are prominent research centres in the field of automated machine learning, since they have published a high number of documents. Different shades of blue in the plot indicate different productivity rates: dark blue = high productivity; grey = no articles.

This can also be correlated with the fact that most of the AutoML vendors have their headquarters in these countries.

Market size forecasts [7]

  • The global AutoML market generated revenue of $270 million in 2019 and is expected to reach $15 billion by 2030.
  • The global AutoML market is expected to advance at a CAGR of 44% during the forecast period (2020–2030).
  • Over 65% of the AutoML market is expected to be in North America and Europe by 2030.
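As a quick sanity check on these figures, the compound annual growth rate (CAGR) can be recomputed from the two revenue numbers. The sketch below assumes growth is measured over the 11 years from the 2019 base to 2030; the function name `cagr` is ours.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted above: $270 million in 2019, $15 billion expected by 2030.
rate = cagr(270e6, 15e9, years=2030 - 2019)
print(f"{rate:.1%}")
```

Under that assumption the result is roughly 44%, which is consistent with the quoted growth rate.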

AutoML Adoption

  • Current adoption: 61% of data and analytics decision-makers whose firms are adopting AI said they had implemented AutoML software or are in the process of implementing it.
  • Future adoption: 25% of data and analytics decision-makers whose firms are adopting AI said they are planning to implement AutoML software within the next year.

AutoML Solution Providers:

  • Open Source
  • Startups
  • Tech Giants

AutoML Software Comparison:

We focus on the following AutoML solutions:

  • DataRobot
  • Dataiku
  • H2O.ai
  • Google Cloud AutoML
  • Microsoft Azure AutoML
  • TPOT
  • MLJar
  • Darwin
  • TransmogrifAI

Interpreting Google Searches:

Fig 6: Google Search Score of different AutoML tools [8]

From Fig 6, we can see that Dataiku and DataRobot have been trending on Google Search over the last 5 years, as their search scores have increased every year. More users are looking for them online, likely because of their increased capabilities, as shown in Table 2 and Table 3.

Capabilities Analysis

This is a software comparison of the AutoML vendors. TPOT, MLJar and TransmogrifAI are open-source AutoML solutions; DataRobot, Dataiku, H2O.ai and Darwin come from startups; and Google Cloud AutoML and Microsoft Azure AutoML come from tech giants.

The capabilities were grouped into broad categories, and the analysis was carried out on those categories. The table below shows the color indexing method. A good AutoML software should be able to train custom machine learning models with limited machine learning expertise, as per the business needs. It should offer simple, secure and flexible products with an easy-to-use graphical interface.

Table 2: AutoML Solutions Capabilities [9]
Table 2 legend
Table 3: AutoML Solutions Capabilities and its sub categories [10]
Table 3 legend

The subcategories of the broad categories were also analysed, checking whether or not each vendor offers a particular capability. From Table 2 and Table 3, it can be concluded that DataRobot supports the most capabilities, followed by Dataiku.

Data Scientist vs AutoML

AutoML tools have advantages over human data scientists in speed and risk reduction, but the human brain is superior to a machine in other ways. A data scientist brings a level of nuance, intuition and creative problem-solving to the process that AutoML simply cannot match.

Fig 7: Data Science Workflow Distribution with Automation [11]

From the analysis, it can be inferred that ~43% of a data scientist’s work can be fully automated by machines, another 28.57% can be done by humans and machines in collaboration, and the remaining 28.57% will be done solely by humans.

This is also evident from recent job descriptions, in which companies list experience with AutoML solutions as a preferred qualification for the role of data scientist, e.g. Growth Analytics, Polaris. As a result, online educational platforms such as Udemy and Coursera have started offering AutoML courses, such as AutoML Bootcamp, Machine Learning on Google Cloud (Vertex AI and AI Platform), and Analyse Datasets and Train ML Models using AutoML, to help new data scientists develop this evolving skill and become part of the revolution.
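These percentages are what one would get from a workflow of seven equally weighted steps split 3 / 2 / 2. That breakdown is our assumption for illustration; the article does not state the number of steps behind Fig 7.

```python
# Assumption (not stated in the article): seven equally weighted workflow
# steps, of which 3 are fully automated, 2 are shared, and 2 are human-only.
steps = {"fully automated": 3, "human and machine": 2, "human only": 2}
total = sum(steps.values())

shares = {name: count / total for name, count in steps.items()}
for name, share in shares.items():
    print(f"{name}: {share:.2%}")
```

Under this assumption the shares come out at about 42.86%, 28.57% and 28.57%, matching the ~43% / 28.57% / 28.57% split quoted above.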

Conclusion

The “AutoML vs. Data Scientist” discussion is inherently flawed, and technology leaders are encouraged to dive into the real question: How can businesses fully leverage AutoML AND data scientists?

Successful data scientists will embrace AutoML tools the way the construction industry embraces panelization and pre-fabrication tools: as a mechanism to reduce their time spent on repetitive tasks and allow a machine to prepare the materials they need to conduct more-specialized work.

[1] https://research.aimultiple.com/automl-case-studies/

[2] https://trends.google.com/trends/?geo=IN

[3] FischerJordan analysis

[4] https://www.scopus.com/sources.uri?zone=TopNavBar&origin=searchauthorfreelookup

[5] https://www.scopus.com/

[6] https://www.scopus.com/sources.uri?zone=TopNavBar&origin=searchauthorfreelookup

[7] https://research.aimultiple.com/automl-stats/

[8] https://trends.google.com/trends/?geo=IN

[9] FischerJordan analysis

[10] FischerJordan analysis

[11] FischerJordan analysis

About the Authors

Ankush Gupta is an analyst at FischerJordan with a strong statistical background who is proficient in using the tools of ML and AI to solve complex business problems.

Kavya Shree is a business analyst intern at FischerJordan working on M&A due diligence, analytics-driven marketing strategy and investment optimization. 
