The Science and Practical Applications of Word Embeddings 

Jun 8, 2023 by iHash

Word embeddings are directly responsible for many of the rapid advances natural language technologies have made over the past couple of years. They’re foundational to the functionality of popular Large Language Models like ChatGPT and other GPT iterations. These mathematical representations also have undeniable implications for textual applications of Generative AI.

From a pragmatic perspective, word embeddings are pivotal to unlocking business value from contemporary applications of Natural Language Understanding, cognitive search, and Natural Language Generation. When paired with the Generative AI capacity of some of these language models, they drastically reduce the time required for everything from backend data management to customer-facing applications.

According to Pega CTO Don Schuerman, the results of these practical Generative AI use cases are transformational. Moreover, they’re horizontally applicable across organizations and deployments, underpinning the basics of workflow management and application development in general.

“We can say, ‘what are the steps of a workflow to manage a loan application or onboard a new member of a healthcare plan?’” Schuerman said. “Generative AI will say, ‘here’s what the common processes for that would look like and here’s the data model for it.’”

The ability of language understanding models to properly comprehend such user requests and elicit the correct responses hinges on the efficacy of word embeddings.  

Mathematical Vectors

Word embeddings are a facet of representation learning, which provides the statistical foundation for many contemporary models for natural language technologies. According to Franz CEO Jans Aasman, these embeddings represent words as vectors. “A vector is a series of elements, usually numbers,” Aasman commented. “For example, you can have a vector of 10 numbers.” These mathematical representations include semantic understanding and, when comparing embeddings to one another in what’s oftentimes a high-dimensionality space, context. Words or phrases with similar meaning are represented closer to one another than those with dissimilar meanings are.
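
The idea that similar words sit closer together can be illustrated with cosine similarity, one common way of comparing vectors. The sketch below is only an illustration: the four-number vectors are invented for this example, whereas real embeddings are learned by a model and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 4-dimensional embeddings, invented for illustration only.
embeddings = {
    "loan":     [0.9, 0.8, 0.1, 0.0],
    "mortgage": [0.8, 0.9, 0.2, 0.1],
    "banana":   [0.0, 0.1, 0.9, 0.8],
}

def most_similar(word, table):
    """Return the other word in `table` whose vector is closest to `word`."""
    others = (w for w in table if w != word)
    return max(others, key=lambda w: cosine_similarity(table[word], table[w]))

print(most_similar("loan", embeddings))  # prints "mortgage"
```

With these made-up values, "loan" and "mortgage" point in nearly the same direction, while "banana" points elsewhere, which is the geometric sense in which similar meanings are "closer."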

Aasman said representing words as numerical vectors enables machine learning models to ascribe various weights to them. In a specific text, which might include a prompt or, in the case of models like ChatGPT, the contents of the internet itself, “What you can do is take a window of like plus or minus five words or, like ChatGPT, plus or minus 500 words from the word you’re interested in,” Aasman disclosed. “You create a weight for every other word to see to what extent it influences the next word.”
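
A rough sketch of the windowing idea Aasman describes might look like the following. It is a simplified illustration, not a description of how ChatGPT is implemented: the two-number vectors are invented, and scoring each context word by a dot product followed by a softmax is an assumption chosen because it is a common way to turn similarities into weights.

```python
import math

def context_window(tokens, index, radius):
    """Return (position, token) pairs within `radius` of `index`, excluding the target."""
    start = max(0, index - radius)
    end = min(len(tokens), index + radius + 1)
    return [(i, tokens[i]) for i in range(start, end) if i != index]

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def context_weights(tokens, index, radius, vectors):
    """Weight each context word by the dot product of its vector with the target's."""
    target = vectors[tokens[index]]
    window = context_window(tokens, index, radius)
    scores = [sum(t * c for t, c in zip(target, vectors[tok])) for _, tok in window]
    return [(tok, w) for (_, tok), w in zip(window, softmax(scores))]

# Hypothetical 2-dimensional vectors, invented for illustration only.
vectors = {
    "the": [0.1, 0.1], "bank": [0.9, 0.2], "approved": [0.7, 0.3],
    "my": [0.1, 0.2], "loan": [0.8, 0.1],
}
tokens = ["the", "bank", "approved", "my", "loan"]

# Weights for the words within two positions of "loan".
for word, weight in context_weights(tokens, 4, 2, vectors):
    print(f"{word}: {weight:.3f}")
```

Here "approved" receives a larger weight than "my" because its invented vector is more aligned with the vector for "loan." Widening `radius` corresponds to the larger windows Aasman mentions.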

The Prompt Engineering Effect

The applicability of word embeddings to prompt engineering (the way users phrase the tasks they want generative AI models to respond to with text) is critical, because embeddings allow models to understand what users are asking. That understanding is vital for accurate natural language technology applications such as question answering and intelligent search. In this context, via word embeddings, “You get a whole bunch of words around the word you’re interested in, and you can see to what extent it predicts the words after your word,” Aasman noted.

The generative AI tasks prompt engineering initiates are impressive. Some involve what Gartner has termed synthetic data, which, for example, might involve “asking Generative AI to make me some sample data so I can test this application quickly,” Schuerman revealed. The same concept can easily extend to generating training data (or annotations for such data) for supervised learning models. “Every developer knows the experience of filling out a spreadsheet thinking of their best friend’s dog’s name to fill it in with different data for testing,” Schuerman observed. “That wasted time is now gone.” Users can also prompt Generative AI to write code, devise data models and fields in them, and create individual procedures for workflows or applications.

Model Restrictions

Word embeddings also affect the ability to tailor language generation models to select responses from a particular source. Because they provide the means by which models understand what users are asking for, these embeddings are amenable to prompts that focus on a particular corpus or knowledge base. “A common pattern in a GPT use case is if you want to restrict the model, or give the model a certain set of data, you can actually bake it into the prompt,” Schuerman explained. For example, if an organization wants a language model to use developer documentation for question answering, one of the first steps is to classify that text into discrete concepts, words, phrases, or sections.

According to Schuerman, GPT is useful for providing those classifications. Those components then become part of the word embedding process; it’s incumbent upon users to include those classifications in their prompts. This technique enables users to “do two things,” Schuerman specified. “It allows us to include the most up-to-date information in responses, but also ensure that we’re restricting GPT so it doesn’t go to some other source that we don’t trust to get this answer.” This same methodology can provide timely question answering for customer service documentation, IT help desks, or searching any specific corpus for conversational responses in real time.
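
The pattern of baking trusted data into the prompt can be sketched as follows. This is a minimal illustration under stated assumptions: the `embed` function is a bag-of-words stand-in for a real embedding model, the documentation snippets are invented, and the prompt wording is hypothetical rather than anything Pega or OpenAI prescribes.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_prompt(question, passages, top_k=2):
    """Rank passages against the question and bake the best ones into the prompt."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    context = "\n".join(f"- {p}" for p in ranked[:top_k])
    return (
        "Answer using only the documentation below. "
        "If the answer is not there, say you do not know.\n\n"
        f"Documentation:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical developer documentation snippets, invented for illustration.
docs = [
    "To reset an API key open the developer console and choose rotate key",
    "Workflows are defined as ordered steps with an owner for each step",
    "The data model for a loan application includes applicant and amount fields",
]

print(build_prompt("How do I reset my API key", docs, top_k=1))
```

Only the passage most similar to the question reaches the model, which is one way to keep responses current while steering the model away from sources the organization does not trust.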

Manifold Layout Techniques 

Oftentimes, word embeddings are vectorized in high-dimensionality spaces. Depending on how many dimensions a particular embedding or series of embeddings has, the sheer number of dimensions may become cumbersome, slowing computations and delaying Natural Language Processing. There are several dimensionality reduction techniques involving supervised and unsupervised learning that can address this issue.

Indico Data CTO Slater Victoroff characterized “manifold layout techniques” as one such approach to effectively bring an embedding from a higher-dimensionality space to a lower-dimensionality one. The benefit of doing so is that it largely preserves the semantics and relationships found in the former space in such a way “that you don’t lose a lot,” Aasman indicated. Manifold techniques are often employed in contemporary applications of word embeddings to reduce the dimensionality involved, which can speed up computations and NLP results.
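
To make dimensionality reduction concrete, the sketch below projects four-dimensional vectors down to two. Note the substitution: it uses principal component analysis (PCA), a simpler linear technique, as a stand-in, whereas manifold methods are nonlinear and more involved. The vectors are invented, and the power-iteration routine is a bare-bones approximation meant for small examples.

```python
import math

def center(rows):
    """Subtract the per-column mean so the data is centered at the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(rows):
    """Covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)] for i in range(d)]

def top_component(matrix, iterations=200):
    """Power iteration: approximate the dominant eigenvector of a symmetric matrix."""
    d = len(matrix)
    v = [float(i + 1) for i in range(d)]  # arbitrary non-zero starting direction
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            break
        v = [x / norm for x in w]
    return v

def reduce_dimensions(rows, k):
    """Project rows onto their first k principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        v = top_component(cov)
        eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        components.append(v)
        # Deflate: remove the direction just found so the next pass finds the next one.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [[sum(r[j] * c[j] for j in range(d)) for c in components] for r in centered]

# Hypothetical 4-dimensional embeddings, invented for illustration only.
words = ["loan", "mortgage", "banana", "apple"]
data = [
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.2, 0.1],
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.8, 0.9],
]

reduced = reduce_dimensions(data, 2)
for word, point in zip(words, reduced):
    print(word, [round(x, 3) for x in point])
```

Even after half the dimensions are dropped, "loan" stays far closer to "mortgage" than to "banana" in the reduced space, which is the sense in which a good reduction does not "lose a lot."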

Today and Tomorrow 

It appears word embeddings will be part of statistical applications of natural language technologies, including textual representations of Generative AI, for some time. They assist with—if not enable—the prompt engineering process required for getting apposite, timely responses from Generative AI models. They are the conduit by which the enterprise can reap many of the advantages that this form of AI delivers for building applications, interacting with customers, and supplying rapid information retrieval.

Due in no small part to the utility of word embeddings, enterprise applications of Generative AI in low code settings are “an accelerator and starting point for any process; you name it,” Schuerman concluded. “You name the process and we can give you a starting point for it.”

About the Author

Jelani Harper is an editorial consultant servicing the information technology market. He specializes in data-driven applications focused on semantic technologies, data governance and analytics.

Filed Under: BigData