Photo by Arseny Togulev on Unsplash

THE LANGUAGE MODELING PATH: CHAPTER 3

A dive inside the model that gave birth to BERT and GPT-3

Hi and welcome to the third step of the “Language Modeling Path”, a series of articles from Machine Learning Reply that aims to cover the most important milestones behind the huge language models, such as BERT and GPT-3, that are able to imitate and (let’s say) understand human language.

In order to fully appreciate the details of the model explanations, you will need some knowledge of the attention mechanism and the encoder-decoder architecture. If you would like to get the basics of these topics before tackling this article, we suggest going through the previous steps of the Language Modeling Path:



THE LANGUAGE MODELING PATH: CHAPTER 2

The mechanism that makes language models focus on the important things

Hi and welcome to the second step of the Language Modeling Path, a series of articles from Machine Learning Reply that aims to cover the most important milestones that brought to life the huge language models, such as BERT and GPT-3, that are able to imitate and (let’s say) understand human language.

In this article we are going to talk about attention layers. This architectural trick was first applied in the computer vision field [1], but here we will focus only on its application to Neural Natural Language Processing, and in particular on sequence-to-sequence applications for Neural…


Photo by Patrick Fore on Unsplash

The Language Modeling Path: Chapter 1

A turning point in automatic translations

This article is part of the “Language Modeling Path”, a series of articles collecting the main advances in Neural NLP that led to the current SOTA generative models. If you enjoy this one, go check out the others too!

Prerequisites:
To fully appreciate this article, previous experience with the basics of artificial neural networks and Recurrent Neural Networks is strongly suggested.

Introduction

Our story starts in December 2014 with the publication of a paper titled “Sequence to Sequence Learning with Neural Networks” by Google [1]. The researchers behind this paper were guided by a simple but crucial statement.


Photo by Patrick Fore on Unsplash

A series of articles for and by deep learning and NLP enthusiasts

How machines learned human language

We are living in an exciting time for Natural Language Processing (NLP) and Deep Learning. Neural network language models are spreading like never before, and top players like Google and OpenAI are investing huge efforts in the development of ever larger and more powerful models. The ultimate goal, you ask? To capture the essence of human language, whatever that may mean. Semantics, syntax, grammar, entities, and so on: all these kinds of information and patterns can be extracted from text using NLP and exploited to perform most of the AI magic that keeps generating huge hype…


https://commons.wikimedia.org/wiki/File:Muse_2017.jpg

The NLP model that speaks 16 languages

Hi,

As many of you may have experienced, one of the cards that the real world often plays to put a spanner in the works of a data scientist struggling with Natural Language Processing is language itself. Have you just finished training your awesome chatbot on thousands of English FAQs? Well, it would be a shame if the business asked you to deploy it for an Italian company branch…

Well, here is something that could ease your multilingual problems.

Last July Google AI labs released the Multilingual Universal Sentence Encoder. This is a sentence encoding…
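To give a concrete feel for what the preview above describes, here is a minimal sketch of encoding sentences in two languages with the Multilingual Universal Sentence Encoder loaded from TensorFlow Hub. The module handle and version shown are assumptions (check tfhub.dev for the current one), and the similarity check is just an illustrative usage example, not part of the original article.

```python
# Minimal sketch: multilingual sentence embeddings with the
# Multilingual Universal Sentence Encoder (assumed TF Hub handle below).
import numpy as np
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401 -- registers the SentencePiece ops the model needs

# Assumed module URL/version; verify on tfhub.dev before use.
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual/3")

sentences = [
    "How can I reset my password?",              # English
    "Come posso reimpostare la mia password?",   # Italian
]

# Each sentence is mapped to a 512-dimensional vector in a shared,
# language-agnostic embedding space.
embeddings = embed(sentences).numpy()

# Sentences with the same meaning should land close together,
# regardless of language, so cosine similarity should be high.
cos_sim = np.dot(embeddings[0], embeddings[1]) / (
    np.linalg.norm(embeddings[0]) * np.linalg.norm(embeddings[1])
)
print(f"Cross-lingual cosine similarity: {cos_sim:.3f}")
```

Because questions in different languages share one embedding space, a retrieval index or intent classifier built on English FAQs can, in principle, serve queries in the other supported languages as well.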

Davide Salvaggio
