Nulldata Newsletter

Data Science 2020 - Highlights

Data Science Topics & Technologies that were significant in 2020 - A quick Review

AbdulMajedRaja
Jan 3, 2021

Hey y’all!

This post is a bunch of highly-opinionated 😎 links to revisit some of the highlights of Data Science Topics & Technologies during 2020:

NLP NLP NLP

Until recently (say, the beginning of 2020), most of the focus was on Computer Vision and on image and video (media) data. That started to change, and 2020 was a strong year for NLP. Companies like Hugging Face 🤗, spaCy, and Rasa grew stronger and also put more effort into education, which ultimately drove a huge NLP revolution (even at industry level, which is usually quite hard).

Funding: Rasa (June 2020), Hugging Face (Dec 2019)

  • Language-Agnostic Models

  • Constantly updating 🤗 `transformers` library

  • Rasa NLP for Developers by Rachael Tatman

  • Rasa Algorithm Whiteboard by Vincent D. Warmerdam

Data Science / AI / ML Web App Building

Data Scientists are bad Web Developers, but what if we need to build web apps that can talk or do Machine Learning? That's why this space has been getting popular and crowded.

  • streamlit

  • plotly jupyterdash

  • h2o wave

GPT-3

If there was one thing from the ML community that gave a lot of journalists happiness, it was GPT-3. GPT-3 was (and probably still is) in the news almost all the time. In OpenAI's own words:

In May, we introduced GPT-3—the most powerful language model to date—and soon afterward launched our first commercial product, an API to safely access artificial intelligence models using simple, natural-language prompts. We’re proud of these and other research breakthroughs by our team, all made as part of our mission to achieve general-purpose AI that is safe and reliable, and which benefits all humanity

  • GPT-3 OpenAI API

  • AI Text Generator is learning our language - fitfully

  • Awesome GPT-3 Collections

Auto ML

What if I told you Data Scientists can spend more time on critical things like **Data Cleaning** 😝 and Feature Engineering, and less time on actually selecting/building the best **Model**? That's basically the premise of Auto ML (at least, that's how I see it).

  • Using AutoML for Time Series Forecasting

  • h2o AutoML

  • AutoGluon from Amazon

  • Auto Sklearn

  • <rumours are that 🤗 is working on an AutoNLP library>
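To make that premise concrete, here is a toy sketch of the "let the machine pick the model" loop. It is not the API of any of the libraries above; the function names (`mean_model`, `linear_model`, `auto_select`) and the data are made up for illustration, and real Auto ML tools search far larger spaces of models and hyperparameters.

```python
import statistics

def mean_model(xs, ys):
    """Baseline candidate: always predict the training mean."""
    mu = statistics.fmean(ys)
    return lambda x: mu

def linear_model(xs, ys):
    """Candidate: least-squares line through the training points."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def mse(predict, xs, ys):
    """Mean squared error of a fitted model on held-out data."""
    return statistics.fmean([(predict(x) - y) ** 2 for x, y in zip(xs, ys)])

def auto_select(candidates, train, valid):
    """Fit every candidate and return (name, model, validation_mse) of the best."""
    results = []
    for name, fit in candidates.items():
        model = fit(*train)
        results.append((mse(model, *valid), name, model))
    score, name, model = min(results, key=lambda r: r[0])
    return name, model, score

# Hypothetical toy data that follows y = 2x + 1
train = ([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
valid = ([5, 6], [11, 13])

best_name, best_model, best_score = auto_select(
    {"mean": mean_model, "linear": linear_model}, train, valid
)
print(best_name, best_score)
```

The point is only the shape of the loop: you bring clean data and features, and the search over candidate models happens for you.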

ML Ops

Models are easy to build in Jupyter Notebooks. We all know it takes just a few lines of code before your `model.fit()` is done. But what's next? Thanks to the rise of ML Ops, new tools and techniques for getting those Models to Prod can finally break the data science myth that most models die either on a PPT or in a Jupyter Notebook!

  • MLflow

  • Kubeflow

  • GitHub Actions for ML Ops

  • ML Ops with AWS
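As a rough picture of the "what's next?" step, here is a minimal sketch of the smallest possible handoff from notebook to serving: fit, save the learned parameters as an artifact, then load them in a separate piece of code and predict. This is not how MLflow or Kubeflow work internally; the toy threshold "model", the function names, and the `model.json` file name are all assumptions made up for this example.

```python
import json
import tempfile
from pathlib import Path

def fit(xs, labels):
    """Toy 'training': put a threshold halfway between the two class means."""
    pos = [x for x, y in zip(xs, labels) if y == 1]
    neg = [x for x, y in zip(xs, labels) if y == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return {"threshold": threshold}

def predict(params, xs):
    """Toy 'serving': classify using only the saved parameters."""
    return [1 if x > params["threshold"] else 0 for x in xs]

# "Notebook" side: train, then write the parameters out as an artifact.
params = fit([1, 2, 8, 9], [0, 0, 1, 1])
artifact = Path(tempfile.mkdtemp()) / "model.json"  # hypothetical artifact location
artifact.write_text(json.dumps(params))

# "Prod" side: knows nothing about training, only loads the artifact.
served_params = json.loads(artifact.read_text())
predictions = predict(served_params, [3, 7])
print(predictions)
```

Everything ML Ops tooling adds (versioning, tracking, deployment, monitoring) sits around that artifact boundary.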

Other than these, topics/technologies like FastAI (PyTorch library), Interpretable Machine Learning (fancily known as eXplainable AI), GANs, First-Order Motion, and On-Device ML (TensorFlow.js / Core ML) were also trending and are worth checking out!

Overall, it was an amazing year for Data Science & Machine Learning, with the Hype slowing down (Oops, sorry, I forgot the GPT-3 hype train that's still running) and the field maturing.

If you liked this, please ❤️ this newsletter and share it with a friend of yours!

Thanks,
A.
