FOSS Machine Learning News week 21-2020

Welcome to our biweekly selection of Free and Open machine learning news, created using our own opinionated selection and summary algorithm. FOSS machine learning is crucial for everyone. Machine learning is a complex technology, so keep it simple.

1 Three Risks in Building Machine Learning Systems

Machine learning (ML) systems promise disruptive capabilities in multiple industries. Building ML systems can be complicated and challenging, however, especially since best practices in the nascent field of AI engineering are still coalescing. Behind the hype, there are three essential risks to analyze when building an ML system: 1) poor problem solution alignment, 2) excessive time or monetary cost, and 3) unexpected behavior once deployed.

(Software Engineering Institute)

2 Request for comment: how to collaboratively make trustworthy AI a reality

And, what do we want to do about it? How do we collaboratively make trustworthy AI a reality? Today, we’re kicking off a request for comment on v0.9 of Mozilla’s Trustworthy AI Whitepaper — and on the accompanying theory of change diagram that outlines the things we think need to happen. At a baseline, this means collecting information about educational programs, technology building blocks, product prototypes, consumer campaigns and emerging government policies that focus on making trustworthy AI a reality.

(Mozilla BLOG)

3 Bacatá: Notebooks for DSLs, Almost for Free

This article presents Bacatá, a mechanism for generating notebook interfaces for DSLs (domain-specific languages). DSLs can be a powerful way to simplify creating ML software and tools. Since notebooks are a default tool for ML engineers, this tool could in theory fit in their toolbox. Bacatá is a language-parametric kernel generator that hides the complexity of Jupyter’s low-level wire protocol. Thus, creating a language kernel for a DSL becomes a matter of writing a few lines of code. If you want to speed up development, the ‘DSL’ option may be worth considering. This project aims at creating better and faster DSLs. The project was created by Dutch researchers and is built upon the great Rascal project, also created in the Netherlands by the famous CWI institute.

(Bacata)

4 Mathematics for Machine Learning

A great open textbook for learning the mathematics for machine learning. This document is an attempt to provide a summary of the mathematical background needed for an introductory class in machine learning, which at UC Berkeley is known as CS 189/289A. Now also included in the FOSS ML Guide.

(link)

5 Playing Atari with Six Neurons

By Cuccu et al. A great paper for learning more about deep reinforcement learning. The paper presents a method to address complex learning tasks, such as learning to play Atari games, by decoupling policy learning from feature construction. Reading this paper will take you some hours, but the time is well spent: it is good and healthy brain food if you are into deep reinforcement learning.

(link)

6 Open-Sourcing Bit: Exploring Large-Scale Pre-Training for Computer Vision

In particular, we highlight the importance of appropriately choosing normalization layers and scaling the architecture capacity as the amount of pre-training data increases. Once the BiT model is pre-trained, it can be fine-tuned on any task, even if only a few labelled examples are available. In a similar spirit to how BERT and T5 have shown advances in the language domain, we believe that large-scale pre-training can advance the performance of computer vision models. We use BiT as a backbone for RetinaNet on the MSCOCO-2017 detection task and confirm that even for such a structured output task, using large-scale pre-training helps considerably.

(googleblog.com)

7 A Visual Survey of Data Augmentation in NLP

Augmentation of text data in NLP is pretty rare. This article is a short visual journey on what we know. NLP is not simple…

(link)
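To give a feel for what text augmentation can look like, here is a minimal sketch of two simple token-level operations, random swap and random deletion. It is an illustration only: the function names, the example sentence and the parameter values are our own assumptions, and these are not necessarily the techniques the linked article focuses on.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random pairs of positions exchanged."""
    augmented = list(tokens)
    if len(augmented) < 2:
        return augmented
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(augmented)), 2)
        augmented[i], augmented[j] = augmented[j], augmented[i]
    return augmented


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, but always keep at least one token."""
    if not tokens:
        return []
    kept = [tok for tok in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


rng = random.Random(42)  # fixed seed so runs are repeatable
sentence = "free and open machine learning is for everyone".split()

swapped = random_swap(sentence, n_swaps=2, rng=rng)
shortened = random_deletion(sentence, p=0.2, rng=rng)

print(" ".join(swapped))
print(" ".join(shortened))
```

Both operations keep the vocabulary of the sentence and only change word order or length, so labels can usually be reused unchanged. Whether the augmented sentences still make sense for a given task has to be checked case by case.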

8 Marketplace for AI Models

Artificial intelligence shows promise for solving many practical societal problems in areas such as healthcare and transportation. However, the current mechanisms for AI model diffusion, such as GitHub code repositories, academic project webpages, and commercial AI marketplaces, have some limitations. This paper outlines principles for a marketplace for AI models based on a decentralized online structure. A nice non-technical AI research paper.

(link)

9 scikit-learn 0.23.0 is available for download

Scikit-learn is a Python module for machine learning. Long-awaited Generalized Linear Models with non-normal loss functions are now available, estimators can now be visualized in notebooks, and there are more new features.

(link)
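For readers who want to try the two headline features of this release, here is a minimal sketch, assuming scikit-learn 0.23 or later and NumPy are installed. It fits `PoissonRegressor`, one of the new generalized linear models, and switches on the diagram display of estimators. The synthetic data and the parameter values are made up for illustration.

```python
import numpy as np
from sklearn import set_config
from sklearn.linear_model import PoissonRegressor

# Synthetic count data (made up for illustration): counts grow with the feature.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0, size=(200, 1))
y = rng.poisson(lam=np.exp(0.5 + 1.0 * X[:, 0]))

# Poisson GLM with a log link; alpha is the L2 penalty strength.
model = PoissonRegressor(alpha=1e-3, max_iter=300)
model.fit(X, y)

# Predictions are expected counts, so they are always positive.
predictions = model.predict(np.array([[0.0], [1.0], [2.0]]))
print(predictions)

# In a Jupyter notebook, this makes estimators render as an HTML diagram.
set_config(display="diagram")
```

A Poisson loss suits count targets better than the squared error of ordinary linear regression, because it never predicts negative counts. The diagram display only has a visible effect when the estimator is shown in a notebook cell.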

Read and share the FOSS ML Guide

The FOSS Machine Learning News Blog is a brief overview of open machine learning news from all over the world. Free and Open machine learning means that everyone must be able to develop, test, play with and deploy machine learning solutions. Read and share the FOSS ML Guide! And remember: you are invited to join the Free and Open Machine Learning open collaboration project.