Facebook Develops New AI Model that Can Anticipate Future Actions

Nov 18, 2021 · 1 min read

by Daniel Dominguez
Facebook has unveiled its latest machine-learning model, the Anticipative Video Transformer (AVT), which predicts future actions from visual observation. AVT is an end-to-end attention-based model for action anticipation in video.
The new model builds on recent breakthroughs in transformer architectures, particularly in natural language processing and image modeling, for applications ranging from self-driving cars to augmented reality.
AVT analyzes an ongoing activity and anticipates its likely outcome, a capability aimed especially at AR and the metaverse. Facebook plans for its metaverse apps to work across other platforms and hardware, through APIs that allow programs to talk to each other.
Anticipating future activities is a difficult problem for AI, since it requires both predicting the multimodal distribution of possible future actions and modeling the progression of past actions.
Because AVT is attention-based, it can process a full sequence in parallel, whereas recurrent-neural-network-based approaches must process sequences step by step and often forget the past. AVT also features loss functions that encourage the model to capture the sequential nature of video, which would otherwise be lost in attention-based architectures such as nonlocal networks.
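The article does not detail these loss functions. As a rough illustration only, the sketch below (PyTorch; the function, tensor names, and the specific loss terms are assumptions, not AVT's published objectives) shows one common way to make an attention model respect temporal order: supervise each frame's representation with the next action and with the next frame's features.

```python
import torch.nn.functional as F

def sequential_losses(frame_feats, head_out, action_logits, action_labels):
    """Hypothetical anticipation losses (illustrative only, not AVT's exact objectives).
    Shapes: frame_feats, head_out -> (B, T, D); action_logits -> (B, T, C); action_labels -> (B, T).
    """
    # The prediction made at frame t is supervised by the action observed at frame t+1.
    cls_loss = F.cross_entropy(action_logits[:, :-1].flatten(0, 1),
                               action_labels[:, 1:].flatten())
    # The representation at frame t should also predict the backbone features of frame t+1.
    feat_loss = F.mse_loss(head_out[:, :-1], frame_feats[:, 1:].detach())
    return cls_loss + feat_loss
```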
AVT consists of two parts: an attention-based backbone (AVT-b) that operates on frames of video and an attention-based head architecture (AVT-h) that operates on features extracted by the backbone.
The AVT-b backbone is based on the vision transformer (ViT) architecture. It splits frames into non-overlapping patches, embeds them with a feedforward network, appends a special classification token, and applies multiple layers of multi-head self-attention. The head architecture takes the per-frame features and applies another transformer with causal attention, meaning that it attends only to features from the current and preceding frames. This allows the model to rely solely on past features when generating the representation of any individual frame.
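A minimal sketch of this two-part structure, assuming a PyTorch implementation (the class and parameter names below are invented for illustration and do not come from Facebook's code), pairs a per-frame ViT-style backbone with a transformer head that uses a causal attention mask:

```python
import torch
import torch.nn as nn

class FrameBackboneSketch(nn.Module):
    """Rough stand-in for the AVT-b idea: split each frame into non-overlapping
    patches, embed them, add a classification token, and apply several layers
    of multi-head self-attention."""
    def __init__(self, img_size=224, patch=16, dim=768, heads=8, depth=4):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)  # per-patch linear embedding
        num_patches = (img_size // patch) ** 2
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, frames):                                   # (B*T, 3, H, W)
        x = self.patch_embed(frames).flatten(2).transpose(1, 2)  # (B*T, num_patches, dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        return self.encoder(x)[:, 0]                             # classification token as the frame feature

class CausalHeadSketch(nn.Module):
    """Rough stand-in for the AVT-h idea: a transformer whose causal mask lets
    frame t attend only to frames 0..t, so each representation uses past context only."""
    def __init__(self, dim=768, heads=8, depth=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, feats):                                    # (B, T, dim)
        t = feats.size(1)
        causal_mask = torch.triu(torch.ones(t, t, dtype=torch.bool), diagonal=1)
        return self.encoder(feats, mask=causal_mask)             # (B, T, dim), past-only per frame
```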
AVT could be used as an AR action coach or as an artificial-intelligence assistant that warns people before they make mistakes. Beyond anticipation, AVT could also help with tasks such as self-supervised learning, the discovery of action schemas and boundaries, and general action recognition in settings that require modeling the chronological sequence of actions.
