The PyTorch3D Framework

Overview

PyTorch is one of the leading deep learning frameworks for both research and production projects. Its ease of use, Pythonic syntax, and extensibility make it a favorite among machine learning practitioners, including for 3D machine learning (3D ML) research.

Readers with no PyTorch experience should still be able to follow along, given an understanding of the basics of machine learning. In addition, we’ll be using PyTorch3D, a library built atop PyTorch specifically to handle 3D deep learning data.

What is PyTorch3D?

PyTorch3D is a library built atop PyTorch with GPU-optimized implementations of common components used in 3D deep learning and computer vision. It includes efficient heterogeneous batching operators, a differentiable mesh renderer, common loss functions, I/O support for common 3D formats such as OBJ, OFF, PLY, and glTF, and even early support for implicit representations used in novel view synthesis, such as Neural Radiance Fields (NeRFs). We can solve many problems with it, including mesh deformation, bundle adjustment, learning mesh textures from images, fitting NeRFs, and more.

Overview of the PyTorch3D API

The API covers all stages of the 3D machine learning lifecycle, including dataset loaders (helper classes that support iteration over a dataset), common operations, renderers (functions that convert 3D data into images), and loss functions. Each of the major modules is introduced below.

The `ops` module

This module includes many operations that are useful for constructing 3D ML model architectures, such as graph convolution with `graph_conv`, `perspective_n_points` (PnP), `iterative_closest_point` (ICP), and many more.

Note: Graph convolution is a variation of the convolution operation that is applied to graph data. The Perspective-n-Point algorithm estimates the pose of a camera from a list of 3D points and their corresponding 2D pixel coordinates. Iterative closest point is a technique for aligning a pair of point clouds.
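To make the idea of graph convolution concrete, here is a minimal, library-free sketch of a single step on a tiny graph. It does not call PyTorch3D. The helper name `graph_conv_step`, the scalar features, and the fixed weights `w0` and `w1` are all assumptions made for illustration. A real layer would apply learned linear maps to feature vectors, and the update rule used here (a vertex's own feature plus a weighted sum over its neighbors) is one common formulation, assumed for this sketch.

```python
# Library-free sketch of one graph-convolution step on scalar vertex features.
# Assumed update rule: out[i] = w0 * x[i] + w1 * sum(x[j] for neighbors j of i).
# graph_conv_step is a hypothetical helper, not a PyTorch3D function.

def graph_conv_step(features, edges, w0, w1):
    """Apply one undirected graph-convolution step.

    features: list of floats, one scalar feature per vertex
    edges:    list of (i, j) vertex-index pairs, treated as undirected
    w0, w1:   scalar weights for the vertex itself and for its neighbors
    """
    neighbor_sum = [0.0] * len(features)
    for i, j in edges:
        # Each undirected edge contributes in both directions.
        neighbor_sum[i] += features[j]
        neighbor_sum[j] += features[i]
    return [w0 * x + w1 * s for x, s in zip(features, neighbor_sum)]


# A single triangle: 3 vertices, 3 edges, so every vertex neighbors the other two.
features = [1.0, 2.0, 3.0]
edges = [(0, 1), (1, 2), (2, 0)]

out = graph_conv_step(features, edges, w0=1.0, w1=0.5)
print(out)  # [3.5, 4.0, 4.5]
```

Because a mesh's edges define exactly this kind of graph over its vertices, the same pattern lets per-vertex features be mixed along the surface of a mesh.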

The `renderer` module

The renderer module is where much of the magic happens in 3D ML. It provides a set of GPU-optimized differentiable rendering classes that we’ll often use to produce 2D renders of our 3D data. These include shaders, textures, lights, cameras, and rasterizers for mesh and point cloud data. Because these renderers are differentiable, we can propagate gradients all the way from the output renders back to the components of our scene, such as object position and rotation, texture colors, lighting values, and more.
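The following toy sketch illustrates what "propagating gradients from the output back to the scene" means, under heavy simplifying assumptions. The "renderer" here is just a one-dimensional pinhole projection of a single point, the gradient is written out by hand with the chain rule instead of using autograd, and the names `project` and `fit_x` are hypothetical. It is not how PyTorch3D renders. It only shows that an image-space error can be turned into an update to a scene parameter.

```python
# Toy example: recover a point's x position from where it lands in the image.
# Assumed "renderer": a 1D pinhole projection u = focal * x / z.
# project and fit_x are hypothetical helpers, not PyTorch3D functions.

def project(x, z, focal):
    """Project a point at lateral position x and depth z to image coordinate u."""
    return focal * x / z


def fit_x(target_u, x_init, z, focal, lr, steps):
    """Gradient descent on x to minimize the image-space loss (u - target_u)^2."""
    x = x_init
    for _ in range(steps):
        u = project(x, z, focal)
        # Chain rule: dloss/dx = dloss/du * du/dx
        #   dloss/du = 2 * (u - target_u)
        #   du/dx    = focal / z
        grad_x = 2.0 * (u - target_u) * (focal / z)
        x -= lr * grad_x
    return x


focal, z = 2.0, 4.0
true_x = 1.5
target_u = project(true_x, z, focal)  # the observed image coordinate

# Start from a wrong guess and let the image-space error pull x toward true_x.
x_fit = fit_x(target_u, x_init=0.0, z=z, focal=focal, lr=0.5, steps=100)
print(abs(x_fit - true_x) < 1e-6)  # True
```

A differentiable renderer plays the role of `project` for whole scenes, and an autograd framework such as PyTorch computes the chain-rule step automatically.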

The `structures` module

Meshes and point clouds are two of the data structures used to represent different kinds of 3D data. Meshes are often useful for authored 3D models. Point clouds are often produced by capture methods such as LiDAR scanning or photogrammetry. The structures module includes the `Meshes` and `Pointclouds` classes used to represent our 3D data. It also contains the bulk of the batching logic, along with helper functions to convert between the various types of heterogeneous batching.
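Heterogeneous batching is needed because the items in a batch rarely have the same number of points or vertices. The library-free sketch below shows two layouts commonly used for such batches, a padded layout and a packed layout, built from a plain list of point clouds. The helper names `to_padded` and `to_packed` are hypothetical and the data is stored in Python lists, not tensors. The sketch only illustrates the idea of converting between layouts and does not reproduce the structures module itself.

```python
# Library-free sketch of two batch layouts for point clouds of different sizes.
# to_padded and to_packed are hypothetical helpers, not PyTorch3D functions.

def to_padded(clouds, pad_value=0.0):
    """Pad every cloud to the size of the largest one so the batch is rectangular."""
    max_points = max(len(cloud) for cloud in clouds)
    pad_point = (pad_value, pad_value, pad_value)
    return [list(cloud) + [pad_point] * (max_points - len(cloud)) for cloud in clouds]


def to_packed(clouds):
    """Concatenate all clouds into one flat list and record where each one starts."""
    packed = []
    first_index = []
    for cloud in clouds:
        first_index.append(len(packed))
        packed.extend(cloud)
    return packed, first_index


# Two clouds with different point counts (2 and 3 points).
clouds = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (0.0, 2.0, 0.0), (0.0, 3.0, 0.0)],
]

padded = to_padded(clouds)
packed, first_index = to_packed(clouds)

print([len(cloud) for cloud in padded])  # [3, 3]
print(len(packed), first_index)          # 5 [0, 2]
```

The padded layout wastes some memory on filler points but gives a regular shape that batched operations can work on directly. The packed layout stores no filler but needs the start indices to recover each item.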
