What the Tech!

Welcome to What the Tech!, a space where our Creative Technologist James Pollock shares updates that have piqued his interest in the world of tech.

Katie Hubbard

Stable Diffusion 3

The big news this week was that Stability AI have released the weights for the latest iteration of their image generation model, Stable Diffusion 3.

What does this mean? Well, by releasing the weights, Stability is making their Stable Diffusion 3 model available for individuals and organisations to download and use on their own computers. Given the number of tools and extensions that have been developed to support Stable Diffusion, and the ability to “fine-tune” these models with additional training, many image generation platforms and other diffusion-based tools use Stable Diffusion under the hood.
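To make that concrete, here’s a minimal sketch of running Stable Diffusion 3 locally with Hugging Face’s diffusers library. It assumes a CUDA GPU, a recent diffusers release, and that you’ve accepted Stability’s licence for the gated weights on Hugging Face; the model ID below refers to the medium checkpoint, and the prompt is just an example.

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Download the SD3 medium weights (gated: requires accepting the licence
# on Hugging Face) and move the pipeline to the GPU in half precision.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# Generate a single image from a text prompt.
image = pipe(
    "a neon sign reading 'What the Tech!' on a rainy street",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("sd3_output.png")
```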

We’ve had extensive experience with Stable Diffusion since its initial release back in August 2022, diving into the power and potential of generative AI through our R&D efforts.

Last year, with the support of Digital Catapult and NVIDIA, we pursued in-depth R&D into Stable Diffusion through the MyWorld Challenge Call. Earlier this year, we worked with digital forensics company CameraForensics to help them explore the capabilities of Stable Diffusion and develop tools to aid in the detection of generated images.

So what’s new with Stable Diffusion 3? Stability claims the new model interprets prompts more faithfully, reproduces text and typography more reliably, and improves overall image quality and performance. Time will tell whether Stable Diffusion 3 proves as popular as previous versions.

Because models like Stable Diffusion have been trained on images scraped from the internet, there are many legal and ethical question marks over using them and other generative AI models in a professional commercial environment. As such, Lux Aeterna currently does not use these generative AI models in the creation and delivery of our VFX work. You can find out more about our AI use policy here.

Katie Hubbard

Estimation Models

Today I’m showing how powerful estimation models can change how we work with images and video by giving us accurate approximations of properties of an image, such as depth and surface normals.

Depth Anything

Depth Anything is a monocular depth estimation model developed by researchers at HKU, TikTok, CUHK and ZJU. This model creates a depth representation of an image based on an estimation of how near or far away objects in the image are from the camera. These kinds of depth images can be helpful in many kinds of visual effects work, like adding haze and fog to a scene, or dimensionalising archive photographs.
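Depth Anything checkpoints are published on Hugging Face and slot into the standard transformers depth-estimation pipeline, so a minimal sketch looks something like this (the input filename is hypothetical, and the model ID is one of the smaller published checkpoints):

```python
from transformers import pipeline
from PIL import Image

# Load one of the published Depth Anything checkpoints.
depth_estimator = pipeline(
    "depth-estimation",
    model="LiheYoung/depth-anything-small-hf",
)

image = Image.open("plate.jpg")  # hypothetical input frame
result = depth_estimator(image)

# result["depth"] is a greyscale PIL image of the estimated depth map;
# result["predicted_depth"] holds the raw tensor if you need the values.
result["depth"].save("plate_depth.png")
```

A map like this can then be brought into a compositing package as a Z-depth pass, for instance to drive a fog or haze element.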

SwitchLight

SwitchLight takes this concept of estimating properties of a given image and, by utilising multiple models built for different estimation tasks, creates a whole toolkit for relighting video. As you can imagine, these sorts of tools could be invaluable to compositors, who often have to take things like green screen footage and make it work in a different scene.

SwitchLight: Production Ready AI Lighting (CVPR 2024)
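SwitchLight itself is a commercial product, so the sketch below is only a toy illustration of the underlying idea: given per-pixel surface normals and albedo estimated by upstream models, you can re-shade an image under a new light with simple Lambertian shading. The input files are hypothetical, and this is not SwitchLight’s actual pipeline.

```python
import numpy as np
from PIL import Image

# Hypothetical inputs: maps produced by upstream estimation models.
normals = np.asarray(Image.open("normals.png"), dtype=np.float32)[..., :3]
albedo = np.asarray(Image.open("albedo.png"), dtype=np.float32)[..., :3] / 255.0

# Decode the normal map from [0, 255] RGB into unit vectors in [-1, 1].
n = normals / 127.5 - 1.0
n /= np.linalg.norm(n, axis=-1, keepdims=True)

# A new light direction to relight with.
light = np.array([0.5, 0.5, 1.0], dtype=np.float32)
light /= np.linalg.norm(light)

# Lambertian diffuse term per pixel: max(0, N . L).
diffuse = np.clip(n @ light, 0.0, 1.0)[..., None]

relit = np.clip(albedo * diffuse * 255.0, 0.0, 255.0).astype(np.uint8)
Image.fromarray(relit).save("relit.png")
```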

Katie Hubbard

Gaussian Splatting

Gaussian Splatting has really shaken up the volumetric capture and rendering scene over the past year or so. This technology allows for intricate, detailed volumetric captures that also maintain qualities such as reflections and other view-dependent effects. They can also be quite lightweight, making them a great option for realtime applications in web and XR.
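To make that a little more concrete: a splatted scene is stored as millions of tiny translucent blobs, each carrying a handful of parameters. Here’s a rough sketch of the per-splat data, with illustrative field names rather than any particular library’s format:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class GaussianSplat:
    position: np.ndarray   # (3,) world-space centre of the gaussian
    rotation: np.ndarray   # (4,) quaternion orienting its covariance
    scale: np.ndarray      # (3,) per-axis extents of the covariance
    opacity: float         # blending weight used when splats are composited
    sh_coeffs: np.ndarray  # spherical-harmonic colour coefficients; these
                           # are what capture reflections and other
                           # view-dependent effects
```

Rendering then amounts to projecting these gaussians into screen space and alpha-blending them in depth order, which is part of why splats can stay lightweight enough for realtime web and XR use.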

There’s so much great work happening in this space right now, so I thought I’d share some examples:

GaussianAvatars
This research from the Technical University of Munich and Toyota shows morphable, animatable gaussian head scans being driven by a video of another head or manipulated through parameters.

GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

Infinite-Realities
This Ipswich-based volumetric capture studio has done some excellent work showcasing the potential of high quality motion scans.

Spatial Memories

Postshot
If you want to get started with gaussian splatting, Postshot’s a great toolkit for creating both NeRFs and gaussian splats. You provide a collection of photos or a video of a scene you want to turn into a 3D model, and Postshot trains a model on the input, giving you a live preview as it reconstructs the scene. What’s great about Postshot is you can use the software to bring your 3D captures into After Effects projects.

Postshot Beta Release v0.3

Ben Coath

Tools for Realtime Previz


VCam for Unreal Engine
This is a great tool by Epic that lets you use your phone as a virtual camera. You can create your camera moves physically, while seeing the scene on your phone screen.

Velox
A really interesting approach to bringing live action footage into Unreal. Capable of extracting and placing people or objects from footage, this could be a really quick and easy way to previsualise a virtual production shot before heading to a volume.

Simulon
Simulon uses AR to place 3D assets into the real world, where you can then film the results. It uses lighting information to better embed the augmented content, and could be useful for location scouting.
