Average
Above Average
100

Who was the systems lead for Sora?

Connor Holmes

100

Who was the executive producer of Sora?

Aditya Ramesh.

200

What was the background picture of the credits slide?

Tons of painting and artwork on walls.

200

What was the background of the first slide?

A woman walking through a Tokyo street.

300

What fears do people have about Sora?

That it will replace animation workers, content creators, and go rogue.

300

What is Sora? Answer in near-exact words to those on slide 2.

Sora is a text-to-video model that can generate videos up to a minute long whilst maintaining lifelike quality and adherence to the user’s prompt.

400

What is fidelity viewing?

Fidelity viewing is a higher-quality mode of viewing, with ray tracing, at a reduced frame rate of typically 15–30 fps.

400

In near-exact words, what was the prompt for the video on slide 2?

Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.

500

Name 3 of the ways in which Sora is researched.

They specifically train text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios. They also leverage a transformer architecture that operates on spacetime patches of video and image latent codes. Sora, their largest model, is capable of generating a minute of high fidelity video. Their results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world. Any 3 of these.

500

Name ALL of the ways in which Sora is researched.

They specifically train text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios. They also leverage a transformer architecture that operates on spacetime patches of video and image latent codes. Sora, their largest model, is capable of generating a minute of high fidelity video. Their results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world.

Menu