Thursday, 12 January 2023

“Copenhagen Cowboy:” Neon, Noir, and Nefarious

NAB

article here

 

Danish director Nicolas Winding Refn says that the characters in his new Netflix series, Copenhagen Cowboy, are a “female evolution” of characters from previous projects such as Drive, Valhalla Rising and Only God Forgives.

So, that could only mean one thing: stylized ultra-violence.

“I’ve done films in the past with a certain type of character that was first played by Mads Mikkelsen in Valhalla Rising on one hand and then Ryan Gosling played him as a driver in Drive and then Vithaya [Pansringarm] played him as a lieutenant in Only God Forgives,” Refn explained during the Venice Film Festival premiere of the series, as reported by Diana Lodderhose at Deadline.

Copenhagen Cowboy is his take on a superhero show. He continued, “So, I was working with Robert Wade and Neal Purvis, on a larger female evolution of that character and then suddenly one night, I was like, ‘Maybe I should try to do a version of it as female and not just one but many.’ And that was the kind of aspiration to do it.”

Dubbed by critics as a neon-noir or acid western, though described by the Danish filmmaker as “poetic neo-noir,” the series, which launched on January 5, revolves around a young heroine called Miu (Angela Bundalovic) on a search for justice after a lifetime of servitude.

“I think that the [superhero] genre, like fairy tales… it’s a reflection of us as a society and it mirrors our desires and it’s our fantasies and it’s everything that’s really interesting because it’s heightened reality,” Refn noted.

It’s not for everyone.

“[Y]our enjoyment of Copenhagen Cowboy will go as far as you can tolerate Refn’s visual aesthetic,” Sean Price writes in his review for The Spool. “The primary colors that paint the entire frame with a neon glow, the pulsing Cliff Martinez score, and of course, the Miami Vice font.”

But even Price acknowledges the show’s vibe is not without its virtues.

“There may not be substance behind the style, but it goes a long way when your style includes Cliff Martinez,” whose score he says “does most of the emotional heavy lifting” for Copenhagen Cowboy.

Fleeing Hollywood for Freedom

The show was shot in Denmark, produced by his wife Liv Corfixen, and features his daughters Lola and Lizzielou Corfixen. It’s also a product of the pandemic and streaming’s content creation boom.

Refn told Deadline‘s Crew Call podcast that he pitched the idea to the newly formed Netflix Nordic when he “really didn’t know how the world was going to turn out.” Netflix Nordic was sold on the idea of a new narrative borne out of his Pusher trilogy, and after a five-month stint with an all-female writing team, Copenhagen Cowboy was brought to set.

“We had a great crew and, in a way, there is something very easy about working in the Scandinavian model because we are not so many people and I like that kind of smaller components of productions and so forth. It was just very pleasant.”

In an interview with Anthony D’Alessandro for Deadline, Refn was asked if he ever considered making a mainstream superhero project.

“I’ve always cherished my independence,” he replied. “I think waking up in the morning and going to work and paint the way you want it to look and go home, is still the most satisfying experience ever.

“If you don’t have the power of control at the end of the day or the ability to manipulate into your favor, it is committee. You have to spend your entire day struggling to get a compromise across, then what example am I to my own kids?”

He added that he thinks the studio system is not in good shape: “Hollywood is very seductive and intoxicating, but it’s also a system that’s falling apart desperately. And I think they’re doing it to themselves more than anything else.”

Making a Netflix Hit and Surviving for Season 2

A cynic might wonder if Refn’s analysis of Hollywood is in some way influenced by studios that are less apt to write him a blank check for a niche production. The Ringer’s Miles Surrey describes his previous lavish budgets as “a blank check that came out of nowhere and wasn’t necessarily earned.”

Surrey writes, “Whether or not Refn moved to the small screen because he was no longer finding any takers for his feature films, his divisive style is an intriguing fit for the stretched-out length of a TV show.”

But Refn’s Netflix Nordic endeavor seems a bit more right-sized to Surrey. He writes, “Copenhagen Cowboy should be more accessible—and presumably far cheaper to produce—than Refn’s grand Amazon experiment” (meaning his 13-hour Too Old to Die Young).

None of these comments mean that Surrey is panning the show, however. “This is as challenging as television can get, and while it won’t be everyone’s cup of tea, there’s no denying that Refn is utterly singular in his image making. To appreciate a Refn project like Copenhagen Cowboy is to accept that, sometimes, style wins out over substance.”

But even Netflix and its competitors are getting more ruthless, and it will be interesting to see if the streamer deems the show a success and, therefore, worthy of a second season.

“[T]he way the story leaves off, it’s clear these six chapters have been planned as part of a multi-season arc, should the Netflix gods be feeling generous,” notes The Hollywood Reporter’s Angie Han.

Collider’s Chase Hutchinson is a fan of the unconventional series but nevertheless isn’t holding out much hope for continuing Miu’s storyline: “[T]he series is, to be frank, rather unlikely to find the broadest of audiences which is crucial in a ruthless streaming world ruled increasingly by metrics. Still, no matter what happens in the future, the mere presence of such a show is worth celebrating.”

For Refn’s part, it’s clear he hasn’t gone all-in on a streaming-centric model. He told reporters, “I don’t think theatrical [cinema] will ever go away. I think theatrical will always exist, but it needs to be challenged in order to become better, more sufficient and more meaningful.”

 


“Aftersun:” How Do You Remake Memories?

NAB

Sight and Sound, the prestigious international film magazine, selected Charlotte Wells’ debut feature Aftersun as the Best Film of 2022.

article here

Inspired by, but not based on, the director’s experiences as the child of young parents, the ’90s-set film stars newcomer Frankie Corio as Sophie, an 11-year-old girl on a package holiday to Turkey with her father Calum (Paul Mescal).

The film, which also won seven British Independent Film Awards, is described by the magazine as an “exquisitely subtle yet deeply affecting and honest depiction of mental illness, father-daughter love, and memory.”

Developed and produced with the support of the BFI Film Fund, using funds from the National Lottery, Aftersun was one of the most talked about films at this year’s Cannes Film Festival and was picked up for international distribution by A24. 

IndieWire’s Eric Kohn judged it “the most evocative look at an adolescent gaze coming to terms with the adult world since ‘Moonlight.’”

Several critics compare the way Aftersun paints its characters’ interior lives to the work of Moonlight director Barry Jenkins. Not coincidentally, perhaps: Jenkins and his producing partner Adele Romanski served as producers on the film.

The 35-year-old Wells was born and raised in Edinburgh but moved to the US in 2012 to study film at New York University. There, her standout short films, including Laps and Blue Christmas, caught the attention of Romanski, who encouraged Wells to develop the script.

“Her short films were pretty fucking brilliant,” Romanski tells Kohn. “I was curious to hear what she was working on and how the storytelling style for her shorts would translate into that longer format. Then we waited patiently for years.”

That was in 2018. Wells finally retreated into a two-week writing frenzy in 2019, but held onto her first draft for another half a year before sending it to Romanski. “I spent six months pretending to rewrite but in actual fact just spellchecking it over and over again,” she said.

Her film is very much about memory — how certain moments stay with us forever, but also how our interpretation of events can differ from what actually happened. The story’s “beautiful elusiveness — its accumulation of seemingly inconsequential fragments that gradually accrue in emotional power” per Tom Grierson in the LA Times — makes it a difficult movie to encapsulate, even for its maker.

Deadline’s Damon Wise isn’t the only interviewer to observe Wells appearing “somewhat shell-shocked by her film’s progress in the world.” She adds, “I’m actually a little in awe of the fact that this film has — and could — reach so many people.”

That’s perhaps because, as she tells Marshall Shaffer of Slant Magazine, “Mental health struggles are messy, symptoms overlap and diagnoses are often [incorrect]. It’s incredibly difficult to pinpoint many mental illnesses.”

Of the film’s deliberate ambiguity, Wells tells Alex Denney of AnotherMag, “I think inherent in whatever style it is that I have there is space for people to bring their own experiences. It’s both conscious and not: I think when you avoid a certain kind of exposition it does create ambiguity and people will fill that ambiguity with their own experiences, their own reference points that they enter the cinema with.”

Withholding information “is kind of the point of the film” she tells IndieWire. “I think the ambiguity is inherent in the subtlety and my aversion to exposition. But for me, the answers are all in the film.”

Her reticence to talk in concrete terms about her work is also a warning not to label it an autobiography. “It’s very much fiction, but rooted in experience and memory,” she reveals to Denney. “It’s personal in that the feeling is mine and I allowed my own memories and anecdotes through all of childhood to form the kind of skeleton outline [of the first draft]. But after that point it did become very much about the story I was trying to tell, and that frequently required pushing it away from my own experience.”

Cinematographer Gregory Oke shoots on lush 35mm and partially masks Calum’s appearance throughout the film, rendering him a semi-ghostly presence.

“We worked hard to keep Calum at arm’s length, to keep more physical distance between him and the camera in order to create the feeling that he is in some sense unknowable,” Wells tells Denney.

Interspersed throughout the narrative is a jarring, dreamlike rave sequence, which finds the adult Sophie confronting her father under strobe lights on a crowded dancefloor.

“In a lot of ways, there was a mystery to the process of discovering exactly what this was,” Wells explains to IndieWire. “So much of the process found its way into the film. The process of rooting through the past and memories and allowing some to rise to the surface, transforming them or reframing them.”

Noting Aftersun’s impressionistic style, Deadline wonders whether Wells achieved that by taking things away in the edit, or scripting it.

“Both,” is her reply. “I didn’t shoot anything I didn’t want to be in the film. But there is plenty that isn’t in the final cut, that was lost in service of the edit. There were discoveries in the edit that were originally just strategies that we used to solve problems but which ended up being quite a meaningful strategy in terms of creating a sense of memory.”

The way Aftersun deceptively drifts from scene to scene — punctuated by meditative cutaways such as a shot of a person’s hand or a random passerby yelling at their child — is painstakingly crafted.

“Some of [those shots] were whole scenes reduced to an image,” Wells tells IndieWire. “Some were details in the script, and some were discovered on set based on months, if not years, of conversations with my cinematographer.”

When it’s suggested the deft execution of Aftersun feels like a magic trick, she demurs. “I don’t have an answer as to what it is,” she says. “We didn’t set out to pull off an emotional heist.”


Wednesday, 11 January 2023

Tech Resolutions for 2023: Cement Digital Trust With Blockchain

NAB

The blockchain is becoming key not only to developing and monetizing digital assets but also to creating digital trust so profound it could be an antidote to our collective diminishing faith in government, media, money, businesses, and other civic and private institutions.

article here

This is a macro tech trend identified by Deloitte in its major end-of-year report, which highlights new technologies and approaches that stand to become the norm within the next 18 to 24 months and projects where these trends could be headed over the coming decade.

The global shift of computing to the cloud and to the edge has not only decentralized the systems of the internet but given rise to technologies and platforms rooted in the cryptographically secure blockchain. As organizations begin to understand blockchain’s utility, they’re realizing that building stakeholder trust could be one of its primary benefits.

“Digital ledger technologies and decentralized business models that achieve consensus through code, cryptography, and technology protocols are demonstrating that none of us is as trustworthy as all of us,” explains Mike Bechtel, Deloitte’s chief futurist. “In this world, digital natives are increasingly likely to demand higher-quality proof and higher order truth. Indeed, we anticipate tomorrow’s leaders to assert ‘chain or it didn’t happen.’ ”

This, of course, is the world of Web3, whose exponents envision a future “in which the loudest voices can’t overshadow a single, immutable version of the truth, based on public blockchains,” according to Deloitte.

Organizations of all stripes may be able to cement their credibility, the consultancy suggests, by helping “reinvent” a decentralized internet because in our current environment of ever-increasing mistrust, blockchain and Web3 could power “trustless” systems that decentralize data to rebuild trust.

Deloitte elaborates, arguing that digital trust issues today undermine confidence in traditional institutions and the technology that powers them.

Yet decentralized systems, applications, and business models add “a protective layer,” enabling organizations to close the digital trust gap by helping them create “a single version of irrefutable truth.” Such systems rely on cryptography- and code-driven consensus of systemwide users, rather than moderation by third-party intermediaries — without sacrificing data privacy.

The resulting shared, trusted record can be inspected by selected third parties but cannot be controlled by any single, central superuser.

Further benefits: A consortium of participants keeps the information up to date so that each participant maintains a copy of the updated, immutable database. People can securely store, share, and control their own tamper-proof credentials (such as personal health, education, voting records, etc.) in an encrypted digital wallet. Proof of identification stored in encrypted digital wallets could lead to more secure transactions.
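The tamper-evidence described above comes from chaining hashes: each entry’s hash covers the previous entry’s hash, so altering any record invalidates everything after it. A minimal stand-alone sketch of the idea (a toy, not any particular blockchain’s implementation):

```python
import hashlib
import json

def block_hash(record: dict, prev_hash: str) -> str:
    # Hash the record together with the previous block's hash,
    # chaining each entry to everything that came before it.
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()

def build_chain(records):
    chain, prev = [], "0" * 64  # genesis hash
    for rec in records:
        h = block_hash(rec, prev)
        chain.append({"record": rec, "prev": prev, "hash": h})
        prev = h
    return chain

def verify(chain) -> bool:
    prev = "0" * 64
    for blk in chain:
        if blk["prev"] != prev or block_hash(blk["record"], prev) != blk["hash"]:
            return False
        prev = blk["hash"]
    return True

chain = build_chain([{"credential": "diploma"}, {"credential": "health record"}])
assert verify(chain)

# Tampering with any earlier record breaks every later hash.
chain[0]["record"]["credential"] = "forged diploma"
assert not verify(chain)
```

In a real deployment, many independent participants each hold a copy and re-verify the chain, which is what turns this simple structure into the “single version of irrefutable truth” the report describes.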

What’s more, Deloitte claims, organizations can break down data silos to collaborate with external partners, unknown or untrusted parties, and competitors, without compromising privacy, confidentiality, security, or intellectual property.

This directly impacts media by validating something as genuine.

“In an era of deepfakes, AI-generated imagery, and alternative facts, seeing something with your own two eyes is not necessarily sufficient proof of the truth,” Deloitte notes.

“But if an entire community sees it on a public blockchain? Trustless, decentralized platforms could become an arbiter of truth: Chain or it didn’t happen.”

By changing how content is made, managed, protected, and monetized, the report continues, “Web3 could rescue us from Web2’s obsession with clicks and likes. A disintermediated web has the potential to transfer power from intermediaries to producers and consumers.”

Creators gain too. In Web2, “digital” is synonymous with “abundant.” Nearly all digital content is infinitely shareable, legally or not. The infinite supply of content drives demand (prices and consumer attention) toward zero.

Web3 changes that. By introducing the notion of “digital scarcity,” Web3 architectures offer creators an opportunity to reassert some ownership and control of their content, data, profiles, and identities, with the ability to manage and monetize them across multiple websites and platforms rather than creating multiple copies.

Creators could lock access to a song, video, or other intellectual property so it’s only accessible via smart contract and programmable money, with the potential for revenue to be shared in real time.
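The smart-contract idea above reduces to pro-rata payout logic executed automatically on each transaction. A hedged sketch of that split in plain Python rather than an actual contract language (the parties and shares are invented for illustration):

```python
def split_revenue(payment_cents: int, shares: dict) -> dict:
    # shares maps each rights-holder to a weight; payouts are pro-rata,
    # with rounding remainders assigned deterministically (largest share first).
    total = sum(shares.values())
    payout = {k: payment_cents * v // total for k, v in shares.items()}
    remainder = payment_cents - sum(payout.values())
    for k in sorted(shares, key=shares.get, reverse=True):
        if remainder == 0:
            break
        payout[k] += 1
        remainder -= 1
    return payout

# A $0.99 stream split between an artist, a producer, and a label.
print(split_revenue(99, {"artist": 50, "producer": 20, "label": 30}))
```

A real contract would run this logic on-chain and transfer programmable money on every sale; the point here is only that the revenue-sharing rule is code, executed without an intermediary.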

With consumers in charge of their own buying and browsing data, blockchain could significantly disrupt digital advertising, too. In addition to giving consumers control over their data and who uses it — in itself a massive disruption — it could also help eliminate advertising fraud caused by internet bots and domain spoofing, which one research firm estimated as costing global advertisers US$68 billion by the end of 2022.

“Amid a crisis of faith in which seeing isn’t believing, and people can’t tell the truth from a lie, many of us have been waiting on a superhero: a person, company, or technology that might somehow serve as an unimpeachable arbitrator to help us settle quarrels and distinguish fact from fiction.”

Decentralized, trustless architectures are beginning to teach us that we are the heroes we’ve been looking for; and that none of us, in fact, is as trustworthy as all of us.

 


Tech Resolutions for 2023: Learn to Trust Our AI Colleagues

NAB

We spent the last 10 years trying to get machines to understand us better. It looks like the next decade might be more about innovations that help us understand machines, Deloitte predicts in its end-of-year Future Trends report.

article here

Few business leaders doubt AI’s abilities to contribute to the team, and Deloitte says there’s plenty of evidence suggesting businesses that use AI pervasively throughout their operations perform at a higher level than those that don’t. But there’s a trust issue when implementing AI into the workforce. Specifically, enterprises have a hard time trusting AI with mission-critical tasks.

In short, if humans don’t trust machines to make the right call, the technology won’t be used.

“With AI tools increasingly standardized and commoditized, few businesses may realize true competitive gains from crafting a better algorithm,” the report states. “Instead, what will likely differentiate the truly AI-fueled enterprise from its competition will be how robustly it uses AI throughout its processes. The key element here, which has developed much slower than machine learning technology, is trust.”

Deloitte elaborates the argument: computers were once seen as more or less infallible machines that simply processed discrete inputs into discrete outputs.

As algorithms increasingly shoulder probabilistic tasks such as object detection, speech recognition, and image and text generation, the real impact of AI applications may depend on how much their human colleagues understand and agree with what they’re doing.

“What may matter in the future is not who can craft the best algorithm, but rather who can use AI most effectively.”

In that case, developing processes that leverage AI in transparent and explainable ways will be key to spurring adoption.

One of the biggest clouds hanging over AI today is its black-box problem. Because of how certain algorithms train, it can be very difficult, if not impossible, to understand how they arrive at a recommendation.

“Asking workers to do something simply because the great and powerful algorithm behind the curtain says to is likely to lead to low levels of buy-in.”
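One response to the black-box problem the report describes is to prefer models whose outputs decompose into inspectable parts. A toy illustration (the features and weights here are invented, not from the report): a linear scorer that ships its “why” with every answer.

```python
# Illustrative weights for a hypothetical content recommender.
WEIGHTS = {"watch_time": 0.6, "genre_match": 0.3, "recency": 0.1}

def recommend(features: dict):
    # Each feature's contribution is visible, so a human colleague can
    # see exactly why the score came out the way it did.
    contributions = {k: WEIGHTS[k] * features[k] for k in WEIGHTS}
    score = sum(contributions.values())
    return score, contributions

score, why = recommend({"watch_time": 0.9, "genre_match": 1.0, "recency": 0.5})
print(f"score={score:.2f}")
for feature, c in sorted(why.items(), key=lambda kv: -kv[1]):
    print(f"  {feature}: +{c:.2f}")
```

Real deployments lean on explainability tooling rather than hand-rolled linear models, but the principle is the same: a recommendation that arrives with its reasoning is far easier to buy into than one handed down by “the great and powerful algorithm.”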

How does this lack of trust manifest itself in the creative industries and their increasing use of generative AI tools like OpenAI’s DALL-E 2 image generator and GPT-3 text generator?

“In many cases, generative AI is proving itself in areas that were once thought to be automation-proof,” says Deloitte. “Even poets, painters, and priests are finding no job will be untouched by machines.”

That does not mean, however, that these jobs are going away, the report insists. “Even the most sophisticated AI applications today can’t match humans when it comes to purely creative tasks such as conceptualization, and we’re still a long way off from AI tools that can unseat humans in jobs in these areas.”

The prevailing approach to bringing in new AI tools is to position them as assistants, not competitors.

“Workers and companies that learn to team with AI and leverage the unique strengths of both AI and humans may find that we’re all better together,” says Deloitte. “Think about the creative, connective capabilities of the human mind combined with AI’s talent for production work. We’re seeing this approach come to life in the emerging role of the prompt engineer.”

As enterprises consider adopting these capabilities, they could benefit from thinking about how users will interact with them and how that will impact trust.

“Think of deploying AI like onboarding a new team member,” the consultancy advises. “We know generally what makes for effective teams: openness, rapport, the ability to have honest discussions, and a willingness to accept feedback to improve performance. Implementing AI with this framework in mind may help the team view AI as a trusted copilot.”

For some businesses, the functionality offered by emerging AI tools could be game-changing. But a lack of trust could ultimately derail these ambitions.

Deloitte also addresses the longer term future of AI, which it characterizes as “exponential intelligence.”

“Affective AI — empathic emotional intelligence — will result in machines with personality and charm,” says Mike Bechtel, Deloitte’s chief futurist. “We’ll eventually be able to train mechanical minds with uniquely human data — the smile on a face, the twinkle in an eye, the pause in a voice — and teach them to discern and emulate human emotions. Or consider generative AI: creative intelligence that can write poetry, paint a picture, or score a soundtrack.”

After that, we may see the rise of general-purpose AI: intelligence that has evolved from simple math to polymath. Today’s AI is capable of single-tasking, good at playing chess or driving cars but unable to do both. General-purpose AI stands to deliver versatile systems that can learn and imitate a collection of previously uniquely human traits.


Tech Resolutions for 2023: An Immersive Internet for the Enterprise

NAB

The way we interact with the internet is morphing from rectangular glass screens to something more immersive — and invasive. CES 2023 was a great place to explore the latest ideas in visual and vocal web interfaces, which are needed, says consultancy Deloitte, for companies to build business models activated in the metaverse.

article here

In its 2023 Future Trends report, Deloitte says business leaders should consider the metaverse not as a diminished proxy for in-person experiences but instead as an enriched alternative to email, text chat, and heads in square boxes. In other words, the metaverse is best thought of as a more immersive incarnation of the internet itself: “internet plus,” as opposed to “reality minus.”

Simultaneously, technology interaction is poised to progress from separate digital realities toward ambient computing, where users can look up from their devices at a world that synchronizes effortlessly with technology.

Deloitte asks us to consider the metaverse use case that has defined the market up to now: gaming. The entire digital gaming industry is expected to surpass $220 billion in revenue in 2023, more than streaming video, digital music, and e-books combined.

Specifically, the online gaming industry is poised to exceed $26 billion next year, boasting an audience of 1.1 billion gamers.

Crucially, these gamers often gather online not just for gameplay but for the social and commercial possibilities offered by the immersive internet.

A striking 82% of those attending live in-game events also made a purchase because of the event, either in the form of digital goods or physical merchandise.

Whether through gaming or other means, 25% of consumers could be spending at least one hour in the metaverse each day by 2026, while 30% of businesses are estimated to have products and services ready.

Such figures are “emphatic proof” for Deloitte that the economy of the immersive internet mirrors the physical world.

“Brands can charge a premium for providing a unique experience or signaling value to other consumers,” it states, before urging brands across industries to invest now “to meet today’s customers where they already are.”

Potential developments in internet interaction over the next decade include sensory expansion. Deloitte asks us to consider the possibility of one day “smelling a cake baking in the metaverse or, if you’re willing to lick a screen, tasting it.”

Startups such as OVR Technology are developing scent packs to connect to VR headsets, while others such as HaptX are building haptic gloves to deliver a sense of touch.

AR tools such as smart glasses and motion sensors can enable spatial interaction, allowing users to interact directly with physical data without creating a digital copy. For example, patrons can walk up to a restaurant wearing smart glasses and be treated to a display of hours, current promotions, and reviews. Or, by suppressing images in their glasses, a group of friends can attend a concert without seeing any of the city billboards in view.

The spatial web is likely to blur the lines between physical and virtual environments. Mike Bechtel, Deloitte’s chief futurist, says, “As reality itself increasingly comes online, digital content will be seamlessly woven into our physical spaces, inseparable from our shared personal and professional experiences.”

The next generation of devices may connect users to the metaverse without requiring additional headsets or handheld devices. Deloitte invites us to imagine stepping into a media room that displays the metaverse as a hologram across the walls.

“Or imagine a laptop that uses cameras to translate an employee’s real-life gestures into an avatar’s movement in the virtual workplace.”

These are ambient experiences, Bechtel explains, in which ubiquitous digital assistants monitor the environment, awaiting a voice, gesture, or glance, reacting to (or proactively anticipating) and fulfilling our requests.

What about mind control? Brain-computer interfaces (BCIs) represent an extreme in simplifying user interactions with technology, but noninvasive BCI technology is already finding its way into AR/VR headsets. Today’s smart thermostats accept voice control; tomorrow’s will know you feel chilly and proactively adjust to ensure your comfort.

Neural interfaces that afford direct communication between biological thought and digital response “should eventually allow users to control digital avatars and environments using thoughts.”


Tech Resolutions for 2023: Taming the Multicloud Chaos

NAB

Businesses are using services from multiple cloud platforms, and managing them has become a significant problem. A solution could be a metacloud — one cloud to rule over them all — but, as you can imagine, this has problems too.

article here

In its latest Future Trends report, consultancy Deloitte explains that the vast majority of enterprises are using multiple platform-as-a-service tools, and as many as 85% are using two or more cloud platforms. A quarter are using at least five cloud platforms.

It has, in the words of the report, “created a tangled web of cloud tools that are sometimes interconnected but just as often redundant and create holes in security.” According to the analysts, this situation is unlikely to change anytime soon. Solution teams want to use what they perceive to be the best tool for the job, regardless of which cloud it’s in. Nor do they want to be subject to the availability of tools within a single vendor’s walled garden. Yet they can end up paying for cloud services they don’t use.

To simplify multicloud management, enterprises are beginning to turn to a layer of abstraction and automation that offers a single pane of control.

Known alternately as metacloud or supercloud, this family of tools and techniques can help cut through the complexity of multicloud environments by providing access to common services such as storage and computation, AI, data, security, operations, governance, and application development and deployment.

“Metacloud offers a single pane of control for organizations feeling overwhelmed by multicloud complexity,” says Deloitte.

This layer sits above an organization’s various cloud platforms, leveraging native technical standards through APIs. The idea is that applications still enjoy the strong security of the cloud provider, but in a consistent manner with centralized control.
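The “single pane of control” idea can be pictured as an adapter layer: one interface, many provider back ends. A minimal sketch, with an in-memory stand-in where a real adapter would call a vendor’s native API (S3, GCS, Azure Blob):

```python
from abc import ABC, abstractmethod

class CloudStorage(ABC):
    """Common interface the metacloud layer exposes over each provider."""
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryProvider(CloudStorage):
    # Stand-in for a real provider adapter that would translate these
    # calls into the vendor's native storage API.
    def __init__(self):
        self._objects = {}
    def put(self, key, data):
        self._objects[key] = data
    def get(self, key):
        return self._objects[key]

class MetaCloud:
    """Single pane of control: one API, routed to many providers."""
    def __init__(self, providers: dict):
        self.providers = providers
    def put(self, provider: str, key: str, data: bytes):
        self.providers[provider].put(key, data)
    def get(self, provider: str, key: str) -> bytes:
        return self.providers[provider].get(key)

meta = MetaCloud({"aws": InMemoryProvider(), "gcp": InMemoryProvider()})
meta.put("aws", "report.txt", b"q3 numbers")
assert meta.get("aws", "report.txt") == b"q3 numbers"
```

Each object still lives in (and is secured by) its home cloud; the abstraction only centralizes how teams reach it, which is the consistency-with-centralized-control trade the report describes.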

While this makes sense on purely technical grounds, the question is whether the market will support it.

While a compatibility layer has clear benefits for users, it naturally leads to the commoditization of the cloud providers (Amazon, Google, Microsoft) which may not be in their interests.

“History suggests, however, that metacloud may only be an interim solution,” says Deloitte. “Past efforts to rein in sprawling data centers, databases, and operating systems have ultimately resulted in consolidation, centralization, standardization, and rationalization — not via middleware or orchestration engines, but with refactoring and simplicity.”

What could end up taking the place of metacloud is “a more tactical approach,” Deloitte suggests, “one that borrows the centralization and control of metacloud but leaves in place the freedom developers currently have to choose the right tool for the job.”

This tactical metacloud could govern provisioning of cloud credentials and allocate resources only to users that have a valid business case and the technical knowhow to make use of cloud resources without creating complexities.

“Multicloud may feel messy, but it’s the world we’re living in, and likely will be for the foreseeable future,” the report warns. “Smart business and technology leaders should look for areas to reduce complexity wherever possible — potentially through approaches like metacloud — and eliminate security and redundancy problems created by maintaining multiple cloud instances.”


Tuesday, 10 January 2023

How Diffusion Drives Generative AI

NAB

Text-to-image AI exploded last year as technical advances greatly enhanced the fidelity of art that AI systems could create. At the heart of these systems is a technology called diffusion, which is already being used to auto-generate music and video.

article here

So what is diffusion, exactly, and why is it such a massive leap over the previous state of the art? Kyle Wiggers has done the research at TechCrunch.

We learn that earlier forms of AI technology relied on generative adversarial networks, or GANs, which proved pretty good and powered the first deepfake apps. For example, StyleGAN, an NVIDIA-developed system, can generate high-resolution head shots of fictional people by learning attributes like facial pose, freckles and hair.

In practice, though, GANs suffered from a number of shortcomings owing to their architecture, says Wiggers. The models were inherently unstable and also needed lots of data and compute power to run and train, which made them tough to scale.

Diffusion rode to the rescue. The tech has actually been around for a decade but it wasn’t until OpenAI developed CLIP (Contrastive Language-Image Pre-Training) that diffusion became practical in everyday applications.

CLIP classifies data — for example, images — to “score” each step of the diffusion process based on how likely it is to be classified under a given text prompt (e.g. “a sketch of a dog in a flowery lawn”).

Wiggers explains that, at the start, the data has a very low CLIP-given score, because it’s mostly noise. But as the diffusion system reconstructs data from the noise, it slowly comes closer to matching the prompt.

“A useful analogy is uncarved marble — like a master sculptor telling a novice where to carve, CLIP guides the diffusion system toward an image that gives a higher score.”
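That guidance loop can be caricatured in a few lines: start from noise, propose small refinements, and keep whichever candidate the guide scores highest as the noise budget shrinks. Here the `score` function is a stand-in for CLIP, and the whole thing is a toy hill-climb rather than a real diffusion sampler:

```python
import random

def score(candidate, target):
    # Stand-in for CLIP: higher means "closer to the prompt."
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))

def guided_denoise(target, steps=200, proposals=8, seed=0):
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in target]  # start from pure noise
    for step in range(steps):
        sigma = 0.5 * (1 - step / steps)  # noise budget shrinks over time
        candidates = [x] + [
            [xi + rng.gauss(0, sigma) for xi in x] for _ in range(proposals)
        ]
        # Keep whichever candidate the guide scores highest.
        x = max(candidates, key=lambda c: score(c, target))
    return x

target = [0.2, -0.7, 1.1]
result = guided_denoise(target)
assert score(result, target) > score([0.0, 0.0, 0.0], target)
```

Real diffusion models denoise with a learned neural network and steer with CLIP gradients rather than random proposals, but the shape of the process is the same: repeated small refinements, each nudged toward whatever the scorer says better matches the prompt.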

OpenAI introduced CLIP alongside the image-generating system DALL-E. Since then, it’s made its way into DALL-E’s successor, DALL-E 2, as well as open source alternatives like Stable Diffusion.

So what can CLIP-guided diffusion models do? They’re quite good at generating art — from photorealistic imagery to sketches, drawings and paintings in the style of practically any artist.

Researchers have also experimented with using guided diffusion models to compose new music. Harmonai, an organization with financial backing from Stability AI, the London-based startup behind Stable Diffusion, released a diffusion-based model that can output clips of music by training on hundreds of hours of existing songs. More recently, developers Seth Forsgren and Hayk Martiros created a hobby project dubbed Riffusion that uses a diffusion model cleverly trained on spectrograms — visual representations — of audio to generate tunes.

Researchers have also applied diffusion to generating videos, compressing images, and synthesizing speech. Diffusion may eventually be replaced by a more efficient machine learning technique, but the exploration has only just begun.