Integrating Machine-learning Tools in Game Development Workflow

— How tools like DallE could be used in game development —

What are those Machine Learning tools?

If you did not hear about Dall-E or MidJourney by now, you definitely should. These tools mark an important milestone in the roadmap toward an independent-thinking robot. We are not there yet, but these systems show us a glimpse of the future.

Note that I am not a big supporter of the development of such technologies, and I believe other articles I wrote might express this better. But obviously, I am not the one who decides what each engineering team around the globe is doing, and since this tech is already here, apparently not everybody agrees with me : )

Anyway, these tools will ask you for a short description of something you think about, and they would provide you with a few original visualizations that match that description. These are original creations, rather than the results of a search engine.

“Who is the legal owner of an image created with a machine learning tool?

Extensive public discussion is happening nowadays around the morality of whether or not these creations can actually be counted as art. Another big question is who actually owns the intellectual property of a specific creation: is it the person who described it; the company who created the software, or perhaps even the software itself?If you are interested in this debate, here is a great place to start: an international WhatsApp group, where we discuss these topics and more.

Relevancy for game development

Until a decision will be taken, the one who activated the search is the happy owner. This fact opens a huge opportunity for all production teams around the globe.

“game development team working on art office oil painting old” with Dall-E

Game development is a very expensive process, compared to standard software development. Not only it includes significantly more assets (such as 3D models, characters, and animations), but it has a substantial effort on interactivity, which also separates it from any other media.

Components such as storytelling, leveling up, scoring, multiplayer, AI algorithms, and other mechanisms, are core functional requirements of a video game.

Above all, the team needs to work a lot in balancing the game, creating content and environments, while maintaining balance with previous content created. For most successful games, this is a neverending process in order to maintain profitability. So if there is a way to make this production cheaper or faster, it is definitely worth the investigation.

To integrate such a tool into our workflow, we should also understand how this will change our workflow. What are the redundant tasks we can remove, and which are the new ones we should care for.

Existing solutions

The following solutions already exist and are usable. Most of them are also free to use.

All work in a similar manner: you enter a textual description, and the software will generate a few results for you to download. That description is key.

Let’s see some examples.

Concept art

“futuristic modern city, flying cars, glass windows, white, top view” with Dall-E

Dall-E. offers 15 free searches per month, and each gets 4 results. You can download for free all your results.

“futuristic modern city, flying cars, glass windows, white, top view” with MidJourney

Mid-Journey. Offers endless searches during a subscription period. But as long as you are a free subscriber, anyone could use your creations.

“half elf half robot head and shoulders portrait…” with NightCafe

Night Cafe. Offers an advanced range of options in the search mechanism, as well as using different AI algorithms.

Videos

These tools are not yet fully usable as others noted here, but I felt it’s worth mentioning. The day is not far that someone will use this open-source code to create an easy-to-use open product.

Phenaki. Offers creation of long videos by describing each scene at a time.

Make-A-Video. Meta (formerly Facebook) has been working on creating a text to (short) video software. This tool will be available to create a short gifs from scratch or as improvements to static images.

Imagen. Google responded to Meta with this project. Looks like it has similar capabilities as Make-A-Video.

Generating 3D models

These are not ready yet fully functioning, but really worth mentioning.

Mirage. The tool mitigates several 3D model generators into one user interface.

Kaedim. Converts a 2D image into a 3D model with textures.

Dream Fusion. Code library was created to assist in generating 3D models.

Ambient music

Avia / SoundDraw / Amper. Creates beautiful long compositions according to defining mood and style of music. Download requires a subscription.

Voices and Speech

Typecast / Play HT. Create a voice-over speech with a nice variation on the intonation of the voice and intention. These are not the only tools out there, but these have a nice quality.

Writing

Jasper. Although I’m pretty skeptical about this tool, it claims to be able to write full articles, from just a description of the content of the next paragraph. Check it out.

Future solutions

“If a PC can design a car by itself, then it could probably do much more…

In 2017 I saw this Ted Talk. Although car design is not linked to video games, cars are very complex products to design and make (Believe me, as I have been working for the car industry for the past 3 years…). If PCs can design cars, considering all elements to be taken into account, it means that many other things can be possible.

The tools mentioned in the previous chapter were released over a period of 1 year. So you can imagine that the future has plans for more solutions coming our way. Let’s imagine some of them.

Sound effects

“sound engineer, DJ, creating music, room with cute aliens dancing, colorful, oil painting” with Dall-E

We have generated music coming, why won’t we have also sound effects? I’m quite sure it’s already on its way. Imagine how this would change the way we work. Today we spend time on search sites for a sound effect matching our sound style; or spending time and money with a sound designer, hoping they would understand the intention, follow our limitations, and have the right skill to perform… Instead, we would be spending some time with a software system until it creates the correct sound effect for us, under our set of limitations (length, style, and effect description).

Marketing video content

Imagine that someday you will be able to write a description for how you want a 15–60 sec marketing video to look like, and let a computer generate it for you. It will get there.

How much does it cost to make a marketing video for the game launch? How much does it mean for the actual success of the game? Sadly, quite a lot as well… Well, this method should allow us to create these videos with less effort, faster, and at less cost.

Storytelling and dialogues

Telling a good story has rules. But writers still need time to write their plots, dialogues, and various options. Assuming AI writing tools already exist, I believe that we will see tools that help us create a “Disney-like story with a hero flower fairy that must stop the evil Ant from destroying the world”.

Environment design and architecture

AI and machine learning are already being used for interior design and architecture. You create a space and define the purpose of that space (such as an office with 5 rooms or a living space with 1 bedroom). The software will sort out an interior division for you.

Imagine what if you could ask a PC to generate a world map according to some description you provide. Such as: 10 small islands, 5 big islands, and bridges connecting them; or a large wasteland with limited resources in a post-apocalyptic arena.

We invest many man hours to design such tasks. We start with concept boards, drafting options, briefing artists, testing scale, 3D modeling, textures, animations, and more.

We can already generate 3D models, art, and music. Creating full-concept environments is probably the next step.

Here is an example of a plugin in development that generates a city environment: