18 Apr We trained a neural network on photos of Tottenham Hotspur …
Remember that time that we trained a computer to generate English football club names? Or how about the time I trained a predictive text algorithm on the Tottenham Hotspur writing of Barney Ronay? That was pretty great, wasn’t it?
WELL GUESS WHAT.
Neural networks and deep learning have come a long way in the past couple of years, and huge strides are being made in the field, as scientists and enthusiasts continue to develop computer programs that “think.” I’ve found them fascinating for a while now. Back when we developed the Recurrently Generated Football League, it was mostly restricted to text-based output, which was funny enough for our purposes. Now it has expanded to the realm of graphics, to the point that neural networks are generating photo-realistic portraits of fictional humans, based on a model trained on hundreds of thousands of photos of real people. These portraits are of people who do not exist, and they are almost indistinguishable from reality.
Well, that’s just great, I say. But if I fed a computer program the images of a bunch of footballers, could it create fictional Spurs players from another dimension?
I decided to find out.
Prompted, as ever, by the remarkable Janelle Shane and her blog aiweirdness.com, I used a new visual neural network called StyleGAN2 and tried to compile it on my nine-year-old MacBook Pro. It, uh, didn’t work. But what DID work was a separate program called Runway, which allows schlubs like me to tap into StyleGAN2 without having to compile anything finicky or use my own processing power. (You can try it yourself — there’s a free trial and subscription options.)
StyleGAN2 works best when you have a large number of photos to start with. I fired up a new Runway project and fed it all of the player headshots from Tottenham’s website, from the first team all the way down to first-year scholars — everything I could find. The advantage of this collection is that the photos are all pretty homogeneous — the players are all wearing the same kit, in the same pose, cropped in generally the same way. Computers find it easier to pick out patterns when things are structurally the same.
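One cheap way to confirm that a folder of headshots really is that uniform is to check that every file shares the same pixel dimensions before training. Here’s a minimal, hedged sketch that reads the width and height straight out of each PNG header (no imaging library needed); the folder path and function names are my own placeholders, not anything from Runway or StyleGAN2.

```python
import struct
from pathlib import Path


def png_size(path):
    """Return (width, height) parsed from a PNG file's IHDR chunk."""
    data = Path(path).read_bytes()
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError(f"{path} is not a PNG")
    # IHDR is always the first chunk: width and height are big-endian
    # unsigned 32-bit ints at byte offsets 16-23.
    return struct.unpack(">II", data[16:24])


def all_same_size(folder):
    """True if every .png in the folder has identical dimensions."""
    sizes = {png_size(p) for p in Path(folder).glob("*.png")}
    return len(sizes) <= 1
```

If `all_same_size` comes back False, the odd ones out are the photos most likely to smear the model’s idea of where a face should be.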
That resulted in about 60 photos or so — a paltry number compared to the minimum 500 photos you’re supposed to use for best results. To make up the difference, I just cloned the photos a few times to get a decent-sized dataset. The repeated photos might have some weird effects on the end product, but who cares — this is SCIENCE.
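For the curious, the “just clone them” trick is a one-liner’s worth of file copying. Here’s a rough sketch of how you might pad ~60 headshots up to a 500-image dataset; the directory names, target count, and `pad_dataset` helper are all hypothetical — Runway just wants a folder of images, however you fill it.

```python
import shutil
from pathlib import Path


def pad_dataset(src_dir, dst_dir, target=500):
    """Copy the source images repeatedly until dst_dir holds `target` files."""
    sources = sorted(Path(src_dir).glob("*.png"))
    if not sources:
        raise ValueError("no images found in source directory")
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    rounds = -(-target // len(sources))  # ceiling division
    written = 0
    for i in range(rounds):
        for p in sources:
            if written >= target:
                break
            # Suffix each clone so file names stay unique.
            shutil.copy(p, out / f"{p.stem}_copy{i}{p.suffix}")
            written += 1
    return written
```

Exact duplicates don’t add any new information, of course — they just stop the trainer from starving — which is presumably why the repeats can leave fingerprints on the output.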
A few button clicks and we’re on our way! Training the model took about four hours to complete with the standard 3000 iterations, but you get to sample it as it progresses. And along the way, that’s when things started to get weird. Training takes time, and computers don’t get it right the first time — things generally get better the longer you let the model run, and you can watch that happen as the training progresses.
Here’s a sample image from about 25% through the training. Make sure your kids don’t see this.
Oh. Oh god, what is that?! Where are their faces?! Dear Lord in Heaven, WHY?
You can see, however, how the model is progressing. The computer has already picked up on the similarities in the jerseys, recreating readable AIA logos and blobs that could be Tottenham crests. It has picked a median skin tone and even recognizes that photos of human beings have mouths, though it currently thinks they are overly large, with giant, slavering teeth. These are Tottenham players that haunt your dreams, and not in the nice “Lucas scored a hat trick in Amsterdam” way.
The good news is that things got better over time. A few hundred more iterations and we get this:
Progress! Those are definitely Tottenham home jerseys, and the players have both recognizable faces and hairstyles. There’s also some variation in both height and skin tone, even if the differences are minor.
I didn’t expect photo-realism from a small dataset, and I didn’t get it. But at the end of the 3000th iteration, I ended up with results that were, frankly, better than I anticipated.
So, some interesting points here — the computer model clearly doesn’t know what to do with eyes or mouths. In fact, the eyes are just… gone, either digitally “closed” or replaced with white holes into the vacuous, empty souls of these fake players. The mouths are also more or less the same across each photo, which makes sense because none of the players are smiling in the original photos. Interestingly, the inclusion of the academy and underage teams has noticeably skewed the generated players younger, probably because the facial features have been homogenized — all of the exported results look more or less like they could play for the U18s.
So the computer has a difficult time with faces, but there are definite differences in skin tone and hair style, and the kits are pretty sharp! There are clear variations in kit shape based on the body type of the generated player, but they’re all rendered in pretty fantastic detail. And that’s remarkable, since a not-insignificant percentage of the source material included the goalkeepers in their teal keeper kits — yet none of the results looked anything like a keeper.
Here’s a sampling of the results, in video form.
All right, that’s… something. But could it get better? I decided to continue training the model to see if the results improved with more time. And they did… in a way. About an hour into the expanded model training, I hit a high-water mark. The photos are obviously not perfect — the computer is still abjectly hostile to the very concept of eyes — but they’re as sharp as could be managed with the small dataset, and whoa, they’re actually recognizable as human. They’re also eerily familiar-looking. They almost make you want to shout FFS Mou, give ‘em a chance in the first team!