Meet the AI algorithms that judge how beautiful you are



I first came across Qoves Studio through its popular YouTube channel, which offers polished videos like “Does the hairstyle make a pretty face?,” “What makes Timothée Chalamet attractive?,” and “How jaw alignment influences social perceptions” to millions of viewers.

Qoves started as a studio that would airbrush images for modeling agencies; now it is a “facial aesthetics consultancy” that promises answers to the “age-old question of what makes a face attractive.” Its website, which features chalky sketches of Parisian-looking women wearing lipstick and colorful hats, offers a range of services related to its plastic surgery consulting business: advice on beauty products, for example, and tips on how to enhance images using your computer. But its most compelling feature is the “facial assessment tool”: an AI-driven system that promises to look at images of your face to tell you how beautiful you are—or aren’t—and then tell you what you can do about it.

Last week, I decided to try it. Following the site’s instructions, I washed off the little makeup I was wearing and found a neutral wall brightened by a small window. I asked my boyfriend to take some close-up photos of my face at eye level. I tried hard to not smile. It was the opposite of glamorous.

I uploaded the most bearable photo, and within milliseconds Qoves returned a report card of the 10 “predicted flaws” on my face. Topping the list was a 0.7 probability of nasolabial folds, followed by a 0.69 probability of under-eye contour depression, and a 0.66 probability of periocular discoloration. In other words, it suspected (correctly) that I have dark bags under my eyes and smile lines, both of which register as problematic with the AI.

My results from the Qoves facial assessment tool

The report helpfully returned recommendations that I might take to address my flaws. First, a suggested article about smile lines informed me that they “may need injectable or surgical intervention.” If I wished, I could upgrade to a fuller report of surgical recommendations, written by doctors, at tiers of $75, $150, and $250. It also suggested five serums I could try first, each featuring a different skin-care ingredient—retinol, neuropeptides, hyaluronic acid, EGF, and TNS. I’d only heard of retinol. Before bed that night I looked through the ingredients of my face moisturizer to see what it contained.

I was intrigued. The tool had broken my appearance down into a list of bite-size issues—a laser trained on what it thought was wrong with my appearance.

Qoves, however, is just one small startup with 20 employees in an ocean of facial analysis companies and services. There is a growing industry of facial analysis tools driven by AI, each claiming to parse an image for characteristics such as emotions, age, or attractiveness. Companies working on such technologies are a darling of venture capital, and such algorithms are used in everything from online cosmetic sales to dating apps. These beauty scoring tools, readily available for purchase online, use face analysis and computer vision to evaluate things like symmetry, eye size, and nose shape to sort through and rank millions of pieces of visual content and surface the most attractive people.

These algorithms train a sort of machine gaze on photographs and videos, spitting out numerical values akin to credit ratings, where the highest scores can unlock the best online opportunities for likes, views, and matches. If that prospect isn’t concerning enough, the technology also exacerbates other problems, say experts. Most beauty scoring algorithms are littered with inaccuracies, ageism, and racism—and the proprietary nature of many of these systems means it is impossible to get insight into how they really work, how much they’re being used, or how they affect users.

Qoves recommended certain actions to fix my “predicted flaws”

“Mirror, mirror on the wall …”

Tests like the ones available from Qoves are all over the internet. One is run by the world’s largest open facial recognition platform, Face++. Its beauty scoring system was developed by the Chinese imaging company Megvii and, like Qoves, uses AI to examine your face. But instead of detailing what it sees in clinical language, it boils down its findings into a percentage grade of likely attractiveness. In fact, it returns two results: one score that predicts how men might respond to a picture, and the other that represents a female perspective. Using the service’s free demo and the same unglamorous photo, I quickly got my results. “Males generally think this person is more beautiful than 69.62% of persons” and “Females generally think this person is more beautiful than 73.877%”.

It was anticlimactic, but better than I had expected. A year into the pandemic, I can see the impact of stress, weight, and closed hair salons on my appearance. I retested the tool with two other photos of myself from Before, both of which I liked. My scores improved, nudging me near the top 25th percentile.

Beauty is often subjective and personal: our loved ones appear attractive to us when they are healthy and happy, and even when they are sad. Other times it’s a collective judgment: ranking systems like beauty pageants or magazine lists of the most beautiful people show how much we treat attractiveness like a prize. This assessment can also be ugly and uncomfortable: when I was a teenager, the boys in my high school would shout numbers from one to 10 at girls who walked past in the hallway. But there’s something eerie about a machine rating the beauty of somebody’s face—it’s just as unpleasant as shouts at school, but the mathematics of it feel disturbingly un-human.

My beauty score results from Face++

Under the hood

Although the concept of ranking people’s attractiveness is not new, the way these particular systems work is a relatively fresh development: Face++ released its beauty scoring feature in 2017.

When asked for detail on how the algorithm works, a spokesperson for Megvii would only say that it was “developed about three years ago in response to local market interest in entertainment-related apps.” The company’s website indicates that Chinese and Southeast Asian faces were used to train the system, which attracted 300,000 developers soon after it launched, but there is little other information.

A spokesperson for Megvii says that Face++ is an open-source platform and it cannot control the ways in which developers might use it, but the website suggests “cosmetic sales” and “matchmaking” as two potential applications.

The company’s known customers include the Chinese government’s surveillance system, which blankets the country with CCTV cameras, as well as Alibaba and Lenovo. Megvii recently filed for an IPO and is currently valued at $4 billion. According to reporting in the New York Times, it is one of three facial recognition companies that assisted the Chinese government in identifying citizens who might belong to the Uighur ethnic minority.

Qoves, meanwhile, was more forthcoming about how its face analysis works. The company, which is based in Australia, was founded as a photo retouching firm in 2019 but switched to a combination of AI-driven analysis and plastic surgery in 2020. Its system uses a common deep-learning technique known as a convolutional neural network, or CNN. The CNNs used to rate attractiveness typically train on a data set of hundreds of thousands of pictures that have already been manually scored for attractiveness by people. By looking at the pictures and the existing ratings, the system infers what factors people consider attractive so that it can make predictions when shown new images.
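The idea of learning from pictures that people have already scored can be sketched in a few lines. This is a toy illustration only, not Qoves's system: a small linear model stands in for the CNN, and the two "image features" and the human ratings are invented numbers. The names `train` and `predict` are hypothetical.

```python
# Toy illustration only: a linear model stands in for a CNN, and the
# features and ratings below are made-up numbers, not real data.

def predict(weights, bias, x):
    """Predicted score for one example: bias plus weighted sum of features."""
    return bias + sum(w * xi for w, xi in zip(weights, x))

def train(examples, ratings, steps=5000, lr=0.1):
    """Fit weights by gradient descent so predictions match human ratings."""
    n_features = len(examples[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(examples)
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(examples, ratings):
            error = predict(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

# Each row holds two hypothetical image features scaled to 0..1.
examples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
# Hypothetical human scores that happen to depend only on the first feature.
ratings = [3.0, 3.0, 7.0, 7.0]

weights, bias = train(examples, ratings)
new_score = predict(weights, bias, [1.0, 0.5])  # score for an unseen example
```

In this sketch the fitted model ends up weighting only the first feature, because that is all the invented raters responded to. Whatever preference is in the ratings is what the model learns to predict.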

Other big companies have invested in beauty AIs in recent years. They include the American cosmetics retailer Ulta Beauty, valued at $18 billion, which developed a skin analysis tool. Nvidia and Microsoft backed a “robot beauty pageant” in 2016, which challenged entrants to develop the best AI to determine attractiveness.

According to Evan Nisselson, a partner at LDV Capital, vision technology is still in its early stages, which creates “significant investment opportunities and upside.” LDV estimates that there will be 45 billion cameras in the world by next year, not including those used for manufacturing or logistics, and claims that visual data will be the key data input for AI systems in the near future. Nisselson says facial analysis is “a huge market” that will, over the course of time, involve “re-invention of the tech stack to get to the same or closer to or even better than a human’s eye.”

Qoves founder Shafee Hassan claims that beauty scoring might be even more widespread. He says that social media apps and platforms often use systems that scan people’s faces, score them for attractiveness, and give more attention to those who rank higher. “What we’re doing is doing something similar to Snapchat, Instagram, and TikTok,” he says, “but we’re making it more transparent.”

He adds: “They’re using the same neural network and they’re using the same techniques, but they’re not telling you that [they’ve] identified that your face has these nasolabial folds, it has a thin vermilion, it has all of these things, therefore [they’re] going to penalize you as being a less attractive individual.”

I reached out to a number of companies—including dating services and social media platforms—and asked whether beauty scoring is part of their recommendation algorithms. Instagram and Facebook have denied using such algorithms. TikTok and Snapchat declined to comment on the record.



“Big black boxes”

Recent advances in deep learning have dramatically changed the accuracy of beauty AIs. Before deep learning, facial analysis relied on feature engineering, where a scientific understanding of facial features would guide the AI. The formula for an attractive face, for example, might be set to reward wide eyes and a sharp jaw. “Imagine looking at a human face and seeing a Leonardo da Vinci–style depiction of all the proportions and the spacing between the eyes and that type of thing,” says Serge Belongie, a computer vision professor at Cornell University. With the advent of deep learning, “it became all about big data and big black boxes of neural net computation that just crunched on huge amounts of labeled data,” he says. “And at the end of the day, it works better than all the other stuff that we toiled on for decades.”
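For contrast with the learned approach, a feature-engineered formula of the kind described above might look something like the following. This is a sketch under invented assumptions: the function name `engineered_score`, the two measurements, the weights, and the reference values are all hypothetical and do not come from any real product.

```python
# Hypothetical hand-written rule: every measurement name, weight, and
# reference value here is invented for illustration.

def engineered_score(eye_width_ratio, jaw_angle_degrees):
    """Score a face from two hand-picked measurements, on a 0..10 scale."""
    # Reward wider eyes: ratio of eye width to face width, capped at 0.30.
    eye_term = min(max(eye_width_ratio, 0.0), 0.30) / 0.30
    # Reward a sharper jaw: smaller angles score higher between 90 and 140.
    clamped = max(90.0, min(jaw_angle_degrees, 140.0))
    jaw_term = (140.0 - clamped) / 50.0
    # Equal weighting of the two terms is an arbitrary design choice.
    return 10.0 * (0.5 * eye_term + 0.5 * jaw_term)
```

Every preference in such a formula is written down by a person and can be read off the code, which is the main difference from a trained network whose preferences are buried in its weights.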

But there’s a catch. “We’re still not totally sure how it works,” says Belongie. “Industry’s happy, but academia is a little puzzled.” Because beauty is highly subjective, the best a deep-learning beauty AI can do is to accurately regurgitate the preferences of the training data used to teach it. Even though some AI systems now rate attractiveness as accurately as the humans in a training set, that means the systems also display an equal amount of bias. And importantly, because the system is inscrutable, placing guardrails on the algorithm that might minimize the bias is a difficult and computationally costly task.

Belongie says there are applications of this sort of technology that are more anodyne and less problematic than scoring a face for attractiveness—a tool that can recommend the most beautiful photograph of a sunset on your phone, for example. But beauty scoring is different. “That, to me, is a very scary endeavor,” he says.

Even if training data and commercial uses are as unbiased and safe as possible, computer vision has technical limitations when it comes to human skin tones. The imaging chips found in cameras are preset to process a particular range of them. Historically “some skin tones were simply left off the table,” according to Belongie, “which means that the photos themselves may not have even been developed with certain skin tones in mind. Even the noblest of ambitions in terms of capturing all forms of human beauty may not have a chance because the brightness values aren’t even represented accurately.”

And these technical biases manifest as racism in commercial applications. In 2018, Lauren Rhue, an economist who is an assistant professor of information systems at the University of Maryland, College Park, was shopping for facial recognition tools that might aid her work studying digital platforms when she stumbled on this set of unusual products.

“I realized that there were scoring algorithms for beauty,” she says. “And I thought, that seems impossible. I mean, beauty is completely in the eye of the beholder. How can you train an algorithm to determine whether or not someone is beautiful?” Studying these algorithms soon became a new focus for her research.

Looking at how Face++ rated beauty, she found that the system consistently ranked darker-skinned women as less attractive than white women, and that faces with European-like features such as lighter hair and smaller noses scored higher than those with other features, regardless of how dark their skin was. The Eurocentric bias in the AI reflects the bias of the humans who scored the photos used to train the system, codifying and amplifying it—regardless of who is looking at the images. Chinese beauty standards, for example, prioritize lighter skin, wide eyes, and small noses.

A comparison of two photos of Beyonce Knowles from Lauren Rhue’s research using Face++. Its AI predicted the image on the left would rate at 74.776% for men and 77.914% for women. The image on the right, meanwhile, scored 87.468% for men and 91.14% for women in its model.

Beauty scores, she says, are part of a disturbing dynamic between an already unhealthy beauty culture and the recommendation algorithms we come across every day online. When scores are used to decide whose posts get surfaced on social media platforms, for example, it reinforces the definition of what is deemed attractive and takes attention away from those who do not fit the machine’s strict ideal. “We’re narrowing the types of pictures that are available to everybody,” says Rhue.

It’s a vicious cycle: with more eyes on the content featuring attractive people, those images are able to gather higher engagement, so they are shown to still more people. Eventually, even when a high beauty score is not a direct reason a post is shown to you, it is an indirect factor.

In a study published in 2019, Rhue looked at how two algorithms, one for beauty scores and one for age predictions, affected people’s opinions. Participants were shown images of people and asked to evaluate the beauty and age of the subjects. Some of the participants were shown the score generated by an AI before giving their answer, while others were not shown the AI score at all. She found that participants without knowledge of the AI’s rating did not exhibit additional bias; however, knowing how the AI ranked people’s attractiveness made people give scores closer to the algorithmically generated result. Rhue calls this the “anchoring effect.”

“Recommendation algorithms are actually changing what our preferences are,” she says. “And the challenge from a technology perspective, of course, is to not narrow them too much. When it comes to beauty, we are seeing much more of a narrowing than I would have expected.”

“I didn’t see any reason for not evaluating your flaws, because there are ways you can fix it.”

Shafee Hassan, Qoves Studio

At Qoves, Hassan says he has tried to tackle the issue of race head on. When conducting a detailed facial analysis report—the kind that clients pay for—his studio attempts to use data to categorize the face according to ethnicity so that everyone won’t simply be evaluated against a European ideal. “You can escape this Eurocentric bias just by becoming the best-looking version of yourself, the best-looking version of your ethnicity, the best-looking version of your race,” he says.

But Rhue says she worries about this kind of ethnic categorization being embedded deeper into our technological infrastructure. “The problem is, people are doing it, no matter how we look at it, and there’s no type of regulation or oversight,” she says. “If there is any type of strife, people will try to figure out who belongs in which category.”

“Let’s just say I’ve never seen a culturally sensitive beauty AI,” she says.

Recommendation systems don’t have to be designed to evaluate for attractiveness to end up doing it anyway. Last week, German broadcaster BR reported that one AI used to evaluate potential employees displayed biases based on appearance. And in March 2020, the parent company of TikTok, ByteDance, came under criticism for a memo that instructed content moderators to suppress videos that displayed “ugly facial looks,” people who were “chubby,” those with “a disformatted face” or “lack of front teeth,” “senior people with too many wrinkles,” and more. Twitter recently released an auto-cropping tool for photographs that appeared to prioritize white people. When tested on images of Barack Obama and Mitch McConnell, the auto-cropping AI consistently cropped out the former president.

“Who’s the fairest of them all?”

When I first spoke to Qoves founder Hassan by video call in January, he told me, “I’ve always believed that attractive people are a race of their own.”

When he started out in 2019, he says, his friends and family were very critical of his business venture. But Hassan believes he is helping people become the best possible version of themselves. He takes his inspiration from the 1997 movie Gattaca, which takes place in a “not-too-distant future” where genetic engineering is the default means of conception. Genetic discrimination segments society, and Ethan Hawke’s character, who was conceived naturally, has to steal the identity of a genetically perfected person in order to get around the system.

It’s usually considered a deeply dystopian film, but Hassan says it left an unexpected mark.

“It was very interesting to me, because the whole idea was that a person can determine their fate. The way they want to look is part of their fate,” he says. “With how far modern medicine has come, I didn’t see any reason for not evaluating your flaws, because there are ways you can fix it.”

His clients seem to agree. He claims that many of them are actors and actresses, and that the company receives anywhere from 50 to 100 orders for detailed medical reports each day—so many it is having trouble keeping up with demand. For Hassan, fighting the coming “classism” between those who are deemed beautiful and those society thinks are ugly is core to his mission. “What we’re trying to do is help the average person,” he told me.

There are other ways to “help the average person,” however. Every expert I spoke to said that disclosure and transparency from companies that use beauty scoring are paramount. Belongie believes that pressuring companies to reveal the workings of their recommendation algorithms will help keep users safe. “The company should own it and say yes, we are using facial beauty prediction and here’s the model. And here’s a representative gallery of faces that we think, based on your browsing behavior, you find attractive. And I think that the user should be aware of that and be able to interact with it.” He says that features like Facebook’s ad transparency tool are a good start, but “if the companies are not doing that, and they’re doing something like Face++ where they just casually assume we all agree on beauty … there may be power brokers who simply made that decision.”

Of course, the industry would first have to confess that it uses these scoring models, and the public would have to be aware of the issue. And though the past year has brought attention and criticism to facial recognition technology, several researchers I spoke with said that they were surprised by the lack of awareness about this use of it. Rhue says the most surprising thing about beauty scoring has been how few people are examining it as a topic. She is not persuaded that the technology should be developed at all.

As Hassan reviewed my own flaws with me, he assured me that a good moisturizer and some weight loss should do the trick. And though the aesthetics of my face won’t determine my career trajectory, he encouraged me to take my results seriously.

“Beauty,” he reminded me, “is a currency.”

Lyron Foster is a Hawaii based African American Musician, Author, Actor, Blogger, Filmmaker, Philanthropist and Multinational Serial Tech Entrepreneur.



Facebook faces ‘mass action’ lawsuit in Europe over 2019 breach



Facebook is to be sued in Europe over the major leak of user data that dates back to 2019 but which only came to light recently after information on 533M+ accounts was found posted for free download on a hacker forum.

Today Digital Rights Ireland (DRI) announced it’s commencing a “mass action” to sue Facebook, citing the right to monetary compensation for breaches of personal data that’s set out in the European Union’s General Data Protection Regulation (GDPR).

Article 82 of the GDPR provides for a ‘right to compensation and liability’ for those affected by violations of the law. Since the regulation came into force, in May 2018, related civil litigation has been on the rise in the region.

The Ireland-based digital rights group is urging Facebook users who live in the European Union or European Economic Area to check whether their data was breached, via the haveibeenpwned website (which lets you check by email address or mobile number), and sign up to join the case if so.

Information leaked via the breach includes Facebook IDs, locations, mobile phone numbers, email addresses, relationship statuses and employers.

Facebook has been contacted for comment on the litigation.

The tech giant’s European headquarters is located in Ireland — and earlier this week the national data watchdog opened an investigation, under EU and Irish data protection laws.

A mechanism in the GDPR for simplifying investigation of cross-border cases means Ireland’s Data Protection Commission (DPC) is Facebook’s lead data regulator in the EU. However it has been criticized over its handling of and approach to GDPR complaints and investigations — including the length of time it’s taking to issue decisions on major cross-border cases. And this is particularly true for Facebook.

With the three-year anniversary of the GDPR fast approaching, the DPC has multiple open investigations into various aspects of Facebook’s business but has yet to issue a single decision against the company.

(The closest it’s come is a preliminary suspension order issued last year, in relation to Facebook’s EU to US data transfers. However that complaint long predates GDPR; and Facebook immediately filed to block the order via the courts. A resolution is expected later this year after the litigant filed his own judicial review of the DPC’s processes.)

Since May 2018 the EU’s data protection regime has — at least on paper — baked in fines of up to 4% of a company’s global annual turnover for the most serious violations.

Again, though, the sole GDPR fine issued to date by the DPC against a tech giant (Twitter) is very far off that theoretical maximum. Last December the regulator announced a €450k (~$547k) sanction against Twitter — which works out to around just 0.1% of the company’s full-year revenue.

That penalty was also for a data breach — but one which, unlike the Facebook leak, had been publicly disclosed when Twitter found it in 2019. So Facebook’s failure to disclose the vulnerability it discovered and claims it fixed by September 2019, which led to the leak of 533M accounts now, suggests it should face a higher sanction from the DPC than Twitter received.

However, even if Facebook ends up with a more substantial GDPR penalty for this breach, the watchdog’s caseload backlog and plodding procedural pace make it hard to envisage a swift resolution to an investigation that’s only a few days old.

Judging by past performance it’ll be years before the DPC decides on this 2019 Facebook leak — which likely explains why the DRI sees value in instigating class-action style litigation in parallel to the regulatory investigation.

“Compensation is not the only thing that makes this mass action worth joining. It is important to send a message to large data controllers that they must comply with the law and that there is a cost to them if they do not,” DRI writes on its website.

It also submitted a complaint about the Facebook breach to the DPC earlier this month, writing then that it was “also consulting with its legal advisors on other options including a mass action for damages in the Irish Courts”.

It’s clear that the GDPR enforcement gap is creating a growing opportunity for litigation funders to step in in Europe and take a punt on suing for data-related compensation damages — with a number of other mass actions announced last year.

In the case of DRI its focus is evidently on seeking to ensure that digital rights are upheld. But it told RTE that it believes compensation claims which force tech giants to pay money to users whose privacy rights have been violated are the best way to make them legally compliant.

Facebook, meanwhile, has sought to play down the breach it failed to disclose in 2019 — claiming it’s ‘old data’ — a deflection that ignores the fact that people’s dates of birth don’t change (nor do most people routinely change their mobile number or email address).

Plenty of the ‘old’ data exposed in this latest massive Facebook leak will be very handy for spammers and fraudsters to target Facebook users — and also now for litigators to target Facebook for data-related damages.



Geoffrey Hinton has a hunch about what’s next for AI



Back in November, the computer scientist and cognitive psychologist Geoffrey Hinton had a hunch. After a half-century’s worth of attempts—some wildly successful—he’d arrived at another promising insight into how the brain works and how to replicate its circuitry in a computer.

“It’s my current best bet about how things fit together,” Hinton says from his home office in Toronto, where he’s been sequestered during the pandemic. If his bet pays off, it might spark the next generation of artificial neural networks—mathematical computing systems, loosely inspired by the brain’s neurons and synapses, that are at the core of today’s artificial intelligence. His “honest motivation,” as he puts it, is curiosity. But the practical motivation—and, ideally, the consequence—is more reliable and more trustworthy AI.

A Google engineering fellow and cofounder of the Vector Institute for Artificial Intelligence, Hinton wrote up his hunch in fits and starts, and at the end of February announced via Twitter that he’d posted a 44-page paper on the arXiv preprint server. He began with a disclaimer: “This paper does not describe a working system,” he wrote. Rather, it presents an “imaginary system.” He named it “GLOM.” The term derives from “agglomerate” and the expression “glom together.”

Hinton thinks of GLOM as a way to model human perception in a machine—it offers a new way to process and represent visual information in a neural network. On a technical level, the guts of it involve a glomming together of similar vectors. Vectors are fundamental to neural networks—a vector is an array of numbers that encodes information. The simplest example is the xyz coordinates of a point—three numbers that indicate where the point is in three-dimensional space. A six-dimensional vector contains three more pieces of information—maybe the red-green-blue values for the point’s color. In a neural net, vectors in hundreds or thousands of dimensions represent entire images or words. And dealing in yet higher dimensions, Hinton believes that what goes on in our brains involves “big vectors of neural activity.”
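The point-and-color example can be written out directly. The sketch below uses plain Python lists as vectors; the specific coordinate and color values are made up for illustration.

```python
# A point in three-dimensional space: x, y, z coordinates.
point = [1.0, 2.0, 3.0]

# Red-green-blue values (0..255) for that point's color; values are made up.
color = [255.0, 128.0, 0.0]

# Concatenating the two gives a six-dimensional vector: position plus color.
point_with_color = point + color

print(len(point))             # 3
print(len(point_with_color))  # 6
```

The vectors inside a neural net are the same kind of object, just with hundreds or thousands of entries instead of three or six.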

By way of analogy, Hinton likens his glomming together of similar vectors to the dynamic of an echo chamber—the amplification of similar beliefs. “An echo chamber is a complete disaster for politics and society, but for neural nets it’s a great thing,” Hinton says. The notion of echo chambers mapped onto neural networks he calls “islands of identical vectors,” or more colloquially, “islands of agreement”—when vectors agree about the nature of their information, they point in the same direction.
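One way to picture vectors that “point in the same direction” is cosine similarity, which is 1.0 when two vectors are parallel and 0.0 when they are perpendicular. The sketch below is not the mechanism from Hinton's paper. It only groups a row of made-up two-dimensional vectors into runs of neighbors that agree, and the function names and the 0.99 threshold are assumptions chosen for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def islands(vectors, threshold=0.99):
    """Group neighboring vectors in a row into runs that point the same way.

    Returns a list of index lists, one per run. Assumes nonzero vectors.
    """
    if not vectors:
        return []
    groups = [[0]]
    for i in range(1, len(vectors)):
        if cosine_similarity(vectors[i - 1], vectors[i]) >= threshold:
            groups[-1].append(i)   # agrees with its left neighbor
        else:
            groups.append([i])     # direction changed: start a new island
    return groups

# Made-up row of vectors: two point along x, three point along y.
row = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0], [0.0, 0.5]]
print(islands(row))  # [[0, 1], [2, 3, 4]]
```

Note that length does not matter here, only direction: `[1.0, 0.0]` and `[2.0, 0.0]` land in the same island.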

“If neural nets were more like people, at least they can go wrong the same ways as people do, and so we’ll get some insight into what might confuse them.”

Geoffrey Hinton

In spirit, GLOM also gets at the elusive goal of modelling intuition—Hinton thinks of intuition as crucial to perception. He defines intuition as our ability to effortlessly make analogies. From childhood through the course of our lives, we make sense of the world by using analogical reasoning, mapping similarities from one object or idea or concept to another—or, as Hinton puts it, one big vector to another. “Similarities of big vectors explain how neural networks do intuitive analogical reasoning,” he says. More broadly, intuition captures that ineffable way a human brain generates insight. Hinton himself works very intuitively—scientifically, he is guided by intuition and the tool of analogy making. And his theory of how the brain works is all about intuition. “I’m very consistent,” he says.

Hinton hopes GLOM might be one of several breakthroughs that he reckons are needed before AI is capable of truly nimble problem solving—the kind of human-like thinking that would allow a system to make sense of things never before encountered; to draw upon similarities from past experiences, play around with ideas, generalize, extrapolate, understand. “If neural nets were more like people,” he says, “at least they can go wrong the same ways as people do, and so we’ll get some insight into what might confuse them.”

For the time being, however, GLOM itself is only an intuition: it’s “vaporware,” says Hinton. And he acknowledges that, as an acronym, it nicely matches “Geoff’s Last Original Model.” It is, at the very least, his latest.

Outside the box

Hinton’s devotion to artificial neural networks (a mid-20th-century invention) dates to the early 1970s. By 1986 he’d made considerable progress: whereas initially nets comprised only a couple of neuron layers, input and output, Hinton and collaborators came up with a technique for a deeper, multilayered network. But it took 26 years before computing power and data capacity caught up and capitalized on the deep architecture.

In 2012, Hinton gained fame and wealth from a deep learning breakthrough. With two students, he implemented a multilayered neural network that was trained to recognize objects in massive image data sets. The neural net learned to iteratively improve at classifying and identifying various objects—for instance, a mite, a mushroom, a motor scooter, a Madagascar cat. And it performed with unexpectedly spectacular accuracy.

Deep learning set off the latest AI revolution, transforming computer vision and the field as a whole. Hinton believes deep learning should be almost all that’s needed to fully replicate human intelligence.

But despite rapid progress, there are still major challenges. Expose a neural net to an unfamiliar data set or a foreign environment, and it reveals itself to be brittle and inflexible. Self-driving cars and essay-writing language generators impress, but things can go awry. AI visual systems can be easily confused: a coffee mug recognized from the side would be an unknown from above if the system had not been trained on that view; and with the manipulation of a few pixels, a panda can be mistaken for an ostrich, or even a school bus.

GLOM addresses two of the most difficult problems for visual perception systems: understanding a whole scene in terms of objects and their natural parts; and recognizing objects when seen from a new viewpoint. (GLOM’s focus is on vision, but Hinton expects the idea could be applied to language as well.)

An object such as Hinton’s face, for instance, is made up of his lively if dog-tired eyes (too many people asking questions; too little sleep), his mouth and ears, and a prominent nose, all topped by a not-too-untidy tousle of mostly gray. And given his nose, he is easily recognized even on first sight in profile view.

Both of these factors—the part-whole relationship and the viewpoint—are, from Hinton’s perspective, crucial to how humans do vision. “If GLOM ever works,” he says, “it’s going to do perception in a way that’s much more human-like than current neural nets.”

Grouping parts into wholes, however, can be a hard problem for computers, since parts are sometimes ambiguous. A circle could be an eye, or a doughnut, or a wheel. As Hinton explains it, the first generation of AI vision systems tried to recognize objects by relying mostly on the geometry of the part-whole relationship: the spatial orientation among the parts and between the parts and the whole. The second generation instead relied mostly on deep learning, letting the neural net train on large amounts of data. With GLOM, Hinton combines the best aspects of both approaches.

“There’s a certain intellectual humility that I like about it,” says Gary Marcus, founder and CEO of Robust.AI and a well-known critic of the heavy reliance on deep learning. Marcus admires Hinton’s willingness to challenge something that brought him fame, to admit it’s not quite working. “It’s brave,” he says. “And it’s a great corrective to say, ‘I’m trying to think outside the box.’”

The GLOM architecture

In crafting GLOM, Hinton tried to model some of the mental shortcuts—intuitive strategies, or heuristics—that people use in making sense of the world. “GLOM, and indeed much of Geoff’s work, is about looking at heuristics that people seem to have, building neural nets that could themselves have those heuristics, and then showing that the nets do better at vision as a result,” says Nick Frosst, a computer scientist at a language startup in Toronto who worked with Hinton at Google Brain.

With visual perception, one strategy is to parse parts of an object—such as different facial features—and thereby understand the whole. If you see a certain nose, you might recognize it as part of Hinton’s face; it’s a part-whole hierarchy. To build a better vision system, Hinton says, “I have a strong intuition that we need to use part-whole hierarchies.” Human brains understand this part-whole composition by creating what’s called a “parse tree”—a branching diagram demonstrating the hierarchical relationship between the whole, its parts and subparts. The face itself is at the top of the tree, and the component eyes, nose, ears, and mouth form the branches below.
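To make the idea concrete, the face example can be written down as a small tree data structure. What follows is a minimal Python sketch, not anything taken from Hinton’s work: the `Node` class and the `leaves` helper are hypothetical names chosen for illustration, and splitting “eyes” into a left and a right eye is an assumed subpart level.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a part-whole parse tree (illustrative only)."""
    label: str
    parts: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Return the labels of the lowest-level parts beneath `node`."""
    if not node.parts:
        return [node.label]
    found = []
    for part in node.parts:
        found.extend(leaves(part))
    return found


# The whole (the face) sits at the top; its parts branch out below it.
face = Node("face", [
    Node("eyes", [Node("left eye"), Node("right eye")]),  # assumed subparts
    Node("nose"),
    Node("ears"),
    Node("mouth"),
])

print(leaves(face))
```

A different image would call for a different tree, which is exactly the difficulty Frosst describes: the tree changes from image to image, while a neural net’s architecture does not.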

One of Hinton’s main goals with GLOM is to replicate the parse tree in a neural net, which would distinguish it from the neural nets that came before. For technical reasons, it’s hard to do. “It’s difficult because each individual image would be parsed by a person into a unique parse tree, so we would want a neural net to do the same,” says Frosst. “It’s hard to get something with a static architecture (a neural net) to take on a new structure (a parse tree) for each new image it sees.” Hinton has made various attempts. GLOM is a major revision of his previous attempt in 2017, combined with other related advances in the field.


A generalized way of thinking about the GLOM architecture is as follows: The image of interest (say, a photograph of Hinton’s face) is divided into a grid. Each region of the grid is a “location” on the image—one location might contain the iris of an eye, while another might contain the tip of his nose. For each location in the net there are about five layers, or levels. And level by level, the system makes a prediction, with a vector representing the content or information. At a level near the bottom, the vector representing the tip-of-the-nose location might predict: “I’m part of a nose!” And at the next level up, in building a more coherent representation of what it’s seeing, the vector might predict: “I’m part of a face at side-angle view!”
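The layout described above (a grid of locations, each carrying a small stack of level vectors) can be sketched as a plain data structure. This is only an illustration under stated assumptions: the 3-by-3 grid, the vector size of 4, the random starting values, and the name `make_columns` are all invented for the example; only the figure of about five levels comes from the description above.

```python
import random

LEVELS = 5       # "about five" levels per location
VECTOR_SIZE = 4  # assumed; kept tiny so the example stays readable


def make_columns(rows, cols, seed=0):
    """Map each grid location to a stack of LEVELS vectors (random placeholders)."""
    rng = random.Random(seed)
    return {
        (r, c): [
            [rng.uniform(-1.0, 1.0) for _ in range(VECTOR_SIZE)]
            for _ in range(LEVELS)
        ]
        for r in range(rows)
        for c in range(cols)
    }


columns = make_columns(rows=3, cols=3)
nose_tip = columns[(1, 1)]  # the stack of level vectors at one location
print(len(columns), len(nose_tip), len(nose_tip[0]))
```

In this picture, the lowest vector in a stack would carry the “I’m part of a nose!” prediction and a higher one the “I’m part of a face” prediction.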

But then the question is, do neighboring vectors at the same level agree? When in agreement, vectors point in the same direction, toward the same conclusion: “Yes, we both belong to the same nose.” Or further up the parse tree. “Yes, we both belong to the same face.”
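One common way to check whether two vectors “point in the same direction” is cosine similarity, which is 1 for identical directions and falls toward 0 (or below) as they diverge. The sketch below assumes that measure and an arbitrary agreement threshold of 0.9; neither is stated in the description above, and the example vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def agree(a, b, threshold=0.9):
    """Treat two vectors as agreeing if they are nearly parallel (assumed rule)."""
    return cosine_similarity(a, b) >= threshold


# Made-up two-dimensional vectors for three locations.
nose_tip = [1.0, 0.1]
nostril = [0.9, 0.2]
doughnut = [-0.2, 1.0]

print(agree(nose_tip, nostril))   # nearly parallel vectors
print(agree(nose_tip, doughnut))  # vectors pointing in different directions
```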

Seeking consensus about the nature of an object (about what precisely the object is, ultimately), GLOM’s vectors iteratively, location by location and layer upon layer, average with neighboring vectors beside them, as well as with predicted vectors from levels above and below.

However, the net doesn’t “willy-nilly average” with just anything nearby, says Hinton. It averages selectively, with neighboring predictions that display similarities. “This is kind of well-known in America, this is called an echo chamber,” he says. “What you do is you only accept opinions from people who already agree with you; and then what happens is that you get an echo chamber where a whole bunch of people have exactly the same opinion. GLOM actually uses that in a constructive way.” The analogous phenomenon in Hinton’s system is those “islands of agreement.”
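The echo-chamber effect can be imitated with a toy update rule: each vector is replaced by a weighted average of all the vectors, with similar vectors weighted far more heavily than dissimilar ones. This is a rough sketch, not Hinton’s implementation. The exponential weighting, the `sharpness` value of 8, and the choice to compare every location with every other (rather than only nearby ones) are assumptions made to keep the example short.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale v to unit length."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def selective_average(vectors, sharpness=8.0):
    """One round of similarity-weighted averaging over unit vectors."""
    updated = []
    for v in vectors:
        # Similar vectors get exponentially larger weights (assumed rule).
        weights = [math.exp(sharpness * dot(v, u)) for u in vectors]
        total = sum(weights)
        mixed = [
            sum(w * u[k] for w, u in zip(weights, vectors)) / total
            for k in range(len(v))
        ]
        updated.append(normalize(mixed))
    return updated


# Two vectors roughly agree on one idea, two on another.
vectors = [normalize(v) for v in
           [[1.0, 0.2], [1.0, -0.1], [0.1, 1.0], [-0.1, 1.0]]]
for _ in range(10):
    vectors = selective_average(vectors)

print(dot(vectors[0], vectors[1]))  # similarity within the first group
print(dot(vectors[0], vectors[2]))  # similarity across the two groups
```

After a few rounds the vectors within each group become almost identical, while the two groups stay far apart.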

“Imagine a bunch of people in a room, shouting slight variations of the same idea,” says Frosst—or imagine those people as vectors pointing in slight variations of the same direction. “They would, after a while, converge on the one idea, and they would all feel it stronger, because they had it confirmed by the other people around them.” That’s how GLOM’s vectors reinforce and amplify their collective predictions about an image.

GLOM uses these islands of agreeing vectors to accomplish the trick of representing a parse tree in a neural net. Whereas some recent neural nets use agreement among vectors for activation, GLOM uses agreement for representation—building up representations of things within the net. For instance, when several vectors agree that they all represent part of the nose, their small cluster of agreement collectively represents the nose in the net’s parse tree for the face. Another smallish cluster of agreeing vectors might represent the mouth in the parse tree; and the big cluster at the top of the tree would represent the emergent conclusion that the image as a whole is Hinton’s face. “The way the parse tree is represented here,” Hinton explains, “is that at the object level you have a big island; the parts of the object are smaller islands; the subparts are even smaller islands, and so on.”
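The island picture can be illustrated with a toy example: scan a row of locations and group adjacent ones whose vectors agree. This is a simplified sketch on a one-dimensional row. The function name `find_islands`, the cosine threshold of 0.99, and the hand-picked `NOSE`, `MOUTH`, and `FACE` vectors are all assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine of the angle between vectors a and b."""
    d = sum(x * y for x, y in zip(a, b))
    return d / (math.sqrt(sum(x * x for x in a)) *
                math.sqrt(sum(y * y for y in b)))


def find_islands(vectors, threshold=0.99):
    """Group adjacent locations whose vectors (nearly) agree."""
    islands = []
    for index, vector in enumerate(vectors):
        previous = vectors[islands[-1][-1]] if islands else None
        if previous is not None and cosine(previous, vector) >= threshold:
            islands[-1].append(index)  # extend the current island
        else:
            islands.append([index])    # start a new island
    return islands


# Hand-picked vectors for five locations along a row of the image.
NOSE, MOUTH, FACE = [1.0, 0.0], [0.0, 1.0], [0.6, 0.8]
part_level = [NOSE, NOSE, MOUTH, MOUTH, MOUTH]
object_level = [FACE] * 5

print(find_islands(part_level))    # two smaller islands: nose, mouth
print(find_islands(object_level))  # one big island: the face
```

Read together, the two levels give a small parse tree: one face-sized island at the object level sitting above a nose island and a mouth island at the part level.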

Figure 2 from Hinton’s GLOM paper. The islands of identical vectors (arrows of the same color) at the various levels represent a parse tree.

According to Hinton’s long-time friend and collaborator Yoshua Bengio, a computer scientist at the University of Montreal, if GLOM manages to solve the engineering challenge of representing a parse tree in a neural net, it would be a feat—it would be important for making neural nets work properly. “Geoff has produced amazingly powerful intuitions many times in his career, many of which have proven right,” Bengio says. “Hence, I pay attention to them, especially when he feels as strongly about them as he does about GLOM.”

The strength of Hinton’s conviction is rooted not only in the echo chamber analogy, but also in mathematical and biological analogies that inspired and justified some of the design decisions in GLOM’s novel engineering.

“Geoff is a highly unusual thinker in that he is able to draw upon complex mathematical concepts and integrate them with biological constraints to develop theories,” says Sue Becker, a former student of Hinton’s, now a computational cognitive neuroscientist at McMaster University. “Researchers who are more narrowly focused on either the mathematical theory or the neurobiology are much less likely to solve the infinitely compelling puzzle of how both machines and humans might learn and think.”

Turning philosophy into engineering

So far, Hinton’s new idea has been well received, especially in some of the world’s greatest echo chambers. “On Twitter, I got a lot of likes,” he says. And a YouTube tutorial laid claim to the term “MeGLOMania.”

Hinton is the first to admit that at present GLOM is little more than philosophical musing (he spent a year as a philosophy undergrad before switching to experimental psychology). “If an idea sounds good in philosophy, it is good,” he says. “How would you ever have a philosophical idea that just sounds like rubbish, but actually turns out to be true? That wouldn’t pass as a philosophical idea.” Science, by comparison, is “full of things that sound like complete rubbish” but turn out to work remarkably well—for example, neural nets, he says.

GLOM is designed to sound philosophically plausible. But will it work?

Chris Williams, a professor of machine learning in the School of Informatics at the University of Edinburgh, expects that GLOM might well spawn great innovations. However, he says, “the thing that distinguishes AI from philosophy is that we can use computers to test such theories.” It’s possible that a flaw in the idea might be exposed—perhaps also repaired—by such experiments, he says. “At the moment I don’t think we have enough evidence to assess the real significance of the idea, although I believe it has a lot of promise.”

The GLOM test model inputs are ten ellipses that form a sheep or a face.

Some of Hinton’s colleagues at Google Research in Toronto are in the very early stages of investigating GLOM experimentally. Laura Culp, a software engineer who implements novel neural net architectures, is using a computer simulation to test whether GLOM can produce Hinton’s islands of agreement in understanding parts and wholes of an object, even when the input parts are ambiguous. In the experiments, the parts are 10 ellipses, ovals of varying sizes, that can be arranged to form either a face or a sheep.

With random inputs of one ellipse or another, the model should be able to make predictions, Culp says, and “deal with the uncertainty of whether or not the ellipse is part of a face or a sheep, and whether it is the leg of a sheep, or the head of a sheep.” Confronted with any perturbations, the model should be able to correct itself as well. A next step is establishing a baseline, indicating whether a standard deep-learning neural net would get befuddled by such a task. As yet, GLOM is highly supervised—Culp creates and labels the data, prompting and pressuring the model to find correct predictions and succeed over time. (The unsupervised version is named GLUM—“It’s a joke,” Hinton says.)

At this preliminary stage, it’s too soon to draw any big conclusions. Culp is waiting for more numbers. Hinton is already impressed nonetheless. “A simple version of GLOM can look at 10 ellipses and see a face and a sheep based on the spatial relationships between the ellipses,” he says. “This is tricky, because an individual ellipse conveys nothing about which type of object it belongs to or which part of that object it is.”

And overall, Hinton is happy with the feedback. “I just wanted to put it out there for the community, so anybody who likes can try it out,” he says. “Or try some sub-combination of these ideas. And then that will turn philosophy into science.”


Pakistan temporarily blocks social media



Pakistan has temporarily blocked several social media services in the South Asian nation, according to users and a government-issued notice reviewed by TechCrunch.

In an order titled “Complete Blocking of Social Media Platforms,” the Pakistani government ordered the Pakistan Telecommunication Authority to block social media platforms including Twitter, Facebook, WhatsApp, YouTube, and Telegram from 11am to 3pm local time (6am to 10am GMT) Friday.

The move comes as Pakistan looks to crack down on a violent terrorist group and prevent troublemakers from disrupting Friday prayer congregations following days of violent protests.

Earlier this week Pakistan banned the Islamist group Tehrik-i-Labaik Pakistan after arresting its leader, which prompted protests, according to local media reports.

An entrepreneur based in Pakistan told TechCrunch that even though the order is supposed to expire at 3pm local time, similar past moves by the government suggest that the disruption will likely last longer.

Though Pakistan, like its neighbor India, has temporarily cut phone call access in the nation in the past, this is the first time Islamabad has issued a blanket ban on social media in the country.

Pakistan has explored ways to assume more control over content on digital services operating in the country in recent years. Some activists said the country was taking extreme measures without much explanation.
