
Uncategorized

This avocado armchair could be the future of AI


With GPT-3, OpenAI showed that a single deep-learning model could be trained to use language in a variety of ways simply by feeding it vast amounts of text. It then showed that by swapping text for pixels, the same approach could be used to train an AI to complete half-finished images. GPT-3 mimics how humans use words; Image GPT predicts what we see.

Now OpenAI has put these ideas together and built two new models, called DALL·E and CLIP, that combine language and images in a way that will make AIs better at understanding both words and what they refer to.

“We live in a visual world,” says Ilya Sutskever, chief scientist at OpenAI. “In the long run, you’re going to have models which understand both text and images. AI will be able to understand language better because it can see what words and sentences mean.”

For all GPT-3’s flair, its output can feel untethered from reality, as if it doesn’t know what it’s talking about. That’s because it doesn’t. By grounding text in images, researchers at OpenAI and elsewhere are trying to give language models a better grasp of the everyday concepts that humans use to make sense of things.

DALL·E and CLIP come at this problem from different directions. At first glance, CLIP (Contrastive Language-Image Pre-training) is yet another image recognition system. Except that it has learned to recognize images not from labeled examples in curated data sets, as most existing models do, but from images and their captions taken from the internet. It learns what’s in an image from a description rather than a one-word label such as “cat” or “banana.”

CLIP is trained by getting it to predict which caption from a random selection of 32,768 is the correct one for a given image. To work this out, CLIP learns to link a wide variety of objects with their names and the words that describe them. This then lets it identify objects in images outside its training set. Most image recognition systems are trained to identify certain types of object, such as faces in surveillance videos or buildings in satellite images. Like GPT-3, CLIP can generalize across tasks without additional training. It is also less likely than other state-of-the-art image recognition models to be led astray by adversarial examples, which have been subtly altered in ways that typically confuse algorithms even though humans might not notice a difference.
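The caption-matching task described above can be pictured with a toy example. What follows is a minimal sketch, not OpenAI's implementation: it assumes the image and each candidate caption have already been turned into vectors, scores each pair by cosine similarity, and converts the scores into probabilities with a softmax. The vectors, the `temperature` value and the function names are all hypothetical, and three captions stand in for the 32,768 used in training.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def caption_probabilities(image_vec, caption_vecs, temperature=0.1):
    """Softmax over image-caption similarities (temperature is an assumed knob)."""
    scores = [cosine(image_vec, c) / temperature for c in caption_vecs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def caption_loss(image_vec, caption_vecs, correct_index, temperature=0.1):
    """Cross-entropy loss: small when the correct caption gets high probability."""
    probs = caption_probabilities(image_vec, caption_vecs, temperature)
    return -math.log(probs[correct_index])


# Hypothetical embeddings: one image and three candidate captions.
image_vec = [0.9, 0.1, 0.0]
caption_vecs = [
    [1.0, 0.0, 0.0],  # the matching caption
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

probs = caption_probabilities(image_vec, caption_vecs)
best = max(range(len(probs)), key=probs.__getitem__)
print(best, [round(p, 4) for p in probs])
```

Training would adjust the model that produces the vectors so that `caption_loss` shrinks for the correct caption. Because the captions are free text rather than fixed labels, the same scoring can later be applied to descriptions the model was never explicitly trained to classify.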

Instead of recognizing images, DALL·E (which I’m guessing is a WALL·E/Dali pun) draws them. This model is a smaller version of GPT-3 that has also been trained on text-image pairs taken from the internet. Given a short natural-language caption, such as “a painting of a capybara sitting in a field at sunrise” or “a cross-section view of a walnut,” DALL·E generates lots of images that match it: dozens of capybaras of all shapes and sizes in front of orange and yellow backgrounds; row after row of walnuts (though not all of them in cross-section). 

Get surreal

The results are striking, though still a mixed bag. The caption “a stained glass window with an image of a blue strawberry” produces many correct results but also some that have blue windows and red strawberries. Others contain nothing that looks like a window or a strawberry. The results showcased by the OpenAI team in a blog post have not been cherry-picked by hand but ranked by CLIP, which has selected the 32 DALL·E images for each caption that it thinks best match the description.   
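The ranking step can be sketched in a few lines. This is an illustration under stated assumptions rather than OpenAI's code: each generated image is assumed to come with a numeric match score for the caption (standing in for CLIP's judgment), and the image names, scores and the `rerank` helper are hypothetical.

```python
def rerank(candidates, score_fn, keep=32):
    """Return the `keep` highest-scoring candidates, best first."""
    return sorted(candidates, key=score_fn, reverse=True)[:keep]


# Hypothetical (image name, match score) pairs for a single caption.
candidates = [
    ("img_a", 0.21),
    ("img_b", 0.87),
    ("img_c", 0.55),
    ("img_d", 0.93),
]

# Keep the two best here; the blog post's galleries keep 32 per caption.
top_two = rerank(candidates, score_fn=lambda pair: pair[1], keep=2)
print([name for name, _ in top_two])
```

The practical point is that the showcased images are a filtered subset: weaker generations still exist but fall below the cutoff.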

“Text-to-image is a research challenge that has been around a while,” says Mark Riedl, who works on NLP and computational creativity at the Georgia Institute of Technology in Atlanta. “But this is an impressive set of examples.”

Images drawn by DALL·E for the caption “A baby daikon radish in a tutu walking a dog”

To test DALL·E’s ability to work with novel concepts, the researchers gave it captions that described objects they thought it would not have seen before, such as “an avocado armchair” and “an illustration of a baby daikon radish in a tutu walking a dog.” In both these cases, the AI generated images that combined these concepts in plausible ways.

The armchairs in particular all look like chairs and avocados. “The thing that surprised me the most is that the model can take two unrelated concepts and put them together in a way that results in something kind of functional,” says Aditya Ramesh, who worked on DALL·E. This is probably because a halved avocado looks a little like a high-backed armchair, with the pit as a cushion. For other captions, such as “a snail made of harp,” the results are less good, with images that combine snails and harps in odd ways.

DALL·E is the kind of system that Riedl imagined submitting to the Lovelace 2.0 test, a thought experiment that he came up with in 2014. The test is meant to replace the Turing test as a benchmark for measuring artificial intelligence. It assumes that one mark of intelligence is the ability to blend concepts in creative ways. Riedl suggests that asking a computer to draw a picture of a man holding a penguin is a better test of smarts than asking a chatbot to dupe a human in conversation, because it is more open-ended and less easy to cheat.   

“The real test is seeing how far the AI can be pushed outside its comfort zone,” says Riedl. 

Images drawn by DALL·E for the caption “snail made of harp”

“The ability of the model to generate synthetic images out of rather whimsical text seems very interesting to me,” says Ani Kembhavi at the Allen Institute for Artificial Intelligence (AI2), who has also developed a system that generates images from text. “The results seem to obey the desired semantics, which I think is pretty impressive.” Jaemin Cho, a colleague of Kembhavi’s, is also impressed: “Existing text-to-image generators have not shown this level of control drawing multiple objects or the spatial reasoning abilities of DALL·E,” he says.

Yet DALL·E already shows signs of strain. Including too many objects in a caption stretches its ability to keep track of what to draw. And rephrasing a caption with words that mean the same thing sometimes yields different results. There are also signs that DALL·E is mimicking images it has encountered online rather than generating novel ones.

“I am a little bit suspicious of the daikon example, which stylistically suggests it may have memorized some art from the internet,” says Riedl. He notes that a quick search brings up a lot of cartoon images of anthropomorphized daikons. “GPT-3, which DALL·E is based on, is notorious for memorizing,” he says.

Still, most AI researchers agree that grounding language in visual understanding is a good way to make AIs smarter.  

“The future is going to consist of systems like this,” says Sutskever. “And both of these models are a step toward that system.”

Lyron Foster is a Hawaii-based African American Musician, Author, Actor, Blogger, Filmmaker, Philanthropist and Multinational Serial Tech Entrepreneur.


AT&T may keep majority ownership of DirecTV as it closes in on final deal


A DirecTV satellite dish seen outside a bar in Portland, Oregon, in October 2019. (credit: Getty Images | hapabapa)

AT&T is reportedly closing in on a deal to sell a stake in DirecTV to TPG, a private-equity firm.

Unfortunately for customers hoping that AT&T will relinquish control of DirecTV, a Reuters report on Friday said the pending deal would give TPG a “minority stake” in AT&T’s satellite-TV subsidiary. On the other hand, a private-equity firm looking to wring value out of a declining business wouldn’t necessarily be better for DirecTV customers than AT&T is.

It’s also possible that AT&T could cede operational control of DirecTV even if it remains the majority owner. CNBC in November reported on one proposed deal in which “AT&T would retain majority economic ownership of the [DirecTV and U-verse TV] businesses, and would maintain ownership of U-verse infrastructure, including plants and fiber,” while the buyer of a DirecTV stake “would control the pay-TV distribution operations and consolidate the business on its books.”


Fintechs could see $100 billion of liquidity in 2021


Three years ago, we released the first edition of the Matrix Fintech Index. We believed then, as we do now, that fintech represents one of the most exciting major innovation cycles of this decade. In 2020, all the long-term trends forcing change in this sector continued and even accelerated.

The broad movement away from credit toward debit, particularly among younger consumers, represents one such macro shift. However, the pandemic also created new, unforeseen drivers. Among them, millennials decamped from their rentals in crowded cities to accelerate their first home purchase, to the benefit of proptech companies and challenger mortgage players alike.

E-commerce saw an enormous acceleration in growth rates, furthering adoption of online payments platforms. Lastly, low interest rates and looming inflation helped pave the way for the price of Bitcoin to charge toward $30,000. In short, multiple tailwinds combined to produce a blockbuster year for the category.

In this year’s refresh of the Matrix Fintech Index, we’ll divide our attention into three parts. First, a look at the public stocks’ performance. Second, liquidity. Third, we highlight one major trend in the sector: Buy Now Pay Later, or BNPL.

Public fintech stocks rose 97% in 2020

For the fourth straight year, the publicly traded fintechs massively outperformed the incumbent financial services providers as well as every mainstream stock index. While the underlying performance of these companies was strong, the pandemic further bolstered results as consumers avoided appearing in-person for both shopping and banking. Instead, they sought — and found — digital alternatives.


Our own representation of the public fintechs’ performance is the Matrix Fintech Index, a market cap-weighted index that tracks the progress of a portfolio of 25 leading public fintech companies. The Matrix Fintech Index rose 97% in 2020, compared to a 14% rise in the S&P 500 and a 10% drop for the incumbent financial service companies over the same time period.
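To see what market cap weighting means in practice, here is a minimal sketch using made-up numbers. It assumes constant share counts, no dividends and no rebalancing during the period, and it does not reflect the index's actual methodology or constituents; the company names and figures are hypothetical.

```python
def cap_weighted_return(start_caps, end_caps):
    """Index return where each company is weighted by its starting market cap."""
    total_start = sum(start_caps.values())
    index_return = 0.0
    for name, start in start_caps.items():
        weight = start / total_start              # share of the index at the start
        company_return = end_caps[name] / start - 1.0
        index_return += weight * company_return
    return index_return


# Hypothetical market caps in billions of dollars.
start_caps = {"AlphaPay": 100.0, "BetaLend": 60.0, "GammaBank": 40.0}
end_caps = {"AlphaPay": 180.0, "BetaLend": 90.0, "GammaBank": 50.0}

weighted = cap_weighted_return(start_caps, end_caps)
simple_average = sum(end_caps[n] / start_caps[n] - 1.0 for n in start_caps) / len(start_caps)
print(f"cap-weighted: {weighted:.1%}, simple average: {simple_average:.1%}")
```

In this toy portfolio the largest company also grew the fastest, so the cap-weighted figure comes out above the simple average. A few large winners can move such an index a long way.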

 


2020 performance of individual fintech companies versus S&P 500. Image Credits: PitchBook

Fintech incumbents and new entrants versus the S&P 500. Image Credits: PitchBook

E-commerce undoubtedly stood out as a major driver. As a category, retail e-commerce grew 35% YoY as of Q3, propelling PayPal and Shopify to add over $160 billion of market capitalization over the year. For its part, PayPal in the third quarter signed up 15 million net new active accounts (its highest ever).


Walking with Dolly


A walk is, more often than not, a solitary experience. As far as the age of COVID-19 is concerned, that’s probably more bug than feature. It’s a way to escape the confines of a shutdown for a few glorious moments, to get some air and, for better or worse, reflect on the day that’s passed or the one to come.

Like many things these days, however, it can be isolating.

For me, long weekend walks have been a sort of lifesaver throughout this bizarre year. Following two months of being completely sidelined over (non-COVID) health issues, I began walking more per week than I ever have. It was a slow process at first. Frankly, never leaving my one-bedroom apartment for April and May made it physically painful to walk around the block when I finally felt comfortable going outside.

These days, I walk every morning, regularly crossing the bridge into Brooklyn and Manhattan. Until I started using Apple’s new Fitness+ service a few times a day, it was easily my main source of exercise. In November, however, my Apple Watch Activity bars swapped the more generic gray for the Fitness+ yellow. But even as I’ve made a point to do a couple of indoor exercises a day, I still start each day with a walk. Rain, snow, this weekend’s sub-freezing weather — skipping a day would feel like breaking a promise to myself.

My actual bars (not sure what happened in September — maybe testing a competitor’s device)

This morning Apple dropped the first five installments (episodes?) of Time to Walk. The feature is an attempt to expand the Fitness+ experience beyond the confines of its titular iOS app. A largely Watch-based experience, the feature leverages much of the wearable’s existing features (and Apple’s growing software ecosystem) to offer a more tailored and multimedia experience than you would get listening to a podcast or music alone.

As with the canny arrival of Fitness+ (December) and handwashing for watchOS (September), Apple says the timing was something of a happy coincidence. The company had been working on the feature well before COVID-19 entered the picture.

“Everything from Time to Walk and our launch of Fitness+ was something we had been working on well before COVID,” the company’s senior director of Fitness Technologies Jay Blahnik tells TechCrunch. “From the very beginning, we thought of Fitness+ as a place where everyone was welcome. We wanted it to feel like a place where, whether you’re new to fitness or very fit, there was something for everyone.”

For many, a walk (or push, in the case of those who use a wheelchair for mobility) is square one when it comes to daily workouts. For my part, I was certainly far more comfortable taking quick strolls around the neighborhood. With limits on space and no real exercise equipment to speak of beyond a kettlebell and yoga mat, attempting to approximate the gym experience at home has seemed a fool’s errand.

April found me trying some YouTube yoga classes with limited efficacy. Like most attempts to exercise, it didn’t stick. Walking every day was the only thing that did. And for the first time in my life, COVID-19 found me walking without any particular destination in mind. That old cliché about it being about the journey not the destination is fine when you don’t mind constantly being late to meetings. Walking for the sake of itself, however, changes the dynamic significantly. I speak to artists, writers and musicians on a regular basis for my podcast. The common sentiment is a familiar one: You simply can’t force creativity. But for those who make a point to regularly walk and run, it’s perhaps the most surefire way to kickstart the process.

Time to Walk is Apple’s attempt to capture some of that lightning in a bottle by following a rotating cast of big names as they walk through locations that mean something to them. The company says it’s been making an effort to meet guests where they are and essentially coach them through the process. The ability to do so, of course, depends on their location, especially given the travel restrictions that have been in place since early last year.

Ultimately, Apple says, the decisions of where to record are made by the guests. “Some guests said, ‘this is where I want to go,’ ” says Blahnik. “And some guests were like, ‘no, I want to do the walk I normally do.’ For us, it’s not about Shawn Mendes in the Grand Canyon, it’s about where they want to go. Sometimes that’s limited by COVID, but what we found delightful was for many people, they loved to take the walk they loved to take.”

The first four guests (Mendes, Dolly Parton, Draymond Green and Uzo Aduba) run the gamut on approaches. “We think about the stories, we think about the diverse guest,” says Blahnik. “We think about all of the ways you’d like the conversations to go. But what was important to us was that the idea resonated with them. The idea of going out for a walk, having a lovely conversation and hearing stories that could give you a different perspective.”

Parton, who turned 75 earlier this month, recorded her session in a studio — in contrast to the other three names. She relates a handful of stories largely revolving around her upbringing in Sevier County (pronounced “severe”), Tennessee. There’s a story about a Christmas tree and one about opening a literacy center with the help of her father (who struggled with his own ability to read and write).

She somewhat self-effacingly relates a story about the time her hometown erected a statue of her. “So I went home, and I said, ‘Daddy, did you know they’re putting a statue of me? Do you know about the statue down at the courthouse?’ ” Parton explains. “And Daddy said, ‘Well, yeah, I heard about that.’ He said, ‘Now, to your fans out there, you might be some sort of an idol. But to them pigeons, you ain’t nothing but another outhouse.’ ”

According to Parton, her father would visit the statue at night with a bucket of soap and water to clean the pigeons’ mess off his daughter’s likeness. Her segment culminates with something approaching a behind the music-style segment, describing stories behind three of her own songs: “Coat of Many Colors,” “Circle of Love” and “9 to 5.” The latter is the real gem of the bunch, contrasting her morning routines to costars Jane Fonda and Lily Tomlin, while describing the role her acrylic nails played in the songwriting and recording process.

Image Credits: Apple

Green’s stories are more emblematic of the rest. On a walk around Malibu, the Warriors power forward discusses some inspirational stories on and off the court, from being told he would never be a star to a time he tried and failed to cheat on a test in school. The stories are purposefully personal. Aduba relates some of her own struggles to break into acting as she walks her amusingly named dog, Fenway Bark, through Fort Greene Park in Brooklyn.

The guests share images relating to their stories or snapshots of where they go on their walks, which are delivered to the wrist with a haptic buzz. At the end of the journey, they share three handpicked songs that can be saved to a playlist on Apple Music, similar to what the company has done for its Fitness+ workouts.

Write-ups of Time to Walk have thus far compared it to podcasting, understandably so, given that it’s an on-demand, audio-first experience. According to Apple, though, the feature, which downloads directly onto the Watch when a new installment drops each week, has its own flavor.

“Often podcasts are hosted,” Blahnik says, by way of distinction. “In our journey to build out this experience, we certainly considered if there should be a host walking with this person. What we realized is that, for what we were trying to create, the intimacy of having the singular guest talk to you felt a lot more like you were on a walk with them. The notion that it’s not happening in a studio (in almost all cases), that they’re walking someplace that inspires them. You’ll hear that with Draymond and Shawn — with Shawn he’s huffing and puffing up that hill and it’s kind of nice because you’re in that moment together.”

Time to Walk isn’t raw, exactly. It is an Apple production, after all. The company’s certainly not tossing out found audio here. But the content does seem more off-the-cuff than many of its productions, even as it’s packaged together with a slick intro and a trio of songs at the end. Still, it’s a nice change of pace for those looking for something that feels a little more personal than we’re accustomed to from some of the names involved.

Your own mileage will vary, depending on, among other things, your interest in the guest. Though, there’s always a chance someone you’ve never been particularly interested in — or even heard of — will offer some unique tidbit or interesting way of looking at things. That’s one of the potential upsides of having Apple doing the curating here — there’s some interesting potential for discovery. And even in the case of artists you’re familiar with, there’s good potential to discover something new.

The weekly 20- to 45-minute audio supplement won’t make the actual act of walking any less solitary, but for a little while, at least, it’s nice to feel like someone’s along for the ride.
