Tiny four-bit computers are now all you need to train AI



Deep learning is an inefficient energy hog. It requires massive amounts of data and abundant computational resources, which explodes its electricity consumption. In the last few years, the overall research trend has made the problem worse. Models of gargantuan proportions—trained on billions of data points for several days—are in vogue, and likely won’t be going away any time soon.

Some researchers have rushed to find new directions, like algorithms that can train on less data, or hardware that can run those algorithms faster. Now IBM researchers are proposing a different one. Their idea would reduce the number of bits, or 1s and 0s, needed to represent the data—from 16 bits, the current industry standard, to only four.

The work, which is being presented this week at NeurIPS, the largest annual AI research conference, could increase the speed and cut the energy costs needed to train deep learning by more than sevenfold. It could also make training powerful AI models possible on smartphones and other small devices, which would improve privacy by helping to keep personal data on a local device. And it would make the process more accessible to researchers outside big, resource-rich tech companies.

How bits work

You’ve probably heard before that computers store things in 1s and 0s. These fundamental units of information are known as bits. When a bit is “on,” it corresponds with a 1; when it’s “off,” it corresponds with a 0. Each bit, in other words, can hold only one of two values.

But once you string them together, the amount of information you can encode grows exponentially. Two bits can represent four pieces of information because there are 2^2 combinations: 00, 01, 10, and 11. Four bits can represent 2^4, or 16 pieces of information. Eight bits can represent 2^8, or 256. And so on.
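The doubling described above can be checked directly. This is a minimal sketch that enumerates every pattern of a given bit width; the helper name `bit_patterns` is made up for illustration.

```python
from itertools import product


def bit_patterns(n_bits):
    """Return every distinct pattern of n_bits bits, as strings of 0s and 1s."""
    return ["".join(bits) for bits in product("01", repeat=n_bits)]


print(bit_patterns(2))  # ['00', '01', '10', '11']

# Each extra bit doubles the number of patterns: 2 ** n_bits in total.
for n in (2, 4, 8):
    print(n, "bits ->", len(bit_patterns(n)), "patterns")
```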

The right combination of bits can represent types of data like numbers, letters, and colors, or types of operations like addition, subtraction, and comparison. Most laptops these days are 32- or 64-bit computers. That doesn’t mean the computer can only encode 2^32 or 2^64 pieces of information total. (That would be a very wimpy computer.) It means that it can use that many bits of complexity to encode each piece of data or individual operation.

4-bit deep learning

So what does 4-bit training mean? Well, to start, we have a 4-bit computer, and thus 4 bits of complexity. One way to think about this: every single number we use during the training process has to be one of 16 whole numbers between -8 and 7, because these are the only numbers our computer can represent. That goes for the data points we feed into the neural network, the numbers we use to represent the neural network, and the intermediate numbers we need to store during training.
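One common way to get the 16 whole numbers from -8 to 7 out of four bits is two's complement encoding. The article does not say which encoding is used, so treat that choice, and the helper name `int4_from_bits`, as assumptions made for this sketch.

```python
def int4_from_bits(pattern):
    """Decode a 4-character bit string as a two's complement integer.

    Two's complement is an assumption here; it is simply one standard
    encoding that yields exactly the range -8..7.
    """
    raw = int(pattern, 2)                 # unsigned value, 0..15
    return raw - 16 if raw >= 8 else raw  # patterns with the top bit set are negative


all_values = sorted(int4_from_bits(format(i, "04b")) for i in range(16))
print(all_values)  # the 16 integers from -8 through 7
```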

So how do we do this? Let’s first think about the training data. Imagine it’s a whole bunch of black-and-white images. Step one: we need to convert those images into numbers, so the computer can understand them. We do this by representing each pixel in terms of its grayscale value—0 for black, 1 for white, and the decimals between for the shades of gray. Our image is now a list of numbers ranging from 0 to 1. But in 4-bit land, we need it to range from -8 to 7. The trick here is to linearly scale our list of numbers, so 0 becomes -8 and 1 becomes 7, and the decimals map to the integers in the middle. So:

You can scale your list of numbers from 0 to 1 to stretch between -8 and 7, and then round any decimals to a whole number.

This process isn’t perfect. If you started with the number 0.3, say, you would end up with the scaled number -3.5. But our four bits can only represent whole numbers, so you have to round -3.5 to -4. You end up losing some of the gray shades, or so-called precision, in your image. You can see what that looks like in the image below.

The lower the number of bits, the less detail the photo has. This is what is called a loss of precision.
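The scale-and-round step can be sketched in a few lines. This is an illustration of the description above, not the researchers' code. The function names are hypothetical, and the rounding rule (halves round away from zero) is chosen so that -3.5 becomes -4, as in the example in the text.

```python
import math

INT4_MIN, INT4_MAX = -8, 7


def round_half_away(x):
    """Round to the nearest integer, with exact halves going away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize_linear(values, lo, hi):
    """Linearly map the range [lo, hi] onto the integers -8..7."""
    levels = INT4_MAX - INT4_MIN  # 15 steps between the two ends
    return [round_half_away(INT4_MIN + (v - lo) / (hi - lo) * levels)
            for v in values]


# Grayscale pixels between 0 (black) and 1 (white).
print(quantize_linear([0.0, 0.3, 0.5, 1.0], lo=0.0, hi=1.0))  # [-8, -4, -1, 7]
```

Note that Python's built-in `round` sends exact halves to the nearest even integer, which is why the sketch uses its own rounding helper.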

This trick isn’t too shabby for the training data. But when we apply it again to the neural network itself, things get a bit more complicated.

A neural network.

We often see neural networks drawn as something with nodes and connections, like the image above. But to a computer, these also turn into a series of numbers. Each node has a so-called activation value, which usually ranges from 0 to 1, and each connection has a weight, which usually ranges from -1 to 1.

We could scale these in the same way we did with our pixels, but activations and weights also change with every round of training. For example, sometimes the activations range from 0.2 to 0.9 in one round and 0.1 to 0.7 in another. So the IBM group figured out a new trick back in 2018: to rescale those ranges to stretch between -8 and 7 in every round (as shown below), which effectively avoids losing too much precision.

The IBM researchers rescale the activations and weights in the neural network for every round of training, to avoid losing too much precision.
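To illustrate the idea of rescaling in every round, the sketch below compares a fixed 0-to-1 range with a range taken from the current round's own minimum and maximum. It follows only the description in this article. The actual IBM method may differ in its details, and the helper names and sample activation values are invented for the example.

```python
import math


def quantize_to_int4(values, lo, hi):
    """Linearly map [lo, hi] onto the integers -8..7 (halves round away from zero)."""
    out = []
    for v in values:
        scaled = -8 + (v - lo) / (hi - lo) * 15
        out.append(int(math.copysign(math.floor(abs(scaled) + 0.5), scaled)))
    return out


def quantize_per_round(values):
    """Hypothetical helper: stretch this round's own min..max across -8..7."""
    return quantize_to_int4(values, min(values), max(values))


round_1 = [0.2, 0.4, 0.6, 0.9]  # made-up activations for one round
round_2 = [0.1, 0.3, 0.5, 0.7]  # made-up activations for another round

for activations in (round_1, round_2):
    fixed = quantize_to_int4(activations, 0.0, 1.0)  # never reaches -8 or 7
    rescaled = quantize_per_round(activations)       # always spans -8..7
    print(fixed, rescaled)
```

With the fixed range, part of the -8 to 7 scale goes unused in each round. Rescaling per round spends all 16 levels on the values that actually occur.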

But then we’re left with one final piece: how to represent in four bits the intermediate values that crop up during training. What’s challenging is that these values can span across several orders of magnitude, unlike the numbers we were handling for our images, weights, and activations. They can be tiny, like 0.001, or huge, like 1,000. Trying to linearly scale this to between -8 and 7 loses all the granularity at the tiny end of the scale.

Linearly scaling numbers that span several orders of magnitude loses all the granularity at the tiny end of the scale. As you can see here, any numbers smaller than 100 would be scaled to -8 or -7. The lack of precision would hurt the final performance of the AI model.
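The collapse at the small end of the scale is easy to reproduce. This sketch applies the same linear scale-and-round step to values running from 0.001 to 1,000. The sample values and the function name are illustrative only.

```python
import math


def quantize_to_int4(values, lo, hi):
    """Linearly map [lo, hi] onto the integers -8..7 (halves round away from zero)."""
    out = []
    for v in values:
        scaled = -8 + (v - lo) / (hi - lo) * 15
        out.append(int(math.copysign(math.floor(abs(scaled) + 0.5), scaled)))
    return out


# Values spanning six orders of magnitude.
values = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
codes = quantize_to_int4(values, lo=min(values), hi=max(values))
print(codes)  # [-8, -8, -8, -8, -8, -7, 7]
```

Everything from 0.001 through 10 lands on the same code, so those values can no longer be told apart after quantization.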

After two years of research, the researchers finally cracked the puzzle: borrowing an existing idea from others, they scale these intermediate numbers logarithmically. To see what I mean, below is a logarithmic scale you might recognize, with a so-called “base” of 10, using only four bits of complexity. (The researchers instead use a base of 4, because trial and error showed that this worked best.) You can see how it lets you encode both tiny and large numbers within the bit constraints.

A logarithmic scale with base 10.
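A logarithmic scale stores the exponent rather than the value itself. The sketch below is a simplified illustration, not the format used in the paper. It handles positive numbers only, and it assumes the 16 available codes represent the integer exponents -8 to 7. The function names are hypothetical.

```python
import math


def log_quantize(values, base=10, exp_min=-8, exp_max=7):
    """Replace each positive value with the nearest integer exponent of `base`.

    The 16 codes are assumed to be the exponents -8..7; sign handling and
    any extra scaling a real scheme would need are left out.
    """
    codes = []
    for v in values:
        if v <= 0:
            raise ValueError("this sketch only handles positive values")
        exponent = round(math.log(v) / math.log(base))
        codes.append(max(exp_min, min(exp_max, exponent)))  # clamp to 4-bit range
    return codes


def log_dequantize(codes, base=10):
    """Turn stored exponents back into approximate values."""
    return [float(base) ** c for c in codes]


values = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
codes = log_quantize(values)
print(codes)  # [-3, -2, -1, 0, 1, 2, 3]
print(log_dequantize(codes))
```

Each order of magnitude now gets its own code, so tiny and huge numbers both survive. Passing `base=4` gives the finer spacing the researchers preferred, at the cost of a narrower overall range.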

With all these pieces in place, this latest paper shows how they come together. The IBM researchers run several experiments where they simulate 4-bit training for a variety of deep-learning models in computer vision, speech, and natural-language processing. The results show a limited loss of accuracy in the models’ overall performance compared with 16-bit deep learning. The process is also more than seven times faster and seven times more energy efficient.

Future work

There are still several more steps before 4-bit deep learning becomes an actual practice. The paper only simulates the results of this kind of training. Doing it in the real world would require new 4-bit hardware. In 2019, IBM Research launched an AI Hardware Center to accelerate the process of developing and producing such equipment. Kailash Gopalakrishnan, an IBM fellow and senior manager who oversaw this work, says he expects to have 4-bit hardware ready for deep-learning training in three to four years.

Boris Murmann, a professor at Stanford who was not involved in the research, calls the results exciting. “This advancement opens the door for training in resource-constrained environments,” he says. It wouldn’t necessarily make new applications possible, but it would make existing ones faster and less battery-draining “by a good margin.” Apple and Google, for example, have increasingly sought to move the process of training their AI models, like speech-to-text and autocorrect systems, away from the cloud and onto user phones. This preserves users’ privacy by keeping their data on their own phone while still improving the device’s AI capabilities.

But Murmann also notes that more needs to be done to verify the soundness of the research. In 2016, his group published a paper that demonstrated 5-bit training. But the approach didn’t hold up over the years. “Our simple approach fell apart because neural networks became a lot more sensitive,” he says. “So it’s not clear if a technique like this would also survive the test of time.”

Nonetheless, the paper “will motivate other people to look at this very carefully and stimulate new ideas,” he says. “This is a very welcome advancement.”

Lyron Foster is a Hawaii based African American Musician, Author, Actor, Blogger, Filmmaker, Philanthropist and Multinational Serial Tech Entrepreneur.



InSight’s heat probe has failed on Mars. Is the mission a failure?



For two years now, NASA’s InSight probe has sat on the surface of Mars, attempting to dig 5 meters (16 feet) deep in order to install the lander’s heat probe. The instrument was going to effectively take the planet’s temperature and tell scientists more about the internal thermal activity and geology of Mars. 

InSight never even got close to realizing that goal. On January 14, NASA announced that it was ending all attempts to place the heat probe underground. Affectionately referred to as “the mole,” the probe is designed to dig underground with a hammering action. But after the first month of its mission, it was unable to burrow more than 14 inches into the ground before getting stuck. NASA has been working since to come up with some kind of solution, including using InSight’s robotic arm to pin the mole down with added weight to help it loosen up some dirt and get back to burrowing.

It never really worked. The Martian dirt has proved to be unexpectedly prone to clumping up, diminishing the sort of friction the mole needs to spike its way deeper and deeper. Ground crews came up with a last-ditch effort recently to use InSight’s arm to scoop some soil onto the probe to tether it down and provide more friction. After attempting 500 hammer strokes on January 9, the team soon realized there was no progress to be had. 

It’s discouraging news, given that NASA just recently decided to extend InSight’s mission to December 2022. During that time, there won’t be much of a role for the heat probe. Bruce Banerdt, the InSight principal investigator, says that the planet’s temperature could still be measured at the surface and a few inches below the surface using some of the instruments on InSight that still work. “This will allow us to determine the thermal conductivity of the near surface, which might vary with season due to changing atmospheric pressure,” he says.

An illustration of how InSight’s mole was supposed to be deployed on Mars.

And while the mole was unable to accomplish what was expected, it’s not accurate to see this as a failure. “We have encountered new soil properties that have never before been encountered on Mars, with a thick, crusty surface layer that decreases its volume substantially when crushed,” says Banerdt. “We do not yet understand everything we have seen, but geologists will be poring over this data for years to come, using it to tease out clues to the history of the Martian environment at this location.”

InSight will continue on with some of its other investigations, especially the measurement of seismic activity on Mars. It turns out the Red Planet is rocked by quakes all the time.



Fintech startups and unicorns had a stellar Q4 2020



The fourth quarter of 2020 was as busy as you imagined, with super late-stage startups reaching new valuation thresholds at a record pace, and total venture capital funding in the United States recording its second-best result of all time.

That’s according to data released recently by CB Insights, which complements our look back at 2020’s venture capital year in America from yesterday.

At the time, we noted that American startups raised an average of $428 million each day last year, a sum that helps illustrate how rapidly the private markets moved during the odd period.


But a peek at aggregate results for the world’s largest VC market provides only part of the picture. We need to narrow our lens and peer more deeply into standout categories to understand how the U.S. venture capital market managed to post its biggest year ever in terms of dollars invested, despite seeing deal volume slip for a second consecutive year.

This morning, we’re scraping data together to better understand.

First, we want to see how unicorns performed in Q4 2020. This column noted in late December that it felt like unicorn creation was rapid in the quarter; how did that hold up?

And then we’ll dig into PitchBook data concerning the fintech sector, a huge recipient of venture capital time, attention and money.

Fintech’s 2020 is a good lens through which to view both the year and its wild final quarter. So this morning, as America itself resets, let’s take a moment to understand last year just a little bit better as we get into this new one.


One of the most curious things about the unicorn era is the rising bet it represents. I’ve written about this before so I will be brief: Nearly every quarter, the number of unicorns — private companies worth $1 billion or more — goes up.

The private market is able to create more unicorns than it has been historically able to exit them.

Some of these companies exit, sometimes in group fashion. But, quarter after quarter, the number of unexited unicorns rises. This means that the bet on expected future liquidity from venture capitalists and other private investors keeps ratcheting higher.



MIT develops method for lab-grown plants that may eventually lead to alternatives to forestry and farming



Researchers at MIT have developed a new method for growing plant tissues in a lab – sort of like how companies and researchers are approaching lab-grown meat. The process would be able to produce wood and fibre in a lab environment, and researchers have already demonstrated how it works in concept by growing simple structures using cells harvested from zinnia leaves.

This work is still in its very early stages, but the potential applications of lab-grown plant material are significant, and include possibilities in both agriculture and construction materials. While traditional agriculture is much less ecologically damaging than animal farming, it can still have a significant impact and cost, and it takes a lot of resources to maintain. Not to mention that even small environmental changes can have a significant effect on crop yield.

Forestry, meanwhile, has much more obvious negative environmental impacts. If the work of these researchers can eventually be used to create a way to produce lab-grown wood for use in construction and fabrication, in a way that’s scalable and efficient, then there’s tremendous potential in terms of reducing the impact of forestry globally. Eventually, the team even theorizes you could coax the growth of plant-based materials into specific target shapes, so you could also do some of the manufacturing in the lab, by growing a wood table directly for instance.

There’s still a long way to go from what the researchers have achieved. They’ve only grown materials on a very small scale, and one of their next challenges is figuring out how to grow plant-based materials with different final properties. They’ll also need to overcome significant barriers when it comes to scaling efficiencies, but they are working on solutions that could address some of these difficulties.

Lab-grown meat is still in its infancy, and lab-grown plant material is even more nascent. But it has tremendous potential, even if it takes a long time to get there.
