Big Data: 40 Years of Innovation, Growth & Evolution

Introduction: At the beginning, there was no such thing as big data. Only as data volumes grew did the concept of big data emerge.

█ 1980–2000: Early Exploration Phase of Big Data

In 1980, American science and technology journalist Alvin Toffler published his book The Third Wave.
In it, Toffler boldly periodized the history of human civilization. He argued that humanity had experienced two major waves of civilization: the first was the agricultural revolution, which spanned thousands of years; the second was the industrial revolution, which began in the 1760s.

He further proposed that with the development of information technology, humanity was about to usher in a third wave—the information revolution. In this wave, “industrialism will perish and a new civilization will rise.”

Toffler’s views caused a huge stir at the time and had a profound impact. After its publication, The Third Wave was translated into more than 30 languages and sold over 10 million copies, becoming the best-selling futurist book of all time. Toffler himself came to be regarded as one of the most influential futurists of the modern era.

So, what is the connection between Alvin Toffler, The Third Wave, and the topic of this article—big data?

A significant one. The concept of “big data” first appeared in The Third Wave.

Toffler made many bold predictions in the book, and big data was one of them (others included multinational corporations, paperless offices, prosumerism, etc.). He asserted with great confidence: “Data is wealth.” And big data would be “the grand symphony of the Third Wave.”

Toffler’s recognition of the value of data was remarkably ahead of its time. Remember, this was 1980—PCs had just emerged, hard drives were still measured in megabytes, and the volume of data was still relatively small. The information technology wave was just beginning, but Toffler had already seen the future.

In the 1990s, with the birth and explosive growth of the internet, the information revolution entered a new stage. More and more people started buying computers and accessing the internet. More research institutions and businesses began implementing IT systems and promoting their own digital transformation.

As informatization deepened, people began to realize that the data generated from research, production, and business activities was growing rapidly, putting increasing pressure on IT systems.

In the mid-1990s, Jim Gray, the database pioneer who would later receive the Turing Award, pointed out that the challenges of big data would first arise in science, not business.

In October 1997, at an IEEE conference, NASA researchers Michael Cox and David Ellsworth presented a paper describing how simulating airflow around airplanes generated massive datasets that placed a heavy burden on main memory, local disks, and remote storage.

They called this issue the “big data problem.”

Around the same time, in 1998, John Mashey, chief scientist at high-performance computing company SGI, gave a talk titled Big Data and the Next Wave of InfraStress at an international conference, raising similar concerns.

Mashey pointed out that with the rapid growth of data volume, four key challenges would inevitably arise: data would become difficult to understand, access, process, and organize. He also used the term “big data” to describe these challenges, attracting widespread attention in the industry.

As the scale of data continued to expand, existing storage and computing technologies could no longer keep pace. The industry came to realize that data held immense value, and that more powerful technologies were needed to extract it.

█ 2000–2012: Full Outbreak Phase of Big Data

In the 21st century, theoretical discussions around big data continued.

In 2001, Doug Laney, an analyst at the META Group (later acquired by Gartner), defined big data using three “V” terms: Volume, Velocity, and Variety.

The “3Vs” theory was widely adopted and became the standard for describing big data characteristics. Later, based on the 3Vs, the industry developed “4V,” “5V,” even “7V” models—adding Veracity, Value, Variability, and Visualization.

In 2002, following the 9/11 attacks, the U.S. government proposed integrating existing datasets across agencies to build a massive database that would screen records in communications, crime, education, finance, healthcare, and travel to identify suspicious individuals.

Although this project was eventually shut down over privacy concerns, it was an early attempt at building a big data system.

In the early 21st century, the internet had grown to a vast scale. The rise of social networks, the spread of e-commerce, and the digital transformation of governments and businesses led to the generation of even more data, making storage and management increasingly challenging.

From 2003 to 2006, search engine company Google published three groundbreaking papers, introducing GFS, MapReduce, and BigTable—ushering in a new era for big data.

In 2006, Yahoo engineer Doug Cutting, inspired by Google’s papers, developed the now-famous big data framework: Hadoop.

This marked the initial establishment of big data’s technical foundation, creating the necessary conditions for its rapid growth.
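
To make the MapReduce programming model concrete, here is a minimal word-count sketch in plain Python. It is only an illustration of the map, shuffle, and reduce phases described in Google's paper, not Hadoop's actual API, and the sample documents are invented for the example.

```python
# A self-contained illustration of the MapReduce model: map -> shuffle -> reduce.
# Plain Python, not Hadoop's real API; the input "splits" are made up.
from collections import defaultdict

def map_phase(document: str):
    # Map: emit a (key, value) pair for every word in an input split.
    for word in document.split():
        yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Shuffle: group all values by key, as the framework would do across nodes.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Reduce: combine the grouped values for one key into a final result.
    return key, sum(values)

if __name__ == "__main__":
    splits = ["big data is big", "data is everywhere"]
    mapped = [pair for doc in splits for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    counts = dict(reduce_phase(k, v) for k, v in grouped.items())
    print(counts)  # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```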

Big data soon entered a fast track. More and more governments and enterprises increased their investment in big data and began building preliminary systems.

In January 2009, the Indian government announced the creation of the Unique Identification Authority of India, which would scan the fingerprints, photographs, and irises of 1.2 billion people and assign each a digital ID, forming the world’s largest biometric database.

In May 2009, the Obama administration officially launched Data.gov, a key part of its “open government” initiative. It opened hundreds of thousands of datasets across about 50 categories—including agriculture, weather, finance, and employment—under three types: raw data, geospatial data, and data tools.

Later, the U.S. and Indian governments collaborated to develop an open-source version of Data.gov.

The United Nations also took steps in big data system construction.

In 2009, in response to the global financial crisis, then-UN Secretary-General Ban Ki-moon proposed an alert system to analyze how real-time data affected economic crises in poor countries. The UN also initiated projects to explore how mobile phone and social media data could be used to forecast market prices and disease outbreaks.

In the business sector, major companies like Walmart began researching how to build big data systems to enhance marketing and promotion.

At the same time, academic research into big data reached new heights.

In 2008, the Computing Community Consortium published a white paper titled Big Data Computing: Creating Revolutionary Breakthroughs in Commerce, Science and Society, detailing big data’s potential to improve governance and unlock business value. Discussion around big data began to heat up.

In 2010, Kenneth Cukier published a 14-page report in The Economist titled Data, data everywhere, offering deep insights into the big data era.

He wrote: “There is an unimaginable amount of digital information in the world, and it is growing at an extraordinary pace. From economics to science, from government to the arts, many sectors are already feeling its impact.”

In May 2011, global consulting firm McKinsey released a report titled Big Data: The Next Frontier for Innovation, Competition and Productivity.

The report stated: “Big data has become a key production factor that has permeated every industry and business function. Mining and leveraging massive data signals a new wave of productivity growth and consumer surplus.”

In 2012, Viktor Mayer-Schönberger and Kenneth Cukier co-authored the book Big Data: A Revolution That Will Transform How We Live, Work, and Think, pushing the big data concept to new heights.

Big Data was considered the pioneering work in global big data research and had major social influence. For many in China, it was the first book they read on the topic.

The authors asserted: “The data storm brought by big data is transforming human life, work, and thinking, and will lead an era of revolutionary changes in mindset, business, and management.”

Also in 2012, the World Economic Forum stated: “Data has become a new class of economic asset, like currency and gold.” This elevated the value of big data to an unprecedented level.

From then on, big data gradually became a household term and rapidly spread across all sectors.

█ 2012–Present: Upgrades and Turning Points of Big Data

In the past decade, the buzz around big data seems to have quieted. This is not because big data has become less important, but because the technology has matured and entered a stable development phase.

In government, research, and business, big data is quietly playing a vital role. It has not only changed how we process and analyze information, but also provides key references for decision-making.

The technologies associated with big data have also evolved in this period.

For example, UC Berkeley's AMPLab developed Spark, which supports in-memory computing and dramatically outperforms MapReduce on iterative and interactive workloads, quickly becoming a new industry favorite.
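
As a rough illustration of why in-memory computing matters, here is a hedged PySpark sketch: a dataset reused by several computations is cached in memory rather than re-read from disk for every job, which is where much of Spark's advantage over chained MapReduce jobs comes from. It assumes a local PySpark installation; the file path and application name are placeholders.

```python
# A sketch of Spark's in-memory reuse: cache an RDD once, run several jobs on it.
# Assumes a local PySpark installation; the HDFS path below is a placeholder.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("word-count-sketch").getOrCreate()
sc = spark.sparkContext

lines = sc.textFile("hdfs:///data/logs/*.txt")          # placeholder input path
words = lines.flatMap(lambda line: line.split()).cache()  # keep in memory for reuse

total_words = words.count()                      # first pass over the cached data
top_words = (words.map(lambda w: (w, 1))
                  .reduceByKey(lambda a, b: a + b)
                  .takeOrdered(10, key=lambda kv: -kv[1]))  # second pass, no re-read

print(total_words, top_words)
spark.stop()
```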

Similarly, NoSQL systems such as HBase and Cassandra have grown rapidly, enabling storage and low-latency access at very large scale. NewSQL databases, which combine the transactional guarantees of traditional relational databases with the horizontal scalability of NoSQL, have also gained popularity in scenarios requiring large-scale data processing and high concurrency.
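
For a sense of what large-scale storage and access looks like in practice, below is a minimal sketch of writing and reading time-series rows with Cassandra's Python driver. It assumes a locally running Cassandra node and the DataStax cassandra-driver package; the keyspace, table, and sample data are hypothetical.

```python
# A minimal Cassandra sketch: one write and one partition-key read.
# Assumes a local node and the cassandra-driver package; names are hypothetical.
import uuid
from datetime import datetime, timezone
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS demo.events (
        device_id uuid,
        ts timestamp,
        reading double,
        PRIMARY KEY (device_id, ts)
    ) WITH CLUSTERING ORDER BY (ts DESC)
""")

device = uuid.uuid4()
session.execute(
    "INSERT INTO demo.events (device_id, ts, reading) VALUES (%s, %s, %s)",
    (device, datetime.now(timezone.utc), 23.5),
)

# Read the latest readings for one device; the partition key makes this cheap.
rows = session.execute(
    "SELECT ts, reading FROM demo.events WHERE device_id = %s LIMIT 10", (device,)
)
for row in rows:
    print(row.ts, row.reading)

cluster.shutdown()
```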

The concepts of data warehouses, data lakes, and lakehouse architectures continue to evolve. The entire big data technology ecosystem—spanning data production, aggregation, analysis, and consumption—has become more powerful and complete.

Most notably, the rise of artificial intelligence has reignited the value of data.

Big data provides rich data resources for AI, while AI uses advanced algorithms to extract value from them. As one of the three pillars of AI, alongside algorithms and computing power, data directly determines the performance of generative AI (AIGC) models. Society’s focus on data has grown even stronger.

At the same time, people are trying to solve the challenges that big data brings.

Chief among these is privacy.

In May 2014, the White House released a report titled Big Data: Seizing Opportunities, Preserving Values. The report encouraged the use of data to drive social progress but also called for frameworks and research to protect individual privacy, ensure fairness, and prevent discrimination.

On May 25, 2018, the European Union’s General Data Protection Regulation (GDPR) came into force, a milestone in global data privacy protection. Since then, many countries around the world have introduced their own data protection laws.

Final Words

That concludes today’s article.

Over more than 40 years, big data has grown from nothing into a powerful force—proving its value and becoming an essential part of the digital world.

In the future, as digital technologies continue to advance—especially artificial intelligence—big data applications will become even broader and deeper, bringing more opportunities and challenges to every industry.

The true platinum age of data is accelerating toward us.


