Big data refers to large data sets that can be studied to reveal patterns, trends, and associations. The vast number of data collection avenues means that data can now come in larger quantities, be gathered much more quickly, and exist in a greater variety of formats than ever. This new, larger, and more complex data is collectively called big data.
Though there is no threshold that separates big data from traditional data, big data is generally considered to be “big” because it cannot be processed effectively and quickly enough by older data analysis tools.
Big data is broadly defined by the three Vs: volume, velocity, and variety.
Volume refers to the amount of data. Big data deals with high volumes of data.
Velocity refers to the rate at which the data is received. Big data streams at a high velocity, often directly into memory rather than being stored on a disk.
Variety refers to the wide range of data formats. Big data may be structured, semi-structured, or unstructured and can be presented as numbers, text, images, audio, and more.
Companies that process big data may also focus on other Vs, such as value, veracity, and variability.
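To make "variety" concrete, here is a minimal sketch in plain Python that reads one structured, one semi-structured, and one unstructured input and places them in a common record shape. The sample data, field names (`customer`, `amount`, `text`), and record layout are invented for illustration; real big data pipelines handle far larger inputs with dedicated tools.

```python
import csv
import io
import json

# Structured: every row has the same fixed columns (CSV).
csv_text = "customer,amount\nalice,20.5\nbob,13.0\n"
structured = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: JSON records whose fields can vary from item to item.
json_text = (
    '[{"customer": "carol", "amount": 7.25, "tags": ["gift"]},'
    ' {"customer": "dan"}]'
)
semi_structured = json.loads(json_text)

# Unstructured: free text with no predefined fields.
unstructured = "Loved the product, but delivery took a week."

# Normalize everything into one hypothetical record shape so that
# later analysis code can treat the three sources uniformly.
records = []
for row in structured:
    records.append({
        "source": "csv",
        "customer": row["customer"],
        "amount": float(row["amount"]),  # CSV values arrive as strings
        "text": None,
    })
for item in semi_structured:
    records.append({
        "source": "json",
        "customer": item.get("customer"),
        "amount": item.get("amount"),  # may be missing, so None is allowed
        "text": None,
    })
records.append({
    "source": "text",
    "customer": None,
    "amount": None,
    "text": unstructured,
})

for record in records:
    print(record)
```

Note that the second JSON item has no `amount`, so the sketch stores `None` instead of failing. Tolerating missing or irregular fields is a large part of what handling variety means in practice.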
Emerging information technology has allowed data to be collected, stored, and analyzed at unprecedented scales. The internet continues to be adopted by new users in Canada and across the globe, and developing technologies have allowed the internet to be integrated into many different products, creating numerous new data sources. The millions of people watching Netflix, using Google, or buying products online daily contribute to the increasing volume and sophistication of big data.
Smart (Internet of Things) devices: An internet connection lets companies collect data from devices such as smart home systems, robotic vacuum cleaners, smart TVs, mobile devices, and wearable fitness trackers, which record usage data in log files.
Social media: Likes, shares, posts, comments, and how long you spend looking at a post are all considered insightful data about people’s behaviour, sentiment, and preferences.
Websites: Companies and other website owners can track page visits, visitors' general locations, how long audiences spend on a page, which links are clicked most, and cursor movement.
Business transactions: Data can come from customers purchasing products online and in person. Price, time of purchase, payment methods, and other details can inform a business about customer demand for their products.
Machinery: Even without an internet connection, machines like road cameras, sensors, and medical equipment can record information.
Health care: The health care system is full of data. Data analysts can use aggregated information on health care records, insurance, and patient summaries to drive new insights and enhance patient care.
Government: Municipal, provincial, and federal governments can use data from many sources to make policy decisions, including traffic information, agricultural yields, weather tracking systems, and demographic information from censuses.
Big data can be used by almost any entity to gain valuable insights and make decisions about its operations. A business, for example, can analyze the data it collects to better understand customer preferences and devise impactful business strategies.
Big data in health care systems can be used to find common symptoms of diseases or decide how much staff to put on a hospital floor at any given time. Governments may use traffic data to plan new roads or track crime rates or terrorism risks to adjust their response accordingly.
Data analysts and other professionals who work with big data may use the following tools and methods:
Predictive analytics: Analysts can use data to predict the likelihood of events or trends in the future by using predictive models and machine learning technology.
Real-time analytics: Real-time analytics is the process of analyzing and using data the moment it enters a database to make decisions quickly. For example, a banking system may flag a payment as potentially fraudulent because it was made outside the account holder's country.
Data mining: Data mining refers to a process that combs through huge amounts of data to find patterns, trends, and correlations. Finding relationships between data points is key to helping organizations make decisions.
Machine learning: Machine learning, a form of artificial intelligence in which models learn from data and can improve as they are given more of it, helps predict trends and find patterns in large data sets. It can also be useful in adapting to new data influxes.
Deep learning: Deep learning is a subset of machine learning based on artificial neural networks, which are loosely modelled on how the human brain learns. It is often used in speech recognition, text recognition, and computer vision.
Data warehouses: Data warehouses store massive amounts of historical data. The data is typically cleaned and organized and can be accessed later to be analyzed.
Hadoop: Hadoop is an open-source software framework used to store and process vast amounts of data across clusters of computers. Because it scales easily and can store various types of data at once, Hadoop has become a widely used platform for processing big data.
Apache Spark: Apache Spark is an open-source engine for large-scale data processing that includes libraries for SQL queries, streaming, and machine learning. Because it can keep working data in memory, it often performs analyses on large data sets more quickly than Hadoop's MapReduce.
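The real-time analytics entry in the list above uses the example of a banking system flagging a payment made out of the country. Here is a minimal sketch of that idea in plain Python. The `Payment` fields and the single rule are hypothetical and stand in for the much richer signals and models a real fraud system would likely use. A short list takes the place of a live stream.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Payment:
    # Hypothetical fields chosen for illustration only.
    payment_id: str
    home_country: str     # country where the account is registered
    payment_country: str  # country where the payment was made
    amount: float


def flag_suspicious(payments: Iterable[Payment]) -> Iterator[Payment]:
    """Yield each payment made outside the account's home country.

    Payments are checked one at a time as they arrive, so a decision
    is available before the next record is read.
    """
    for payment in payments:
        if payment.payment_country != payment.home_country:
            yield payment


# A small in-memory list stands in for a live stream of payments.
incoming = [
    Payment("p1", "CA", "CA", 42.50),
    Payment("p2", "CA", "FR", 310.00),
    Payment("p3", "CA", "CA", 18.75),
]

flagged_ids = [p.payment_id for p in flag_suspicious(incoming)]
print(flagged_ids)
```

Because `flag_suspicious` is a generator, it never needs the full data set in memory. That is the property that matters when records arrive continuously at high velocity.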
Data-related professions—data analysts and scientists, AI and machine learning specialists, and big data specialists—took the top three positions in the World Economic Forum’s list of top job roles with increasing demand across industries in 2020 [1]. Here’s a closer look at the jobs that use big data in different capacities.
Data analyst: A data analyst works to gather, clean, and interpret data and create data models. Data analysts can work in various industries, including business, science, and health care.
Data engineer: Data engineers create and maintain data infrastructure, including data warehouses, pipelines, and other forms of data organization that analysts can use to make predictions or other interpretations. Big data engineers do this with software that allows them to manage large volumes of data.
Data scientist: A data scientist generally uses mathematical or statistical knowledge to build algorithms, models, and other analytical tools to help organize and interpret data.
Business intelligence analyst: Business intelligence analysts parse business data, such as sales information or customer engagement metrics, to form actionable insights into a business's performance.
Operations analyst: Operations analysts gather data about operational issues in businesses or other organizations. Operations analysts can use data to find business insights and solutions to issues in production, staffing, or any other related aspect.
Marketing analyst: Marketing researchers or analysts harvest information about current or potential customers, market conditions, or competitor activities. The data collected is then used to understand how a business can respond through marketing tactics or product adjustments.
Incorporating big data into your career can bring fresh insights into your work, and data will likely only grow in importance. Several courses online can help you get started:
Learn to navigate your way around big data and get a grasp on Hadoop with UC San Diego’s Big Data Specialization.
Familiarize yourself with the basics of machine learning with a course from Stanford University.
Find out how to scale data science and machine learning for big data using Apache Spark.
World Economic Forum. "The Future of Jobs Report 2020," https://www3.weforum.org/docs/WEF_Future_of_Jobs_2020.pdf. Accessed June 20, 2024.