Mysql Bigdata

  • date post

    14-Jul-2016
  • Copyright 2015, Oracle and/or its affiliates. All rights reserved.

    Unlocking New Big Data Insights with MySQL

    A MySQL Whitepaper

    Table of Contents

    Introduction

    1. Defining Big Data

    2. The Internet of Things

    3. The Lifecycle of Big Data

    Step 1: Acquire Data

    Step 2: Organize Data

    Step 3: Analyze Data

    Step 4: Decide

    4. MySQL Big Data Best Practices

    Conclusion

    Additional Resources

    Introduction

    Today the terms Big Data and Internet of Things draw a lot of attention, but behind the hype there's a simple story. For decades, companies have been making business decisions based on traditional enterprise data. Beyond that critical data, however, is a potential treasure trove of additional data: weblogs, social media, email, sensors, photographs and much more that can be mined for useful information. Decreases in the cost of both storage and compute power have made it feasible to collect this data, which would have been thrown away only a few years ago. As a result, more and more organizations are looking to include non-traditional yet potentially very valuable data with their traditional enterprise data in their business intelligence analysis.

    As the world's most popular open source database, and the leading open source database for Web-based and Cloud-based applications, MySQL is a key component of numerous big data platforms. This whitepaper explores how you can unlock extremely valuable insights using MySQL with the Hadoop platform.

    1. Defining Big Data

    Big data typically refers to the following types of data:

    Traditional enterprise data includes customer information from CRM systems, transactional ERP data, web store transactions, and general ledger data.

    Machine-generated/sensor data includes Call Detail Records (CDR), weblogs, smart meters, manufacturing sensors, equipment logs (often referred to as digital exhaust) and trading systems data.

    Social data includes customer feedback streams, micro-blogging sites like Twitter, and social media platforms like Facebook.

    The McKinsey Global Institute estimates that data volume is growing 40% per year [1]. But while it's often the most visible parameter, volume of data is not the only characteristic that matters. We often refer to the "Vs" defining big data:

    Volume. Machine-generated data is produced in much larger quantities than traditional data. For instance, a single jet engine can generate 10TB of data in 30 minutes. With more than 25,000 airline flights per day, the daily volume of just this single data source runs into the petabytes. Smart meters and heavy industrial equipment like oil refineries and drilling rigs generate similar data volumes, compounding the problem.

    Velocity. Social media data streams, while not as massive as machine-generated data, produce a large influx of opinions and relationships valuable to customer relationship management. Even at 140 characters per tweet, the high velocity (or frequency) of Twitter data ensures large volumes.

    [1] Big data: The next frontier for innovation, competition, and productivity, McKinsey Global Institute, 2011.

    Variety. Traditional data formats tend to be relatively well defined by a data schema and change slowly. In contrast, non-traditional data formats exhibit a dizzying rate of change. As new services are added, new sensors deployed, or new marketing campaigns executed, new data types are needed to capture the resultant information.
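
    To make the Volume figures above concrete, the following Python sketch works through the arithmetic. The rate (10TB per engine per 30 minutes) and the flight count (25,000 per day) come from the text. The number of engines per aircraft and the flight duration are illustrative assumptions rather than figures from this whitepaper, and decimal units (1 PB = 1,000 TB) are assumed.

```python
# Back-of-the-envelope estimate of daily jet-engine data volume.
TB_PER_ENGINE_PER_HALF_HOUR = 10   # from the text
FLIGHTS_PER_DAY = 25_000           # from the text
TB_PER_PB = 1_000                  # assumption: decimal units


def daily_volume_pb(engines_per_aircraft: int = 1, flight_hours: float = 0.5) -> float:
    """Return the estimated daily data volume in petabytes.

    engines_per_aircraft and flight_hours are hypothetical parameters;
    the defaults give a deliberately conservative lower bound.
    """
    half_hours = flight_hours / 0.5
    total_tb = (TB_PER_ENGINE_PER_HALF_HOUR * half_hours
                * engines_per_aircraft * FLIGHTS_PER_DAY)
    return total_tb / TB_PER_PB


if __name__ == "__main__":
    print(daily_volume_pb())        # one engine, one 30-minute flight per aircraft
    print(daily_volume_pb(2, 2.0))  # assumed: two engines, two-hour flights
```

    Even the conservative case (one engine and a single 30-minute flight per aircraft) comes to 250 PB per day, which is consistent with the claim that this one source runs into the petabytes.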

    The Importance of Big Data

    When big data is distilled and analyzed in combination with traditional enterprise data, organizations can develop a more thorough and insightful understanding of their business, which can lead to enhanced productivity, a stronger competitive position and greater innovation, all of which can have a significant impact on the bottom line. For example, retailers usually know who buys their products. Use of social media and web log files from their ecommerce sites can help them understand who didn't buy and why they chose not to, information not formerly available to them. This can enable much more effective micro customer segmentation and targeted marketing campaigns, as well as improve supply chain efficiencies through more accurate demand planning. Other common use cases include:

    Sentiment analysis

    Marketing campaign analysis

    Customer churn modeling

    Fraud detection

    Research and Development

    Risk Modeling

    And more

    2. The Internet of Things

    The Big Data imperative is compounded by the Internet of Things, which generates an enormous amount of additional data. The devices we use are getting smaller and smarter. They're connecting more easily, and they're showing up in every aspect of our lives. This new reality in technology, called the Internet of Things, is about collecting and managing the massive amounts of data from a rapidly growing network of devices and sensors, processing that data, and then sharing it with other connected things. It's the technology of the future, but you probably have it now: in the smart meter from your utility company, in the environmental controls and security systems in your home, in your activity wristband or in your car's self-monitoring capabilities.

    Gartner estimates the total economic value-add from the Internet of Things across industries will reach US$1.9 trillion worldwide in 2020 [2].

    For example, just a few years from now, your morning routine might be a little different thanks to Internet of Things technology. Your alarm goes off earlier than usual because your home smart hub has detected traffic conditions suggesting an unusually slow commute. The weather sensor warns of a continued high pollen count, so because of your allergies, you decide to wear your suit with the sensors that track air quality and alert you to allergens that could trigger an attack.

    You have time to check your messages at the kitchen e-screen. The test results from your recent medical checkup are in, and there's a message from your doctor that reiterates his recommendations for a healthier diet. You send this information on to your home smart hub. It automatically displays a chart comparing your results with those of the general population in your age range, and asks you to confirm the change to healthier options on your online grocery order. The e-screen on the refrigerator door suggests yogurt and fresh fruit for breakfast.

    Major Advances in Machine-to-Machine Interactions Mean Incredible Changes

    The general understanding of how things work on the internet follows a familiar pattern: humans connect through a browser to get the information they want or to perform an action.

    The Internet of Things changes that model. In the Internet of Things, things talk to things, and processes have two-way interconnectivity so they can interoperate both locally and globally. Decisions can be made according to predetermined rules, and the resulting actions happen automatically without the need for human intervention. These new interactions are driving tremendous opportunities for new services.
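
    The rule-driven model described above can be sketched in a few lines of Python. Everything in this example is hypothetical and only illustrates the pattern: the `Rule` type, the sensor names, the thresholds and the action names are assumptions made for this sketch and do not correspond to any Oracle or MySQL API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical schema: sensor name -> latest numeric value.
Reading = Dict[str, float]


@dataclass
class Rule:
    """A predetermined rule: when condition holds, action is triggered."""
    name: str
    condition: Callable[[Reading], bool]
    action: str


def evaluate(rules: List[Rule], reading: Reading) -> List[str]:
    """Return the actions whose conditions hold for this reading."""
    return [rule.action for rule in rules if rule.condition(reading)]


# Illustrative rules loosely based on the morning-routine scenario.
RULES = [
    Rule("slow_commute",
         lambda r: r.get("commute_minutes", 0) > 45,
         "move_alarm_earlier"),
    Rule("high_pollen",
         lambda r: r.get("pollen_index", 0) >= 8,
         "send_allergy_warning"),
]

if __name__ == "__main__":
    print(evaluate(RULES, {"commute_minutes": 60, "pollen_index": 3}))
```

    Keeping the rules as data, separate from the evaluation loop, means new devices or conditions can be added without changing the code that acts on them, and no human needs to be in the loop once the rules are defined.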

    [2] Peter Middleton, Peter Kjeldsen, and Jim Tully, Forecast: The Internet of Things, Worldwide, 2013 (G00259115), Gartner, Inc., November 18, 2013.

    The Value of Data

    Transforming data into valuable information is no small task. The variables and the risks are real and often uncharted; flexibility and time to market can mean the difference between failure and success. But, with the considerable potential of this developing market, some businesses are aggressively undertaking the challenges. These businesses, the ones planning now for this new technology, will be the ones to succeed and thrive. Oracle delivers an integrated, secure, comprehensive platform for the entire IoT architecture across all vertical markets. For more information on Oracle's Internet of Things platform, visit: http://www.oracle.com/us/solutions/internetofthings/overview/index.html

    We shall now consider the lifecycle of Big Data, and how to leverage the Hadoop platform to derive added value from data acquired in MySQL solutions.

    3. The Lifecycle of Big Da