BIG DATA IMPLEMENTATION

To get the most business value from big data, it needs to be integrated
into your business processes. How can you take action based on your
analysis of big data unless you can understand the results in context with
your operational data? Differentiating your company as a result of making
good business decisions depends on many factors. One factor that is
becoming increasingly important is your capability to integrate internal
and external data sources composed of both traditional relational data and
newer forms of unstructured data. While this may seem like a daunting task,
the reality is that you probably already have a lot of experience with data
integration. Don’t toss aside everything you have learned about delivering
data as a trusted source to your organization. You will want to place a high
priority on data quality as you move to make big data analytics actionable.
However, to bring your big data environments and enterprise data environments
together, you will need to incorporate new methods of integration that
support Hadoop and other nontraditional big data environments.
Two major categories of big data integration are covered in this chapter: the
integration of multiple big data sources in big data environments and the
integration of unstructured big data sources with structured enterprise data.
We cover the traditional forms of integration such as extract, transform, and
load (ETL) and new solutions designed for big data platforms.
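To make the ETL pattern concrete, here is a minimal sketch in Python that extracts customer records from a CSV export, applies a simple cleansing transform, and loads the result into a local SQLite table. The file name, column names, and cleansing rules are hypothetical and stand in for whatever your own sources and targets require.

# Minimal ETL sketch: extract rows from a CSV export, cleanse them,
# and load them into SQLite. Names and rules below are hypothetical.
import csv
import sqlite3

def extract(path):
    # Extract: read raw rows from the source file.
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(rows):
    # Transform: normalize fields and drop rows with no customer id.
    for row in rows:
        if not row.get("customer_id"):
            continue
        yield (row["customer_id"].strip(),
               row.get("email", "").strip().lower())

def load(records, db_path="warehouse.db"):
    # Load: write the cleansed records into the target table.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS customers "
                 "(customer_id TEXT, email TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", records)
    conn.commit()
    conn.close()

if __name__ == "__main__":
    load(transform(extract("customers.csv")))

The same three-step shape carries over to big data platforms; what changes is the scale of the sources and the engines that run each step.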
Identifying the Data You Need
Before you can begin to plan for integration of your big data, you need to take stock of the type of data you are dealing with. Many organizations are
recognizing that a lot of internally generated data has not been used to its full
potential in the past. By leveraging new tools, organizations are gaining new
insight from previously untapped sources of unstructured data in e-mails,
customer service records, sensor data, and security logs. In addition, much
interest exists in looking for new insight based on analysis of data that is
primarily external to the organization, such as social media, mobile phone
location, traffic, and weather.
Your analysis may require that you bring several of these big data sources
together. To complete your analysis, you need to move large amounts of data
from log files, Twitter feeds, RFID tags, and weather data feeds and integrate
all these elements across highly distributed data systems. After you complete
your analysis, you may need to integrate your big data with your operational
data. For example, healthcare researchers explore unstructured information
from patient records in combination with traditional patient data, such as
test results, to improve both patient care and quality of care. Big data
sources such as information from medical devices and clinical trials may be
incorporated into the analysis as well.
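As a minimal sketch of that kind of integration, assume an earlier text-analysis step has already reduced unstructured clinical notes to per-patient flags; the Python code below then joins those flags with structured lab results on a shared patient identifier using pandas. The column names and sample values are illustrative only.

# Sketch of joining results derived from unstructured notes with
# structured operational records. All names are hypothetical.
import pandas as pd

# Flags extracted upstream, e.g., by text analysis of clinical notes.
note_flags = pd.DataFrame({
    "patient_id": [101, 102, 103],
    "mentions_chest_pain": [True, False, True],
})

# Structured operational data, e.g., lab test results.
lab_results = pd.DataFrame({
    "patient_id": [101, 102, 104],
    "troponin_level": [0.8, 0.02, 0.05],
})

# Integrate the two views on the shared patient identifier.
combined = note_flags.merge(lab_results, on="patient_id", how="outer")
print(combined)

An outer join is used here so that patients who appear in only one of the two sources are still carried forward for review rather than silently dropped.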
As you begin your big data analysis, you probably do not know exactly what
you will find. Your analysis will go through several stages. You may begin
with petabytes of data, and as you look for patterns, you may narrow your
results. The following three stages are described in more detail:
✓ Exploratory stage
✓ Codifying stage
✓ Integration and incorporation stage
Exploratory stage
In the early stages of your analysis, you will want to search for patterns in the
data. It is only by examining very large volumes (terabytes and petabytes)
of data that new and unexpected relationships and correlations among elements
may become apparent. These patterns can provide insight into customer
preferences for a new product, for example. You will need a platform
such as Hadoop for organizing your big data to look for these patterns.
As described in Chapters 9 and 10, Hadoop is widely used as an underlying
building block for capturing and processing big data. Hadoop is designed
with capabilities that speed the processing of big data and make it possible
to identify patterns in huge amounts of data in a relatively short time. The
two primary components of Hadoop — Hadoop Distributed File System
(HDFS) and MapReduce — are used to manage and process your big data.
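The following minimal sketch shows the MapReduce pattern in the style of Hadoop Streaming, where the mapper and reducer each read lines from standard input and emit tab-separated key/value pairs, and Hadoop handles distributing, sorting, and shuffling the data between the two phases. The log format and search terms are hypothetical; the point is only to show how a pattern-counting job splits into a map step and a reduce step.

# MapReduce sketch in the Hadoop Streaming style: mapper and reducer
# read stdin and write tab-separated key/value pairs to stdout.
# The search terms and log format are hypothetical.
import sys
from itertools import groupby

TERMS = ("error", "timeout", "checkout")  # illustrative patterns

def mapper():
    # Emit (term, 1) for every input line that mentions the term.
    for line in sys.stdin:
        lowered = line.lower()
        for term in TERMS:
            if term in lowered:
                print(f"{term}\t1")

def reducer():
    # Input arrives sorted by key, so equal keys are adjacent;
    # sum the counts for each key and emit the total.
    pairs = (line.rstrip("\n").split("\t", 1) for line in sys.stdin)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        print(f"{key}\t{total}")

if __name__ == "__main__":
    # Local test of the same logic, without a cluster:
    #   cat access.log | python count_terms.py map | sort \
    #                  | python count_terms.py reduce
    if sys.argv[1:] == ["map"]:
        mapper()
    else:
        reducer()

On a real cluster, HDFS holds the input files and Hadoop runs many copies of the mapper in parallel, one per block of data, before feeding the sorted intermediate pairs to the reducers.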
