
Ten Big Data Best Practices

While we are at an early stage in the evolution of big data, it is never too early to get started with good practices so that you can leverage what you are learning and the experience you are gaining. As with every important emerging technology, it is important to understand why you need to leverage the technology and have a concrete plan in place. In this chapter, we provide you with the top-ten best practices you need to understand as you begin the journey to manage big data.

Understand Your Goals
Many organizations start their big data journey by experimenting with a single project that might provide some concrete benefit. By selecting a project, you have the freedom to test without risking capital expenditures. However, if all you end up doing is a series of one-off projects, you will likely not have a good plan in place when you begin to understand the value of leveraging big data in the company. Therefore, after you conclude some experiments and have a good initial understanding of what might be possible, you need to set some short- and long-term goals. What do you hope to accomplish with big data? Could parts of your business be more profitable with the infusion of more data to predict customer behavior or buying patterns? It is important for IT and the business units to collaborate on well-defined goals.

After you understand the goals you have for leveraging big data, your work is just beginning. You now need to get to the meat of the issues and involve all the stakeholders in the business. Big data affects every aspect of your organization, including the historical data you already store, the information sources managed by different business units, and new data sources in some business areas that few managers are even aware of. Getting a task force together is a great way to bring representatives of the business together so that they can see how their data management issues are related. The task force can evolve into a team that helps various business units adopt best practices, and it should include upper-management leaders who set business strategy and direction.

Establish a Road Map
At this stage, you have experimented with big data and determined your company’s goals and objectives. You have a good understanding of what upper management and business units need to accomplish. It is time to establish a road map. Your road map is your action plan. You clearly can’t do all the projects and meet all the demands from your company simultaneously. Your road map needs to begin with the set of foundational services that can help your company get started. Part of your road map should include the existing data services. Make sure that your road map has benchmarks that are reasonable and achievable. If you take on too much, you will not be able to demonstrate to management that you are executing well. Therefore, you don’t need a ten-year road map. Begin with a one- to two-year road map with well-defined goals and outcomes. You should include both business and technical goals as part of the road map.

Discover Your Data
No company ever complains that it has too little data. In reality, companies are swimming in data. The problem is that companies often don’t know how to use that data pragmatically to be able to predict the future, execute on important business processes, or simply gain new insights. The goal of your big data strategy and plan should be to find a way to leverage data for more predictable business outcomes. But you need to walk before you run. We recommend that you start by embarking on a discovery process. You need to get a handle on what data you already have, where it is, who owns and controls it, and how it is currently used. What are the third-party data sources that your company relies on? This process will give you a lot of insights.

For example, it will let you know how many data sources you have and how much overlap exists. This process will also help you to understand the gaps in knowledge about those sources. You might discover that lots of duplicate data exists in one area of the business and almost no data exists in another area. You might discover that you are dependent on third-party data that isn’t as accurate as it should be. Spend the time you need to do this discovery process because it will be the foundation for your planning and execution of your big data strategy.
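The discovery pass described above can be partly automated. The sketch below is a hypothetical illustration, with all source names and record keys invented: given a catalog of data sources, it counts duplicate records within each source and measures the overlap between every pair of sources.

```python
# Hypothetical data-discovery pass: profile a catalog of sources for
# internal duplicates and pairwise overlap. Source names and customer
# IDs are invented for illustration.

from collections import Counter

def profile_sources(sources):
    """Return per-source duplicate counts and pairwise overlap counts.

    `sources` maps a source name to a list of record keys
    (for example, customer IDs).
    """
    report = {"duplicates": {}, "overlap": {}}
    for name, records in sources.items():
        counts = Counter(records)
        # A key appearing n times contributes n-1 duplicates.
        report["duplicates"][name] = sum(c - 1 for c in counts.values())
    names = sorted(sources)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = set(sources[a]) & set(sources[b])
            report["overlap"][(a, b)] = len(shared)
    return report

# Example catalog: two CRM extracts and a billing feed (invented data).
catalog = {
    "crm_east": ["C001", "C002", "C002", "C003"],
    "crm_west": ["C003", "C004"],
    "billing":  ["C001", "C004", "C005"],
}
report = profile_sources(catalog)
```

A real inventory would also record ownership and update cadence for each source, but even this simple overlap report surfaces the duplicate-heavy and data-poor areas the text mentions.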

Figure Out What Data You Don’t Have
Now that you have discovered what data you have, it is time to think about what is missing. Take advantage of the task force you have set up. Business leaders are your best source of information. These leaders will understand better than anyone else what is keeping them from making even better decisions.

When you start this process of determining what you need and what is missing, it is good to encourage people to think out of the box. For example, you might want to ask something like this: “If you could have any information at any speed to support the business, and cost were no issue, what would you want?” This doesn’t mean that cost isn’t an issue; rather, you are looking for management to imagine what could really change the business. With the innovation happening in the data space, some of these wild ideas and hopes are actually possible.

Understand the Technology Options
At this point, you understand your company’s goals, you have an understanding of what data you have, and you know what data is missing. But how do you take action to execute your strategy? You have to know what technologies are available and how they might help your company produce better outcomes.
Therefore, do your homework. Begin to understand the value of technologies such as Hadoop, streaming data offerings, and complex event-processing products. You should look at different types of databases, such as in-memory databases, spatial databases, and so on. You should get familiar with the tools and techniques that are emerging as part of the big data ecosystem. It is important that your team has enough of an understanding of the available technology to make well-informed choices.
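To make one of these technologies less abstract: the processing model behind Hadoop is map/reduce, and its essence can be sketched in a few lines of ordinary Python, with no cluster involved. This is a toy illustration, not how you would run Hadoop in practice: a map step emits key/value pairs, a shuffle groups them by key, and a reduce step aggregates each group.

```python
# Toy sketch of the map/reduce model underlying Hadoop, run locally.
# map_step emits (key, value) pairs, the sort acts as the shuffle,
# and reduce_step aggregates each key's values.

from itertools import groupby
from operator import itemgetter

def map_step(line):
    # Emit (word, 1) for every word in the input line.
    return [(word, 1) for word in line.lower().split()]

def reduce_step(key, values):
    return (key, sum(values))

def map_reduce(lines):
    mapped = [pair for line in lines for pair in map_step(line)]
    mapped.sort(key=itemgetter(0))                      # shuffle/sort phase
    return dict(
        reduce_step(key, [v for _, v in group])
        for key, group in groupby(mapped, key=itemgetter(0))
    )

counts = map_reduce(["big data big plans", "data first"])
```

The point of the real framework is that the map and reduce functions stay this simple while the platform distributes them across many machines and very large inputs.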

Plan for Security in Context with Big Data
While companies always list data security as one of the most important issues they need to manage, they are often unprepared for the complexities involved in managing data that is highly distributed and highly complex. In the early stages of big data analytics, analysts typically do not secure the data, because only a small portion of it will be saved for further analysis. However, once an analyst selects a subset of data to bring into the company, that data has to be secured against both internal and external risk.

Some of this data will have private information that must be masked so that no one without authorization has access. For security to be effective in the context of big data, you need to have a well-defined plan.
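One common masking approach is pseudonymization: replacing direct identifiers with a keyed one-way hash so that analysts can still join records without seeing the raw values. The sketch below is a minimal illustration under invented assumptions; the field names and the key are placeholders, and a real deployment would need proper key management and a policy defining which fields are sensitive.

```python
# Minimal pseudonymization sketch: sensitive fields are replaced with a
# keyed HMAC digest so the same input always maps to the same stable
# pseudonym. The key and field list below are invented placeholders.

import hashlib
import hmac

SECRET_KEY = b"rotate-me-outside-source-control"   # placeholder key
SENSITIVE_FIELDS = {"email", "ssn"}                # assumed policy

def mask_record(record):
    masked = {}
    for field, value in record.items():
        if field in SENSITIVE_FIELDS:
            digest = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256)
            masked[field] = digest.hexdigest()[:16]  # stable pseudonym
        else:
            masked[field] = value                    # pass through as-is
    return masked

row = {"email": "pat@example.com", "ssn": "000-00-0000", "region": "east"}
safe = mask_record(row)
```

Because the same input always yields the same pseudonym, masked records from different sources can still be joined on the hashed field, which is exactly the property an analyst needs.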

Plan a Data Governance Strategy
Information governance is the ability to create an information resource that can be trusted by employees, partners, and customers. A governance strategy is the joint responsibility of IT and the business. It is key that concrete rules exist that dictate how big data will be governed. For example, rules determine how data must be protected depending on the circumstances and on governmental requirements: healthcare data must be stored so that identities and personal data remain private, and financial markets have their own set of data governance requirements that must be adhered to. Problems can develop when an analyst collects and analyzes huge volumes of information but does not remember to implement the right governance to protect that data. In addition, data sources themselves may be proprietary; when these sources are used within an organization, restrictions may exist on how much data can be used and for what purposes. Accountability for managing data in the right way is at the heart of a good data governance strategy.

Plan for Data Stewardship
It is easy to fall into the trap of assuming that the results of data analytics are correct. Management likes numbers and likes to make decisions based on what the numbers say. But hazards can occur if the data isn’t managed in the right way. For example, a company determining which customers are potentially the best targets for a new product offering might want to analyze 10 or 15 different sources of data to come up with the results.
Do you have common metadata across these data sources? If not, is a process in place to vet each source and make sure that it is accurate and usable? Using data sources that are based on different metadata and different assumptions can send a company off in the wrong direction. So be careful, and make sure that the data you collect can be used in a way that helps the company make the most informed and accurate decisions. This also means understanding how to integrate these new data sources with historical data systems, such as the data warehouse.
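The metadata vetting step described above can be made concrete as a schema check run before any source is combined with the warehouse. The sketch below is hypothetical, with all field names, types, and units invented: each candidate source is compared against the schema the analysis expects, and unit mismatches of the kind that silently skew results are flagged.

```python
# Hedged sketch of a metadata vetting step: before combining sources,
# verify that each one exposes the expected fields with the expected
# types and units. All schema contents are invented for illustration.

EXPECTED_SCHEMA = {
    "customer_id": {"type": "str"},
    "revenue":     {"type": "float", "unit": "USD"},
}

def vet_source(name, schema):
    """Return a list of human-readable problems; an empty list means usable."""
    problems = []
    for field, spec in EXPECTED_SCHEMA.items():
        if field not in schema:
            problems.append(f"{name}: missing field '{field}'")
            continue
        for key, want in spec.items():
            got = schema[field].get(key)
            if got != want:
                problems.append(
                    f"{name}: field '{field}' has {key}={got!r}, expected {want!r}"
                )
    return problems

# A warehouse-conformant source and a partner feed reporting EUR, not USD.
warehouse = {"customer_id": {"type": "str"},
             "revenue": {"type": "float", "unit": "USD"}}
partner   = {"customer_id": {"type": "str"},
             "revenue": {"type": "float", "unit": "EUR"}}
issues = vet_source("partner_feed", partner)
```

Catching a currency mismatch at ingestion time is far cheaper than discovering it after a targeting decision has already been made on blended numbers.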

Continually Test Your Assumptions
You will begin to find that making use of new data sources and massive amounts of data that could never be processed in the past can help make your company much better at anticipating the future. You will be able to determine the best actions to take in near real time based on what your data tells you about a customer or a decision you need to make.
Even if you have all the processes in place to ensure that you have the right controls and the right metadata defined, it is still important to test continuously. What types of outcomes are you getting from your analysis? Do the results seem accurate? If you are getting results that seem hard to believe, it is important to evaluate how those outcomes were produced.
With more accurate data, you will achieve better outcomes. However, in some cases, you may see a problem that wasn’t apparent before. Therefore, don’t just assume that the data is always right. Test your assumptions and what you know about your business.
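Continuous testing of outcomes can start as something very simple: compare each reported metric against a plausible range and flag anything that falls outside it. The sketch below is a minimal illustration; the metric names and thresholds are invented, and in practice the plausible ranges come from the business itself.

```python
# Minimal sanity-check sketch for analysis output: flag any metric that
# falls outside its plausible range rather than trusting it because it
# is a number. Metric names and bounds are invented for illustration.

def check_outcomes(metrics, bounds):
    """Return alerts for metrics outside their plausible (low, high) range."""
    alerts = []
    for name, value in metrics.items():
        low, high = bounds.get(name, (float("-inf"), float("inf")))
        if not low <= value <= high:
            alerts.append(
                f"{name}={value} outside plausible range [{low}, {high}]"
            )
    return alerts

# Example run: a conversion rate above 100% should never survive review.
metrics = {"conversion_rate": 1.4, "avg_order_value": 62.0}
bounds  = {"conversion_rate": (0.0, 1.0), "avg_order_value": (5.0, 500.0)}
alerts = check_outcomes(metrics, bounds)
```

Checks like this won't prove a result is right, but they catch the hard-to-believe results the text warns about before management makes decisions based on them.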

Study Best Practices and Leverage Patterns
As the big data market matures, companies will gain more experience with best practices, or techniques that are successful in getting the right results. You can access best practices in several different ways. You can meet with peers who are investigating ways to leverage big data to gain business results. You can also look to vendors and systems integrators who have codified best practices into patterns that are available to customers. It is always better to learn from others than to repeat a mistake that someone else has already made and learned from. As the market matures further, you will be able to leverage many more codified best practices to make your strategy and execution plan more successful.
