Data, information and finally knowledge: what are they worth in the contemporary world? It seems that today a lack of them can be to a business what a disaster (an epidemic or a war) is to the world. Over time, business is becoming more driven by data-based decisions and is learning to unleash the power of data.


Big data and analytics are not a new concept, but they offer new ways of thinking and finding value. They can simply “change lives, transform industries, accelerate human potential, personalize products, inspire and improve collaboration” [10]. In other words, big data and analytics give companies the power to gain competitive advantage on the market and help make our lives better.

What do those catchwords mean? 

The term big data can be introduced as “massive data files that have a large, diverse and complex structure and are therefore difficult to store, analyze and visualize for other processes and results” [1], or described as “large-volume, high-speed and widely diverse information assets that require new forms of processing to allow improved decision making, greater insight into the discovery and optimization of processes” [2]. Such data is characterized by a high speed of generation, huge volume, a variety of types and a varying degree of veracity.

Big data is therefore tied both to data management approaches and to the technology used to implement them. There are two basic approaches: data warehouses and data lakes [3]. Pentaho CTO James Dixon describes a “data mart (a subset of a data warehouse) as akin to a bottle of water… cleansed, packaged and structured for easy consumption, while a data lake is more like a body of water in its natural state. Data flows from the streams (the source systems) to the lake. Users have access to the lake to examine, take samples or dive in” [3]. Currently, the idea of big data is mostly associated with the data lake approach, and data lakes have become synonymous with technologies like Hadoop [4].

Big data technologies comprise a range of open source software and tools (mostly provided by the Apache Software Foundation) as well as NoSQL databases, business intelligence tools, development tools and much more. Nevertheless, possessing the technology is not in itself a source of competitive advantage for companies on the market.

As already mentioned, big data can be depicted as the vast volumes and types of information that businesses can now collect and process using high-tech systems. It comes from both internal and external sources. Data can come from the broadly understood market and can be structured (e.g. market data from retail stores) or unstructured (e.g. Internet search results). It can be generated by both humans and machines. However, all data needs to be put into structured formats before it can be analyzed [5] with mathematics and statistics in order to derive meaning from it. This meaning is what drives business decisions. This process is called analytics.
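
The step from unstructured to structured data can be illustrated with a small Python sketch. It assumes a handful of made-up Internet search queries (`RAW_QUERIES`) and hypothetical helper names (`to_records`, `term_frequencies`); it is only meant to show the idea of turning free text into records that simple statistics can work on, not a production pipeline.

```python
import re
from collections import Counter

# Hypothetical unstructured input: free-text search queries.
RAW_QUERIES = [
    "Cheap running shoes near me",
    "running shoes SALE",
    "best trail shoes 2017",
]


def to_records(raw_queries):
    """Turn each free-text query into a structured record (a dict)."""
    records = []
    for query_id, text in enumerate(raw_queries, start=1):
        # Lower-case the text and keep only alphanumeric tokens.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        records.append(
            {"query_id": query_id, "tokens": tokens, "n_tokens": len(tokens)}
        )
    return records


def term_frequencies(records):
    """Count how often each term occurs across all structured records."""
    counts = Counter()
    for record in records:
        counts.update(record["tokens"])
    return counts


if __name__ == "__main__":
    records = to_records(RAW_QUERIES)
    for term, count in term_frequencies(records).most_common(3):
        print(term, count)
```

Once the queries are in this tabular form, the usual aggregation and statistical techniques can be applied to them.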

The term analytics “is most often used to describe the analysis of large volumes of data and/or high-velocity data, which presents unique computational and data-handling challenges” [6].

Of course, analytics is not used only for big data. It is a general term for the analysis of historical data, but because data volumes keep growing it is now mostly associated with big data. There are four kinds of analytics [7]:

  • Descriptive – “what has happened?” – tells you what happened in the past by means of data aggregation and data mining, but it won’t tell you why it happened or what might change. It describes what is going on within the company and summarizes different aspects of the business. Good examples are dashboards and scorecards.
  • Diagnostic – “why did something happen?” – drills down to find dependencies and patterns and to identify the root cause of a problem, e.g. why sales revenue is falling in a certain segment: drill sales down into subcategories to find out [8].
  • Predictive – “what could happen?” – models future outcomes based on forecasts and statistical models. It gives an understanding of the future with a certain level of confidence (unfortunately never 100%). It can be used to predict how customers will respond to a marketing campaign, how sales will be affected by certain market conditions, or how to shape next year’s sales plans by forecasting purchasing patterns. In finance, an example is producing a credit score.
  • Prescriptive – “what should we do?” – advises on the best action based on optimization and simulation techniques (e.g. A/B testing). It is an attempt to quantify future decisions in order to guide future business actions. It is used for optimizing production, scheduling inventory to improve customer satisfaction, offering the right discounts or showing the proper ads on the web. This is the most complex kind.
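
To make the four kinds of analytics more concrete, here is a minimal Python sketch on a tiny, made-up sales table. All names and figures (`SALES`, `OPTIONS`, the function names, the uplift and cost values) are illustrative assumptions and do not come from the cited sources. The predictive step uses a plain least-squares trend line as a stand-in for whatever statistical model a real project would choose.

```python
from collections import defaultdict

# Hypothetical monthly sales records: (month, segment, revenue).
SALES = [
    (1, "retail", 100.0), (1, "online", 50.0),
    (2, "retail", 90.0), (2, "online", 60.0),
    (3, "retail", 80.0), (3, "online", 70.0),
    (4, "retail", 60.0), (4, "online", 80.0),
]

# Hypothetical actions: name -> (assumed revenue uplift factor, assumed cost).
OPTIONS = {"no_discount": (1.0, 0.0), "discount_10": (1.15, 14.0)}


def descriptive(sales):
    """What has happened? Aggregate revenue per month."""
    totals = defaultdict(float)
    for month, _segment, revenue in sales:
        totals[month] += revenue
    return dict(totals)


def diagnostic(sales, first_month, last_month):
    """Why did it happen? Drill down to the revenue change per segment."""
    change = defaultdict(float)
    for month, segment, revenue in sales:
        if month == first_month:
            change[segment] -= revenue
        elif month == last_month:
            change[segment] += revenue
    return dict(change)


def predictive(monthly_totals, next_month):
    """What could happen? Extrapolate a least-squares linear trend."""
    xs = list(monthly_totals)
    ys = [monthly_totals[x] for x in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y + slope * (next_month - mean_x)


def prescriptive(forecast, options):
    """What should we do? Pick the option with the best expected net revenue."""
    def net(name):
        uplift, cost = options[name]
        return forecast * uplift - cost
    return max(options, key=net)


if __name__ == "__main__":
    totals = descriptive(SALES)
    print("Revenue per month:", totals)
    print("Change per segment, month 1 to 4:", diagnostic(SALES, 1, 4))
    forecast = predictive(totals, next_month=5)
    print("Forecast for month 5:", round(forecast, 1))
    print("Recommended action:", prescriptive(forecast, OPTIONS))
```

With this sample data the descriptive view shows a drop only in month 4, the drill-down attributes it to the retail segment, and the last two steps build on the monthly totals. Each kind of analytics reuses the output of the previous one, which is why the prescriptive kind is the most demanding.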

All the points above show that big data storage or processing alone does not bring business value. Analytics without data architecture, governance, validation and quality does not bring business value either. Only when you combine both concepts and support them with the proper technology do you gain the key competitive advantage for organizations today.

How can this really work in the business world?

Nowadays, the field of big data and analytics is still growing, driven by demand from the market. “Experts predict the amount of data generated annually to increase 4300% by 2020 and enterprises are responsible for storing 80% of this data” [9]. Every day, 2.5 quintillion bytes of information are generated by people using mobile devices or bank cards; what’s more, over 90% of the data in the world has been generated in the last 2 years alone [10].

Undoubtedly, to gain competitive advantage from this concept companies need systems and technical solutions to manage data. Along with that, they require people with the proper skill sets, both technical and analytical. Those with technical skills, i.e. developers with a strong programming background, build big data applications using the available technology: Python, Java, JavaScript, machine learning, shell scripting, data visualization tools [11] etc. Those capable of manipulating data queries and translating data into valuable information, i.e. data scientists, big data specialists and data analysts [12], are indispensable in this model as well. Other important factors in using big data and analytics effectively are the accessibility of clean, unique data for the whole company, leaders who promote a data and analytics culture and, last but not least, targets for the key business areas that can benefit most from the approach. Together, these factors drive companies towards the best business decisions.

Research shows that most enterprises use the concept to decrease expenses, find avenues for innovation and launch new products and services (Diagram 1).

Diagram 1. How Fortune 1000 Executives Report Using Big Data
Source: R. Bean, How companies say they are using big data, HBR, 2017,
https://hbr.org/2017/04/how-companies-say-theyre-using-big-data 

In his article “How companies say they are using big data”, Randy Bean emphasizes that big data is already being used to improve operational efficiency and the ability to make informed decisions based on the very latest, up-to-the-moment information. The next phase will be to use data for new products and other innovations. About half of the executives surveyed predict major disruption on the horizon as big data continues to change how businesses operate and compete [13].

Sceptics claim that this is just another way to spend money on technology, but the idea of elevating big data and analytics is to put them at the center of what a company does and to use them fully. Take advantage of the data, find the opportunities to do so, find the best tools for it: this is rapidly becoming the mainstream norm.

Disruptive innovation projects’ characteristics 

Hadoop was first used in 2004 and was originally thought not to need project management because it was so easy to get it up and running. Today, some clients process petabytes of data per day globally and transactions reach terabytes per second. Implementations in the field of big data analytics can involve 50-100 engineers, and that covers only the technical part of the project [14]. What kind of projects can you expect in the big data world? Are they going to be IT-related initiatives only? No: big data and analytics projects should be business decisions, not just IT concepts. They have a technological aspect, but they are implemented to unleash the potential of the business.

The two main types of big data and analytics projects are [15]:

  1. software development – big data platform or big data analytics,
  2. big data hardware infrastructure projects. 

Within those types there is a vast variety of initiatives. They can be holistic, when a company wants to implement the whole concept, or touch only certain parts of it, if the organization already has a big data and analytics culture instilled. Table 1 and Diagram 2 give examples of such projects and show the dimensions of big data and analytics solutions. When considering big data and analytics projects, it’s worth mentioning that “data analytics processes and technologies can be treated as disruptive innovation on the market, creating new value network and market or disrupting an existing market and value network by displacing the leading, highly established alliances, products and firms” [16]. Projects in such areas require technical expertise as well as business sense from the Project Manager; both are essential competencies. Expert judgement in working with the technologies and the ability to scale effort between activities within the project are also key.

Table 1. Big Data and Analytics Projects Examples
Diagram 2. Big Data Analytics Solutions Dimensions
Source: T. Crawford, Big data Analytics Solutions Dimensions – Big Data Analytics Project Management: Methodologies, Caveats and Considerations, PMI Presentation

Tiffany Crawford and Ray ZhanSu, considering project management in the big data and analytics area, point out certain characteristics of such projects [14, 15]. They have bumpy lifecycles: highly iterative, exploratory and time-consuming in terms of testing and the additional support required. A typical lifecycle can be described by the phases Initiate, Plan, Execute and Close, each of which is shorthand for a series of smaller cycles. What’s more, those projects use many more tools and techniques, e.g. use models, Proofs of Concept and pilots, to minimize risk and explore different options. They usually have a bigger scope than standard projects, and it is hard to keep them under control, maintain awareness and efficient communication, and plan at the proper level of detail. Moreover, stakeholders need to be educated so that they have the proper expectations regarding changeability and flexibility.

In terms of needs, such projects require collaboration over negotiation, facilitation and strong governance, while on the other hand the project management processes must be lightweight and facilitative. We need to bear in mind that those initiatives bring new knowledge, which then becomes part of the organizational process assets. On top of that, risk management must be more extensive and based on big data analytics expertise. That’s why the qualifications of the project team are key: highly qualified people are required, which is still a challenge nowadays. These projects also need to consider multiple QA processes covering the quality of software code, data and modelling, which adds more complexity.

The methodology most commonly used in big data and analytics initiatives is Agile, especially because of the iterative discovery process. Hybrids are also possible. Some companies are creating their own methodologies to support such projects and to partner effectively with the business, e.g. the Steps Data Analytics Methodology, based on 6 steps: Discovery, Design, Construct, Test, Deploy, Run/Maintain [17].

It is believed that the keys to project management in big data and analytics initiatives are a robust business case and a proof of concept. Big data projects typically require a lot of effort to be spent on data extraction, cleansing and integration. At the same time, the data volumes involved are inherently large, and therefore it may be best to define a Proof of Concept (POC) project as a first step. The POC needs to be clearly communicated to all stakeholders as starting with only best-guess estimates of time, cost and the probability of success. A proof of concept project executed over a few agile iterations will help uncover many unknowns, not only in terms of project outcomes, but also in terms of a clearer understanding of what processes work better, how much effort they need, and so on [15].

Building a POC should involve 3 main steps:

  1. lab setup – setup of POC environment (Big data labs),
  2. use case execution – building of use case application (value accelerators),
  3. value demonstration – demonstration of value accelerators included in the business case.

Once a go decision is granted, the implementation phase and the deployment of the solutions can be kicked off.

Steps to project success

Various sources report that 65-100% of big data analytics projects fail because they are incomplete, behind schedule or over budget, the latter mostly because of POC and pilot estimates [14].

The IKANOW organization points out 8 proven steps for starting a big data analytics project that are worth considering in order to be successful [18]:

  1. Problem – define what issues you would like to address and solve within the organization.
  2. Impact of the problems – understand how the issues you’d like to address influence your business, then create use cases.
  3. Success criteria – define the metrics to be used to measure the success of the project.
  4. Value & impact – think about the impact that eliminating the problem will have on your business. This should result in a proper business case definition and budget estimate.
  5. Cloud, on-premise or hybrid solution – decide where the solution should live.
  6. Data requirements – assess what data you need, where you will find it etc.
  7. Identify gaps – validate capabilities and capacity: do you need vendors, do you have staff with the required knowledge and skill set, do you have the hardware and software to start with?
  8. Agile or iterative approach – start with a pilot, set goals and milestones, break the work up into manageable work packages, evaluate the value of the pilot and move to production.

To build on those rules, it is worth mentioning that it’s better to start small, i.e. to break big data analytics down into smaller components. Then build on what has been done and grow layer by layer, since it’s not possible to know everything in advance in this kind of project. This approach reduces risk and makes it possible to see value sooner through iterative successes. Jonathan Buckley, writing about best practices for big data project management, adds that there is an immediate need to begin with clear objectives: organizations must ask the question “why do we want to analyze our data in the first place?”. Moreover, it’s important to remember that it is the business that drives projects, not IT, which makes buy-in from all business stakeholders indispensable [19].

One more aspect of successful big data and analytics projects is change management. It is especially important for projects that are prescriptive in nature. Analytics initiatives often involve automated systems that tell customer-facing or front-line workers the optimal path or decision to make. Applications that involve changes to workflow directly affect business processes and the people who perform them. Therefore, change management is fundamental; otherwise the analytics effort risks creating unhelpful disruption or wasting time [20].

Summary

Big data and analytics are still-growing concepts that support businesses in making decisions. They lead to optimization, growth and innovation on the market. Nevertheless, it seems that big data is a disruptive force, which means that people require more than new skills, technologies and tools. An open-minded approach, processes and a transformation mind-set are also indispensable. This force, in conjunction with analytics and people’s new skills, can be very powerful in the future, but it still seems to be in a phase of growth [21].

Following Rob Phelps’ conclusions, the way big data projects are organized and managed is as important as the technology and data sets involved. “Big Data projects however cannot be fully successful if constrained by traditional methods of implementing IT systems. Big Data projects do not have full answers when they start but focus on finding them through evidence-based discovery. This discovery is best accomplished when viewed with overall organizational goals and aspired outcomes in mind that an evaluation logical model can provide” [20].

All in all, companies are still learning how to harness the power of big data, how to manage projects that implement disruptive innovations and how to build big data strategies that proactively meet the needs of the market.


Sources: 

  1. S. Sagiroglu; D. Sinanc, Big data: A review. In: Collaboration Technologies and Systems (CTS), International Conference on. IEEE, 2013. p. 42-47
  2. MA Beyer; D. Laney, The importance of ‘big data’: a definition, Stamford, CT: Gartner, 2012
  3. C. Campbell, Top five differences between data lakes and data warehouses, 2015 https://www.blue-granite.com/blog/bid/402596/top-five-differences-between-data-lakes-and-data-warehouses
  4. Techopedia.com, Datalake, https://www.techopedia.com/definition/30172/data-lake
  5. Big Data Analytics – Big Data Project Management Career – HBR, https://www.youtube.com/watch?v=WngnAyhQ7bc&feature=youtu.be
  6. Informatica.com, Data analytics, https://www.informatica.com/services-and-training/glossary-of-terms/data-analytics-definition.html#fbid=QzgZ8PlHM0K
  7. Halobi.com, Descriptive, Predictive and prescriptive analytics explained, https://halobi.com/blog/descriptive-predictive-and-prescriptive-analytics-explained/
  8. ScienceSoft.com, 4 types of data analytics to improve decision-making, https://www.scnsoft.com/blog/4-types-of-data-analytics
  9. Seagate.com, Big Data Universe Beginning to Explode, CSC, 2012. https://www.seagate.com/pl/pl/tech-insights/big-data-analytics-master-ti/
  10. What is Big Data Analytics: https://www.youtube.com/watch?v=aeHqYLgZP84 
  11. N. Rahman, Big Data Analytics for a Sustained Competitive Advantage, Students Research Symposium, 2017 https://pdxscholar.library.pdx.edu/cgi/viewcontent.cgi?article=1116&context=studentsymposium
  12. A. Monnappa, Data Science vs. Big Data vs. Data Analytics https://www.simplilearn.com/data-science-vs-big-data-vs-data-analytics-article
  13. R. Bean, How companies say they are using big data, HBR, 2017 https://hbr.org/2017/04/how-companies-say-theyre-using-big-data
  14. T. Crawford, Big data Analytics Solutions Dimensions – Big Data Analytics Project Management: Methodologies, Caveats and Considerations, PMI Presentation
  15. R. ZhanSu, Big Data Project Management, http://www.scienteconsulting.com/blogs/803/project-management-methodologies-big-data-analytics/
  16. CM Christensen – https://en.wikipedia.org/wiki/Disruptive_innovation#cite_note-2
  17. www.siwel.com
  18. IKANOW.com, 8 Proven steps to starting a big data analytics project, http://www.ikanow.com/8-proven-steps-to-starting-a-big-data-analytics-project/
  19. J. Buckley, 5 Best Practices for Big Data Project Management, https://www.qubole.com/blog/big-data-project-management/
  20. R. Phelps, Tools for Managing and Measuring the Value of Big Data Projects, 2014, https://analytics.ncsu.edu/sesug/2014/PSA-07.pdf 
  21. S. Ravindra, Big Data’s potential for disruptive innovation, July 2017, http://dataconomy.com/2017/07/big-data-disruptive-innovation/