What's The Word?!
What on earth are Big Data, a Data Lake and several other mind-boggling tech terms you’ll want to be familiar with?
Spreadsheets are giving way to technology designed to deal with the mountains of data (giga-, tera- and petabytes) that companies generate and access to gain deeper insight into their businesses and to remain competitive.
Whether you’re a seasoned IT professional or just now beginning to consider more advanced reporting tools, you’ll want to become familiar with the lingo of the day. I say “of the day” because new words and phrases are being added at a record pace. Just look at Merriam-Webster: heck, they added over 1,000 new words in 2016 alone. The tech space is no different. Did you know that, not that long ago, the definition of computer was ‘person who does computations’?
We’ve come a long way since then, but the challenge remains the same: understanding what the buzzwords du jour actually mean, because many tech companies use the terminology rather loosely when describing what they’re delivering.
Let’s start with the ever-popular, overused and often misused moniker Business Intelligence (BI) and its definition on Wikipedia:
Business Intelligence
Business Intelligence (BI) is the set of strategies, processes, applications, data, products, technologies and technical architectures which are used to support the collection, analysis, presentation and dissemination of business information. BI technologies provide historical, current and predictive views of business operations.
The key words in that definition, and the ones that are essential when considering your next advanced reporting solution, are collection, analysis, presentation and dissemination of business information. I’d go a step further and say collection and analysis of internal and external business information. Here at Mirus we expand on this definition: to us, a true BI solution is one that comes with its own data warehouse, uses a dimensional model, has a robust Extraction, Transformation and Loading (ETL) process, and includes a set of On-Line Analytic Processing (OLAP) tools.
Now that we’ve covered the overarching term BI, let’s take a look at several other terms (some identified in our BI description above) that go along in support of Business Intelligence:
Data is a set of values consisting of qualitative or quantitative variables. Data is collected and analyzed; data only becomes information suitable for making decisions once it has been analyzed in some fashion.
A database is a collection of data arranged for convenient and quick search and retrieval by applications and analytics software.
A data mart is a small data repository that is focused on information for a specific subject area of the company, such as Sales, Finance, or Marketing.
A data warehouse is a repository that deals with multiple subject areas (or data marts).
A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. Unlike a data warehouse, where the data structure is defined during the Extract, Transform and Load (ETL) process, the data structure and requirements of a data lake are not defined until the data is needed (extracted / read).
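To make the "structure is defined when the data is read" idea concrete, here is a minimal Python sketch. The records, field names and the `read_sales` helper are all made up for illustration; a real data lake would sit on file or object storage rather than in a Python list.

```python
import json

# Raw events as they might land in a data lake: stored untouched, in native format.
# (Hypothetical records; the field names are illustrative assumptions.)
RAW_LAKE = [
    '{"store": "Dallas-01", "item": "Burger", "amount": "5.99"}',
    '{"store": "Austin-07", "item": "Fries", "amount": "2.49", "coupon": "SPRING"}',
]

def read_sales(raw_lines):
    """Apply a structure only at read time (often called schema-on-read)."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        rows.append({
            "store": record["store"],
            "item": record["item"],
            "amount": float(record["amount"]),  # the type is decided here, on read
        })
    return rows

sales = read_sales(RAW_LAKE)
print(sales)
```

Notice that the extra "coupon" field stays in the raw data but is simply ignored by this particular reader; a different report could define its own structure over the same raw records.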
Extract, Transform and Load (ETL) is the means by which data is pulled from a data source (extracted), cleaned and reshaped (transformed), and written (loaded) into the destination database.
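The three ETL steps can be sketched in a few lines of Python using only the standard library. This is a toy illustration, not how any particular product does it: the CSV text, the `sales` table and the clean-up rules are assumptions chosen for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this might be a point-of-sale export.
SOURCE_CSV = """store,item,amount
dallas-01,Burger, 5.99
austin-07,Fries,2.49
"""

def extract(text):
    """Extract: pull rows from the data source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: clean up values and convert types."""
    return [
        (row["store"].strip().upper(), row["item"].strip(), float(row["amount"]))
        for row in rows
    ]

def load(rows, conn):
    """Load: write the cleaned rows into the destination database."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (store TEXT, item TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database standing in for the destination
load(transform(extract(SOURCE_CSV)), conn)
loaded = conn.execute("SELECT store, item, amount FROM sales ORDER BY store").fetchall()
print(loaded)
```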
Online analytical processing (OLAP) is a high-level concept that describes a category of tools that aid in the analysis of multi-dimensional queries.
OLAP came about because of the tremendous complexity and sheer growth of business data: the volume and variety of information became too heavy for adequate analysis through simple Structured Query Language (SQL) queries.
A fact table in an OLAP database is essentially the subject of the analysis. The dimensions can be thought of as foreign keys to this table or attributes of the fact. For instance, you might have Sales as your fact table or subject. Attributes of a Sale could be things like date, time, customer, menu item, store name or number, geographical location, cashier, waitperson, etc. All these characteristics of the sale could be your dimension tables.
A schema is the structure that defines the organization of data in a database.
With a star schema for multidimensional modeling, the fact table sits in the middle of the diagram and all the dimensions hang directly off the fact table, so the diagram resembles a star. A star schema provides fast retrieval and makes it easier to associate data across business domains.
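Here is one way a small star schema for restaurant sales might look, sketched with Python's built-in `sqlite3` module. The table and column names (`fact_sales`, `dim_date`, `dim_store`, `dim_item`) are assumptions for illustration only, not the schema of any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: the points of the star.
CREATE TABLE dim_date  (date_id  INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, name TEXT, city TEXT, state TEXT);
CREATE TABLE dim_item  (item_id  INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Fact table: the center of the star, one row per item sold.
CREATE TABLE fact_sales (
    date_id  INTEGER REFERENCES dim_date(date_id),
    store_id INTEGER REFERENCES dim_store(store_id),
    item_id  INTEGER REFERENCES dim_item(item_id),
    quantity INTEGER,
    amount   REAL
);
""")

# List the tables the fact table points at: every dimension hangs directly off it.
foreign_keys = conn.execute("PRAGMA foreign_key_list(fact_sales)").fetchall()
referenced = sorted(row[2] for row in foreign_keys)  # column 2 is the referenced table
print(referenced)
```

Every dimension is a single join away from the fact table, which is what keeps queries against a star schema simple.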
Metadata is data that gives information about what the primary data is about (e.g., user metadata defines who each user is, along with the security and permissions for each user).
A data model is the result of a collaborative effort between end business users and IT database analysts. The first step is to define in plain English what data the business needs in order for its various functions to communicate with each other, and how this data must be ordered and structured so it makes the most sense. The second step is for the data analysts and other IT staff to devise a technical database, data storage and security plan, along with a plan that enables application and analytic report development using this data. Together, these processes result in a data model for the business. NOTE: This is easier said than done.
Numerical data stored on a fact table is a measure. For instance, on a sales fact table, we might have measures for cost, gross profit, and maybe even order fulfillment time. These measures can then be summarized for each level of analysis.
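As a rough sketch of summarizing a measure, the Python below derives gross profit from cost and sale amount and totals it by store. The rows and the `gross_profit_by_store` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (store, cost, sale_amount)
FACT_SALES = [
    ("Dallas", 2.00, 5.00),
    ("Dallas", 1.00, 3.00),
    ("Austin", 2.50, 6.00),
]

def gross_profit_by_store(rows):
    """Summarize the gross profit measure (amount minus cost) at the store level."""
    totals = defaultdict(float)
    for store, cost, amount in rows:
        totals[store] += amount - cost
    return dict(totals)

profit = gross_profit_by_store(FACT_SALES)
print(profit)
```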
Dimensions can be thought of as perspectives on a subject. Let's say your subject of interest is sales. One way of analyzing sales is by time. How many Burgers did we sell last month? Last quarter? Last year? This is a time dimension.
Maybe you want to look at your sales data by geography. What were our sales in Dallas? Texas? How about the entire U.S.A.? This is a geography dimension.
Menu item or menu item category is another useful perspective. What was my best selling menu item? Are there any menu items that aren't hitting their sales targets? This is a product dimension.
Having multiple perspectives or dimensions with which to view the subject is very powerful. This allows you to put them together and get answers to even more detailed questions, such as: what was the worst selling item in Texas last quarter?
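A question like "worst selling item in Texas last quarter" combines the product, geography and time dimensions. A minimal Python sketch, using made-up rows and a hypothetical `worst_seller` helper, might look like this:

```python
# Hypothetical sales rows: (item, state, quarter, units_sold)
SALES = [
    ("Burger", "TX", "2016-Q4", 120),
    ("Fries", "TX", "2016-Q4", 80),
    ("Salad", "TX", "2016-Q4", 15),
    ("Salad", "TX", "2016-Q3", 5),
    ("Salad", "OK", "2016-Q4", 2),
]

def worst_seller(rows, state, quarter):
    """Slice by geography and time, then rank items (the product dimension)."""
    totals = {}
    for item, row_state, row_quarter, units in rows:
        if row_state == state and row_quarter == quarter:
            totals[item] = totals.get(item, 0) + units
    return min(totals, key=totals.get)

worst = worst_seller(SALES, "TX", "2016-Q4")
print(worst)
```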
Data within a dimension can often be represented through one or more hierarchies. These are parent – child relationships within a dimension. For instance, in a time dimension, a year is a parent to quarter, a quarter is a parent to month, and so on. These hierarchies then make it possible to drill up, down or across a dimension.
Drilling Up / Down / Across – allows a user to examine the detail behind a row of data.
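Drilling up and down a time hierarchy amounts to summarizing the same sales at a coarser or finer level. The Python sketch below assumes a simple year, quarter, month hierarchy and invented sales figures; the `rollup` function is hypothetical.

```python
from collections import defaultdict

# Hypothetical sales rows: (year, quarter, month, amount)
SALES = [
    (2016, "Q1", "Jan", 100.0),
    (2016, "Q1", "Feb", 150.0),
    (2016, "Q2", "Apr", 200.0),
]
LEVELS = ["year", "quarter", "month"]  # parent-to-child order in the hierarchy

def rollup(rows, level):
    """Total the amount at the requested level of the time hierarchy."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for row in rows:
        totals[row[:depth]] += row[-1]  # key is the path down to this level
    return dict(totals)

by_year = rollup(SALES, "year")        # drilled all the way up
by_quarter = rollup(SALES, "quarter")  # drilled down one level
print(by_year, by_quarter)
```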
Ad Hoc Query
The ability to create custom reports, on demand, that provide actionable, fact-based answers to specific business questions.
A filter is a mechanism that includes or excludes specific data from reports based upon what the user decides to filter. For example:
Dimension Filters – allow you to examine intersections of dimensions, such as employee sales by hour, or sandwich sales for Tuesday lunches at the drive-thru.
Measure Filters – allow you to identify exceptions to normal operating results, such as those employees with more than 3% of sales voided off the check.
Filtering returns a limited set of data or what’s described as an Exception-Based Report (EBR). By limiting the amount of information to be reviewed, EBR is a means to help management and staff more quickly locate opportunities and issues.
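As an illustration of a measure filter producing an exception-based report, the Python below flags employees whose voids exceed 3% of their sales. The employee names, figures and the `void_exceptions` helper are invented for the example.

```python
# Hypothetical per-employee totals: (employee, sales, voids)
EMPLOYEES = [
    ("Alex", 1000.0, 10.0),   # 1% voided
    ("Sam", 800.0, 40.0),     # 5% voided
    ("Riley", 1200.0, 24.0),  # 2% voided
]

def void_exceptions(rows, threshold=0.03):
    """Return only the employees whose voids exceed the threshold share of sales."""
    return [
        name
        for name, sales, voids in rows
        if sales > 0 and voids / sales > threshold
    ]

exceptions = void_exceptions(EMPLOYEES)
print(exceptions)
```

Instead of reviewing every employee, a manager sees only the short list that needs attention.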
A dashboard is a method to organize and display your reports and graphs in a style that suits you.
Key performance indicator (KPI)
A KPI is a metric a business measures its progress against to determine whether it is meeting its goals. The results are displayed in a form (e.g., a traffic light: red, yellow, green) that enables executives, managers, and employees to easily and quickly assess performance and see whether a given goal (or metric) is being met, exceeded, or missed.
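The traffic-light idea can be sketched as a small Python function. The cut-off used here (yellow when within 90% of target) is purely an assumption for illustration; each business sets its own thresholds.

```python
def kpi_status(actual, target, warning_ratio=0.9):
    """Map a KPI result to a traffic-light color.

    warning_ratio is a hypothetical cut-off: results at or above
    target * warning_ratio, but below target, show as yellow.
    """
    if actual >= target:
        return "green"
    if actual >= target * warning_ratio:
        return "yellow"
    return "red"

for sales in (105, 95, 50):
    print(sales, kpi_status(sales, target=100))
```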
Data visualization is a method of putting data in a visual context as a way to assist users in better understanding what the data’s telling them.
Software as a Service
Last but not least is the acronym SaaS, which, when used by a company, tells you they deliver all of the necessary hardware, software, network infrastructure, security and reporting tools on a monthly subscription basis. In our humble and somewhat biased opinion, SaaS is the least expensive way to deploy and maintain a BI solution and the quickest to implement. You can learn more about why we feel that way here, here and here.
There are many more terms used in the description of data analytics, far too many to cover here. However, I hope that after reading this article you have enough of an idea of the basic terminology to talk confidently with any vendor you might engage with.
What Are Your Thoughts?
Did we miss any terms you think are important?
Please share your stories, comments, and any other tips that may be helpful!
Mirus is a multi-unit restaurant reporting software used by operations, finance, IT, and marketing.
For more information, please visit: www.mirus.com