An Analysis Of Big Data And How It’s Managed
In our advanced technological age, "data" has become a word we hear constantly. Our data is collected by any number of companies and organizations, giving them access to everything from the most sensitive to the most mundane details of our lives.
Facebook is now one of the biggest advertising platforms for businesses because it collects, stores, and controls vast amounts of data, both business and personal.
So, how is this data collected and organized? The sheer volume of it is mind-blowing, and sorting, categorizing, and storing it requires extraordinary capabilities. Data on this scale is called "big data".
In an article in Forbes, Adrian Bridgwater describes Big Data as coming from many different sources and ranging in size from terabytes to zettabytes.
This is where data modeling comes in. It is a system whereby large amounts of data can be stored in a clearly organized fashion and easily accessed when needed.
This is extremely important, as companies need to find exactly the right information at the moment it is needed. As in any information-gathering operation, retrieval speed is crucial, because delays cost money.
Big data modeling can also help avoid storing data that is not useful or necessary, since whatever is not required can be filtered out and deleted.
A key component of a good data modeling system is the relational model, which was invented by Edgar F. Codd in 1970 and is now used in most database management systems.
Relational theory essentially categorizes pieces of data by the relationships between them and sorts them accordingly.
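The relational idea can be sketched with a few lines of Python and SQLite. This is a minimal illustration, not a production design, and the table and column names are invented for the example: data lives in tables (relations), and relationships are expressed through shared key columns that are joined at query time.

```python
import sqlite3

# Two relations that share a key column: that shared key IS the
# relationship in the relational model, not a pointer or a link object.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount      REAL
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'Acme Ltd')")
conn.execute("INSERT INTO orders VALUES (100, 1, 250.0)")

# The relationship is recovered at query time with a join on the key.
row = conn.execute("""
    SELECT c.name, o.amount
    FROM orders o JOIN customers c ON c.customer_id = o.customer_id
""").fetchone()
print(row)  # ('Acme Ltd', 250.0)
```

Because relationships are declared through keys rather than hard-wired pointers, the same data can be recombined in ways the original designer never anticipated, which is exactly what makes the model so durable.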
However, relational database management comes in two broad flavors: the database and the data warehouse.
A database is built for speed, handles particular types of transactions, such as those in Online Transaction Processing (OLTP), and is used to store current information.
A data warehouse, by contrast, is mainly used for large amounts of historical data, providing information across all areas of the data through Online Analytical Processing (OLAP).
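The OLTP/OLAP contrast above can be sketched on a single toy table (the table and the figures are invented for the example). An OLTP-style operation touches one current record; an OLAP-style query scans history and aggregates across a whole dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY,"
    " region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [(1, 'EU', 2018, 100.0), (2, 'EU', 2019, 150.0), (3, 'US', 2019, 200.0)],
)

# OLTP-style: a narrow, fast write against one current row.
conn.execute("UPDATE sales SET amount = 120.0 WHERE sale_id = 1")

# OLAP-style: scan the whole history and aggregate by region.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)  # [('EU', 270.0), ('US', 200.0)]
```

Real systems optimize very differently for the two workloads (row storage and indexes for OLTP, columnar storage and pre-aggregation for OLAP), but the shape of the queries is the essential difference.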
The following are all examples of data warehouse models.
ER modeling (Entity-Relationship Model)
This is a modeling methodology championed by Bill Inmon, which organizes information by subject area, or theme, rather than by the raw data itself. It has proven very useful in the financial services sector.
Dimensional Modeling
This model is the opposite of ER modeling: it takes the key business processes as its starting point, designs the model around them first, and adds other aspects later on.
It was devised by Ralph Kimball in his book "The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling".
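The best-known dimensional layout is the star schema, which can be sketched as follows. This is a deliberately tiny illustration of Kimball's style, with invented table names: one central fact table of measurements, surrounded by descriptive dimension tables.

```python
import sqlite3

# A minimal star schema: facts (measurements) in the middle,
# descriptive dimensions around them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_date    (date_key    INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key    INTEGER REFERENCES dim_date(date_key),
        amount      REAL
    );
""")
conn.execute("INSERT INTO dim_product VALUES (1, 'Widget')")
conn.execute("INSERT INTO dim_date VALUES (20190101, 2019)")
conn.execute("INSERT INTO fact_sales VALUES (1, 20190101, 99.0)")

# Analytical queries join the fact table to whichever dimensions
# they need, then aggregate.
result = conn.execute("""
    SELECT p.name, d.year, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date    d ON d.date_key    = f.date_key
    GROUP BY p.name, d.year
""").fetchone()
print(result)  # ('Widget', 2019, 99.0)
```

Starting from a business process ("sales") and hanging descriptions off it is what makes this model easy for analysts to query, which is its main selling point over more normalized designs.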
Data Vault Model
This was designed by Dan Linstedt, building on the ER model, and it is useful for organizing data, though not directly for decision-making.
It consists of three kinds of building blocks: the hub, which holds the core business keys; the links, which connect hubs to one another; and the satellites, which hold descriptive attributes, separating the structural information of the company from attributes that are less intrinsic to the business.
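The hub/link/satellite split can be sketched in a few tables. All names here are illustrative, and a real Data Vault would add hash keys, load timestamps, and record sources, but the division of labor is the same: hubs hold business keys, links record relationships, and satellites carry the descriptive detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hub_customer (        -- hub: just the business key
        customer_hk  INTEGER PRIMARY KEY,
        business_key TEXT UNIQUE
    );
    CREATE TABLE hub_order (
        order_hk     INTEGER PRIMARY KEY,
        business_key TEXT UNIQUE
    );
    CREATE TABLE link_customer_order ( -- link: connects two hubs
        customer_hk INTEGER REFERENCES hub_customer(customer_hk),
        order_hk    INTEGER REFERENCES hub_order(order_hk)
    );
    CREATE TABLE sat_customer (        -- satellite: descriptive detail
        customer_hk INTEGER REFERENCES hub_customer(customer_hk),
        name        TEXT,
        load_date   TEXT
    );
""")
conn.execute("INSERT INTO hub_customer VALUES (1, 'CUST-001')")
conn.execute("INSERT INTO hub_order VALUES (10, 'ORD-900')")
conn.execute("INSERT INTO link_customer_order VALUES (1, 10)")
conn.execute("INSERT INTO sat_customer VALUES (1, 'Acme Ltd', '2019-01-01')")

# Reassemble a business view by walking hub -> link -> satellite.
row = conn.execute("""
    SELECT s.name, ho.business_key
    FROM link_customer_order l
    JOIN hub_order    ho ON ho.order_hk   = l.order_hk
    JOIN sat_customer s  ON s.customer_hk = l.customer_hk
""").fetchone()
print(row)  # ('Acme Ltd', 'ORD-900')
```

Keeping volatile descriptions in satellites, away from the stable keys in hubs, is what lets the vault absorb new sources and history without reshaping its core.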
Anchor Model
This is the brainchild of Lars Rönnbäck, who wanted a model that could be scaled purely by adding to it, without modifying what already exists.
It has similarities with the Data Vault model.
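The "scale by adding, without modifications" idea can be sketched concretely. In this illustrative example (all names invented), every attribute lives in its own small table tied to an anchor, so extending the model means creating a new table rather than altering an existing one.

```python
import sqlite3

# Anchor: just the identity of the thing being modeled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE anchor_customer (customer_id INTEGER PRIMARY KEY)")

# One attribute, one table, with its own change history.
conn.execute("""CREATE TABLE attr_customer_name (
    customer_id INTEGER REFERENCES anchor_customer(customer_id),
    name TEXT, changed_at TEXT)""")
conn.execute("INSERT INTO anchor_customer VALUES (1)")
conn.execute("INSERT INTO attr_customer_name VALUES (1, 'Acme', '2019-01-01')")

# Later, the model grows by ADDING a new attribute table; nothing
# that already exists is altered, so old queries keep working.
conn.execute("""CREATE TABLE attr_customer_email (
    customer_id INTEGER REFERENCES anchor_customer(customer_id),
    email TEXT, changed_at TEXT)""")
conn.execute(
    "INSERT INTO attr_customer_email VALUES (1, 'info@acme.example', '2019-02-01')"
)

row = conn.execute("""
    SELECT n.name, e.email
    FROM anchor_customer a
    JOIN attr_customer_name  n ON n.customer_id = a.customer_id
    JOIN attr_customer_email e ON e.customer_id = a.customer_id
""").fetchone()
print(row)  # ('Acme', 'info@acme.example')
```

The trade-off is many small joins at query time in exchange for a schema that only ever grows additively, which is where the scalability claim comes from.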
Its major advantage is its scalability.

Big data management has the capacity to change the way we live, even more than it already has.
According to Entrepreneur.com, the volume of data will reach 44 trillion gigabytes by 2020, more than has ever existed before, creating business conditions never before experienced.
We’ll have to find a new word for “big” then.