
Education log


Friday, May 1, 2020

What is a data warehouse?


Introduction to data warehousing

In today's article we look at an introduction to data warehousing. I'm just going to give you a quick overview of what data warehousing is and why it's used, so let's get started.

First of all, I'd like to introduce you to the two people who got data warehousing going. The first is Bill Inmon, born in 1945, who got his degree from Yale before moving on to New Mexico State University for his master's in computing. He had seen that all databases, to be proper databases, had to be in third normal form, but he had an idea: maybe in some situations third normal form wouldn't necessarily be best. So he wrote a paper about denormalizing data in some situations. This was taken up by Dr. Ralph Kimball, born in 1944, with a doctorate from Stanford, who commercialized the idea of data warehouses. For more information, see the Wikipedia page on data warehouses.

So let's start at the beginning. A data warehouse is not in third normal form, which is loosely why it gets linked with the term Big Data: anything that is not normalized will be much bigger than a fully normalized database. The database should be denormalized in a particular way, into second normal form; we'll look at how that works in general a little further on. This means that you get data redundancy, which means that you need more storage, but queries need fewer joins, so you can get at the data more quickly. And more quickly is what data warehousing is all about.

The purpose of a data warehouse is to provide aggregate data (totals, trends and so on) in a format suitable for decision-making. The decision-maker isn't interested in whether this particular customer has paid their bill, or whether that particular wine was bought on that particular day. They're interested in the general trends, so they can see where the efforts of the company can best be used.

So how do you go about creating a data warehouse? Start with the operational systems within an organization: marketing, sales and so on. Each of these will have its own properly normalized database. You don't want to mess with operational systems that are already functioning, so you take copies of their information and put them into a staging area, where you can work on the data.

 

That work is entirely separate, so you're not messing with the day-to-day running of the company. In the staging area you can shape the data in a particular way (I'll come to this in a moment) and you can make sure that all the data is in the same format. You might think: it's coming from databases, so surely it's already in the same format? Not necessarily. In some departments the names of customers might be all in one field; in others you have forename, surname and title. You might have data in pounds in one table and in dollars in another. So we need to work out how we're going to get all of this into one logical framework.

The data goes through the integration layer and flows into the data warehouse in a format that is standard across all the data in that warehouse. So, for example, we would use all pounds or all dollars, or have every name as forename, surname and title.

The data warehouse will then be huge: we've taken the data from across the organization and put it all into one large database in second normal form. But we're going to have to answer questions for specific individuals. The director of marketing will have a set of questions, the director of finance will have a set of questions, and the managing director will have a set of questions. So for each of them we create their own data mart. The data mart, being much smaller than the data warehouse, will give answers a little more quickly, since it's not having to trawl through all the data. It also means that the data marts are separate entities: what one user does with their data is not going to affect what another user does. So we have these data marts, and strategic marts, which are just data marts that give a global overview of the company, the sort of thing that the managing director and their people would be interested in.

ETL and data marts

ETL stands for extraction, transformation and loading, and these are the three stages you have to go through to create a data warehouse. First, you extract the data from the in-house databases: that's getting the data from the original databases into the staging area. Once it's in the staging area, you transform it to make it useful, getting all the data into the same format: all pounds, say, and names as forename, surname and title. Then you load it into the data warehouse itself.

Once the data is in the warehouse, we can worry about the data marts. These are subsets of the data warehouse, and we have them so that people don't mess with each other's data. It also keeps things simpler for the user.
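To make the three ETL stages a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two source lists stand in for departmental databases, the exchange rate is a made-up constant, the name splitting is deliberately naive, and a data mart is modelled as nothing more than a filtered subset. None of the names come from a real product.

```python
from dataclasses import dataclass

# Hypothetical exchange rate, fixed purely for illustration.
USD_TO_GBP = 0.80

# Stand-ins for two operational systems (assumed, not real databases).
# Sales keeps the whole name in one field and records pounds;
# marketing splits the name into parts and records dollars.
sales_rows = [
    {"customer": "Dr Jane Smith", "amount": 20.0, "currency": "GBP"},
]
marketing_rows = [
    {"title": "Mr", "forename": "Raj", "surname": "Patel",
     "amount": 50.0, "currency": "USD"},
]


@dataclass
class WarehouseRow:
    """The single standard format used for every row in the warehouse."""
    department: str
    title: str
    forename: str
    surname: str
    amount_gbp: float


def to_gbp(amount, currency):
    """Convert an amount to pounds so every row uses one currency."""
    if currency == "GBP":
        return amount
    if currency == "USD":
        return round(amount * USD_TO_GBP, 2)
    raise ValueError(f"unsupported currency: {currency}")


def transform(row, department):
    """Bring one staged row into the standard warehouse format."""
    if "customer" in row:
        # Naive split of 'Title Forename Surname' into three parts.
        title, forename, surname = row["customer"].split(" ", 2)
    else:
        title, forename, surname = row["title"], row["forename"], row["surname"]
    return WarehouseRow(department, title, forename, surname,
                        to_gbp(row["amount"], row["currency"]))


def run_etl():
    # Extract: copy the rows into a staging area, leaving the sources alone.
    staging = [(dict(row), "sales") for row in sales_rows]
    staging += [(dict(row), "marketing") for row in marketing_rows]
    # Transform each staged row, then load the results into the warehouse.
    return [transform(row, department) for row, department in staging]


def data_mart(warehouse, department):
    """Model a data mart as the subset of rows for one department."""
    return [row for row in warehouse if row.department == department]


warehouse = run_etl()
marketing_mart = data_mart(warehouse, "marketing")
print(marketing_mart)
```

A real pipeline would read from databases and handle far messier names and currencies, but the shape is the same: copy, standardize, load, then carve out smaller marts.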

With separate data marts, the marketing director does not have to deal with what the finance director is interested in, so their queries stay separate. That makes it simpler: there are fewer options for each person to look at, and therefore less chance of them getting lost. The other big advantage is that if one of them comes up with another question, we're trying to solve that problem on a smaller set of data, so it's going to be quicker and easier to solve.

Now that you've got the general idea of a data warehouse, let's move on to some of the specifics. How is the data held? It is held as a star. In the middle you have a fact, and this fact is linked to a number of dimensions. In this example there are four, but you could have three or twenty-three.

So, for example, you might have sales as the item in the middle. For each sale you have the member ID, wine ID, area ID and time ID. Time is needed so you can get an idea of trends, and you have that for every sale. Now that may look wasteful: one member may have bought many things, but this is in second normal form, so even though one member has bought, say, three items, that member's details are recorded against each of those sales.

Finally, we could look at it as a constellation. We have the sales, and then we can link the different dimensions together with other facts (for example the wine, the member or the area), and then we've got a proper data warehouse. We can make the constellation as complicated as we want.
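The star described above, with sales in the middle and member, wine, area and time around it, might look something like the following sketch. It uses Python's built-in sqlite3 module with an in-memory database. The table names, column names and sample rows are all assumptions chosen for illustration, and a real warehouse would use a dedicated database rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Dimension tables: one for each point of the star (names are assumed).
    CREATE TABLE dim_member (member_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_wine   (wine_id   INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE dim_area   (area_id   INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE dim_time   (time_id   INTEGER PRIMARY KEY,
                             year INTEGER, month INTEGER);

    -- Fact table in the middle: one row per sale, keyed to every dimension.
    CREATE TABLE fact_sales (
        sale_id   INTEGER PRIMARY KEY,
        member_id INTEGER REFERENCES dim_member(member_id),
        wine_id   INTEGER REFERENCES dim_wine(wine_id),
        area_id   INTEGER REFERENCES dim_area(area_id),
        time_id   INTEGER REFERENCES dim_time(time_id),
        amount    REAL
    );
""")

# Made-up sample data: two members, two wines, two areas, two months.
conn.executemany("INSERT INTO dim_member VALUES (?, ?)",
                 [(1, "Member A"), (2, "Member B")])
conn.executemany("INSERT INTO dim_wine VALUES (?, ?)",
                 [(1, "Red"), (2, "White")])
conn.executemany("INSERT INTO dim_area VALUES (?, ?)",
                 [(1, "North"), (2, "South")])
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                 [(1, 2020, 3), (2, 2020, 4)])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, 1, 1, 1, 12.0),   # Member A buys red in March
    (2, 1, 2, 1, 1, 8.0),    # Member A buys white in March
    (3, 2, 1, 2, 2, 15.0),   # Member B buys red in April
])

# The time dimension is what lets us see trends: total sales per month.
monthly_totals = conn.execute("""
    SELECT t.year, t.month, SUM(s.amount) AS total
    FROM fact_sales AS s
    JOIN dim_time AS t ON t.time_id = s.time_id
    GROUP BY t.year, t.month
    ORDER BY t.year, t.month
""").fetchall()

print(monthly_totals)
```

The query at the end is the kind of question a decision-maker cares about: not who bought which bottle, but how sales move from month to month. Swapping dim_time for dim_area or dim_wine in the join gives totals by region or by wine in the same way.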



What is Data Warehousing?

Data warehousing (DW) is a process for collecting and managing data from varied sources to provide meaningful business insights. A data warehouse is typically used to connect and analyze business data from heterogeneous sources. The data warehouse is the core of a BI (business intelligence) system, which is built for data analysis and reporting.

 

