User Access Control Issues in a Data Warehouse

Filed Under: Coding Tips |    by: watch online

In a number of cases, information within the data warehouse may be subject to privacy laws, which restrict access to data. The extreme case of this is where there must be a complete Chinese wall between different segments of the data warehouse. This situation occurs in the financial sector, where legislative controls may restrict access to data relevant to that organization.

For example, if you are designing a data warehouse for a retail banking institution, check that all the accounts belong to the same legal entity. It may be the case that accounts are owned by a number of retail banks wholly owned by a holding bank. Privacy laws could force you to totally prevent access to information that is not owned by the specific bank. Clearly, we may not be able to utilize database facilities to apply this degree of access control. If we place all account transactions in a single fact table, we may not be able to provide the appropriate level of access control. The database may be able only to restrict access to specified tables per user, as opposed to specified rows per user.

Data marts allow us to build complete Chinese walls by physically separating data segments within the data warehouse. The detailed data can then be removed from the data warehouse in order to avoid possible privacy problems. Within the data warehouse, we can retain aggregations that have been created at a level of detail where privacy laws no longer apply. They can be generated from the detailed data loaded as part of the load process, by updating existing summary tables. Alternatively, the aggregations can be generated from each individual data mart, and stored centrally.

Identifying User Access Tool Requirements

Filed Under: Coding Tips |    by: watch online

Data marts are required in order to support any user access tools that require internal data structures. Data within those structures is outside the control of the data warehouse, but they need to be populated and updated on a regular basis. Some of these tools are often populated directly from source systems, on the basis that they are satisfying the existing requirements to analyze data. This may be true initially, but it is highly likely that additional requirements outside the scope of the tool are identified in future. In practice, this data should be populated from a data warehouse, not directly from the source data, in order to ensure consistency across all user access tools, even if they each have their own data mart. An additional access tool may be required, or a new requirement may exist to analyze detailed data that may not be held in the appropriate structures in the existing data mart. In order to avoid problems in the future, it is recommended that one adopts a data-marting strategy. User access tools may come and go, but the underlying information will be held in a form suitable for analysis. This could be populating a new user access tool, or simply making the detailed data available for other analysis.

Identifying Functional Splits in a Data Warehouse

Filed Under: Coding Tips |    by: watch online

Within this step, we must determine whether the business is structured in such a way as to benefit from functionally splitting the data. Look for departmental splits, and then determine whether the way in which the departments use information tends to be in isolation from the rest of the organization. For example, in a retail organization in which each merchant is responsible for maximizing the sales of a group of products. Their brief could be to maximize sales by ensuring adequate stock levels for a store, and sales levels that meet or exceed targets.

In practice, this means that the information in a data warehouse that would be of value is sales transactions on a daily level, to monitor actual sales; sales forecasts on a weekly basis; stock positions on a daily basis, to monitor stock levels and stock movements on a daily basis, to monitor supplier or shrinkage issues. All this information can form very substantial data volumes when, by the nature of the role, the merchant is not interested in products that he or she is not responsible for In this instance, we can consider data marting the subset of the data dealing with the product group of interest, because the merchant is extremely unlikely to query.

Identifying Entities to be Fact Tables

Filed Under: Coding Tips |    by: watch online

When we are presented with a large entity model, it can be difficult to determine which entities should become fact tables within the database. Making this determination is not always straight forward. However, because the query performance of the data warehouse will hinge on the correct identification of fact tables, it is very important to get this right. Always remember to look for the elemental transactions within the business process. This identifies entities that are candidates to be fact tables.

The use of Starflake Schemas

Filed Under: Coding Tips |    by: watch online

One of the key questions to be answered by the database designer is: How can we design a database that allows unknown queries to be performant? This question encapsulates the differences between designing for a data warehouse and designing for an OLTP system. In a data warehouse you design to support the business process rather than specific query requirements. In order to achieve this, you must understand the way in which the information within the data warehouse will be used.