An approach to Database Organization
Ever walked into a new job and find that you can’t make sense of any of the databases and how they all hang together? The answer to that question is probably YES (a million times YESSS)!
This is where proper database organization comes in.
So, let’s go into why this is important, what we can do prevent this for our future selves, and how to implement this.
WHY?
I think it’s pretty evident that having a messy database is not only stressful to maintain and work on, but it also creates a lot of risk for the business. At the end of the day, no one wants production databases to go down.
Whether it’s one database or multiple interconnected webs of databases, it’s always important to follow clean database organization practices. This allows you to future proof yourself and makes it easier for developers and DBA’s to work.
Some of the benefits of having a clean structure and standards in place are –
- Reduced Risk: Reduce the amount of risk to production systems by knowing how things are structured and how any sort of changes will affect the production environment.
- Faster technical debt implementation: Reduce the time and effort needed for technical debt. Following a structure will give you a higher level of visibility on what changes to make. When starting a technical debt task, it will be easier to identify what will be affected and then fix it.
- Database as code (CICD): A good structure will act as the first step to setting up continuous integration and development for your databases. You can find more about it here.
- Faster on-boarding: Makes the on-boarding process for new team members easy and less painful.
WHAT?
There are 2 parts to this –
- Consistent metadata layer.
- Object placement in multi-database environment.
Having a strong metadata layer enables you to modularize your database objects. For example, if you’re using a view sitting on top of a table and all processes use that view, you’d able to make changes to the underlying table without affecting any of your processes. If you delete or add a column to the underlying table, it won’t break the view or anything using that view. Not only that, if one of the dropped columns is being used by users, you can simply edit the view to pass in blank values for the dropped column. This way, end users can have more time to make changes to their processes and the changes done on your end are minimal. This flexibility allows you to make changes to your database without major downtime and empowers you to plan for technical debt remediation.
You can also leverage the schemas in SQL Server to organize your database objects. For example – for certain self-servicing power users, you can create a schema “ss” (for self-service). This will help you define better role based access on objects and easily work with end users in a controlled environment in case there are any problems.
In regards to object placement, it is important to be careful where you place database objects. For example, if a stored procedure is calling a table from database B, it should be created in database B and not in database A.
HOW?
There are various ways to organize databases. Keeping in mind the two focus areas mentioned above, I will break this down further and suggest one way below –
### DATABASE METADATA LAYER
A basic metadata layer can be created by using views on top of tables. Anyone who wants data from the underlying tables need to query the view. In the above example, if you have a three tier data warehouse, you can implement the loading of data from staging to ODS to the data mart layer.
If is there a change in any of the source tables (specially removing columns from the underlying table), the changes can be implemented faster by tweaking the views. This can be done without spending hours to fix each and every layer.
DEPENDENCIES
Based on the metadata layer structure, you can now clearly see how the dependencies between the databases work. Database A has no dependencies whereas database B has a dependency on database A because it’s doing a cross database join. The same goes for database C, which is dependent on database B.
CICD BUILD SEQUENCE
Once the metadata layer and dependencies have been implemented, you can now very easily implement CI/CD and setup the order of build and deploy. This will also enable you to consolidate all these databases into one database solution.