In case you didn’t know, DbSchemaLibrary officially launched a few days ago!
The project started last September. At the time, I thought it would only take 2-3 months to get to launch. The idea was very simple, after all. Well, fast forward to now: it took double that time.
There are good reasons why it took this long. I’ll be delving into some of the reasons in this article.
DbSchemaLibrary has a simple concept. The original plan was to collect various schemas, serve them on a website and be done with it. And sure enough, that by itself didn’t take very long.
There was a period when I tried to automate the extraction process. However, this proved to be more difficult than expected. So for launch, I settled on the 100 or so schemas that I had already collected. This was a good base to work from.
However, it became clear to me very quickly that a schema by itself isn’t very useful. Public schema data is surprisingly hard to come by, so that’s a plus for DbSchemaLibrary. But understanding what is going on within a schema is a different beast. If I wanted to use this library myself, I would have to analyze every schema I looked at.
There are many things to analyze
So then I decided to include analysis in the product. This came in the form of various descriptions, modules, module patterns and module-table relationships.
Notice the different types of data that are included. Many useful things can be extracted from a schema. All of them are useful in different ways.
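To make those data types a bit more concrete, here is a minimal sketch of how they could fit together. Every class and field name below is hypothetical and chosen only for illustration; it is not DbSchemaLibrary’s actual data model. It also assumes that a module is a named group of tables and that a module pattern is just a label on that group.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Table:
    # A table from the collected schema, plus a written description.
    name: str
    description: str = ""


@dataclass
class Module:
    # Assumption: a module groups tables that serve one purpose.
    name: str
    description: str = ""
    pattern: str = ""  # hypothetical label for the module pattern
    # Module-table relationships, stored as table names.
    table_names: List[str] = field(default_factory=list)


@dataclass
class SchemaAnalysis:
    # One analyzed schema: its description, tables, and modules.
    schema_name: str
    description: str = ""
    tables: Dict[str, Table] = field(default_factory=dict)
    modules: List[Module] = field(default_factory=list)

    def modules_for_table(self, table_name: str) -> List[str]:
        # Names of all modules that reference the given table.
        return [m.name for m in self.modules if table_name in m.table_names]

    def unassigned_tables(self) -> List[str]:
        # Tables that no module covers yet, i.e. analysis still to do.
        assigned = {t for m in self.modules for t in m.table_names}
        return sorted(name for name in self.tables if name not in assigned)
```

Even in a sketch like this, each kind of data answers a different question: descriptions say what a table is, modules say what it is for, and the relationships show which tables still have no analysis at all.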
The temptation to push back the release until all of them were built was immense. I started working on modules, only to add module patterns, and finally the module-table relationships.
Even then, there is other metadata, such as industries, that I had to postpone to finally get this released.
Analysis requires a lot of domain knowledge
Coming from a computer science background, I have a fair amount of knowledge of databases. I also have experience dealing with a variety of data. Because of that, a good chunk of the analysis was quite straightforward.
But the range of topics in the field is quite vast. There are plenty of topics in which I had little to no experience, such as distributed systems.
Extracting something useful from those took a lot of research and reading. I had to be sure that the information was correct. After all, false information is worse than no information at all.
This process easily took the majority of the development time: making sure the analysis was correct and useful to those using this resource.
Hiring for DbSchemaLibrary was tricky
Unfortunately, this meant that finding the right hire for DbSchemaLibrary was tricky. I needed someone who understands databases and can write about them well. At the very least, they should be able to research fairly well.
It would also be hard for me to review their work, as I wouldn’t be sure about the topic without researching it myself.
This issue, combined with the lack of infrastructure (to add & manipulate the analysis data), made me decide to finish the base of the product before pursuing this. There were too many uncertainties to make this work efficiently.
Now that the initial product is done, it should be much easier to add to and extend the library. New modules should come faster. Existing ones will also become more complete as time passes. It should also be easier to find someone who would like to work on DbSchemaLibrary, as there is now a base to work from. If this is you, let me know!