Tutorial about how to structure data in a relational database system #283

Open
niconoe opened this issue Nov 16, 2021 · 2 comments
Comments

@niconoe
Contributor

niconoe commented Nov 16, 2021

Over the last few years I've had the chance to work with biodiversity professionals, and from my IT-oriented perspective there's often a lack of knowledge about how to efficiently organise / structure data in a relational database system. To be more precise: there's generally enough knowledge to make something "that works", but a better-structured database would be much more future-proof (fewer data errors, easier reuse of the data in other contexts such as data publication, web portals, ...).

That tutorial would cover neither SQL nor the technicalities of a given database engine (SQLite, PostgreSQL, ...), but would rather help answer questions such as:

  • should a given piece of information be placed in a new field or in a new table?
  • how should I link tables X, Y and Z so they can be queried to answer a wide range of questions?
  • which constraints can I configure early, when I create a database, so that human errors (e.g. typos when entering data) are detected as early as possible (and the database doesn't get messier as it gets bigger and more heavily used)?
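
To make these three questions a bit more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for illustration: the tables (`species`, `site`, `observation`), their columns, the sample rows and the `try_insert` helper are hypothetical, and SQLite is used only because it ships with Python, not because the tutorial would target it.

```python
import sqlite3

# Hypothetical schema: names and columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    -- A species gets its own table (not a free-text field in every observation),
    -- so each name is typed once and referenced by id afterwards.
    CREATE TABLE species (
        id INTEGER PRIMARY KEY,
        scientific_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE site (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- Observations link species and sites through foreign keys.
    CREATE TABLE observation (
        id INTEGER PRIMARY KEY,
        species_id INTEGER NOT NULL REFERENCES species(id),
        site_id INTEGER NOT NULL REFERENCES site(id),
        observed_on TEXT NOT NULL,
        individual_count INTEGER NOT NULL CHECK (individual_count > 0)
    );
""")


def try_insert(conn, sql, params):
    """Run one INSERT and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Made-up sample data.
try_insert(conn, "INSERT INTO species (id, scientific_name) VALUES (?, ?)", (1, "Vanessa atalanta"))
try_insert(conn, "INSERT INTO site (id, name) VALUES (?, ?)", (1, "Site A"))

obs_sql = ("INSERT INTO observation (species_id, site_id, observed_on, individual_count) "
           "VALUES (?, ?, ?, ?)")
results = [
    try_insert(conn, obs_sql, (1, 1, "2021-11-16", 3)),    # valid row
    try_insert(conn, obs_sql, (99, 1, "2021-11-16", 3)),   # unknown species id
    try_insert(conn, obs_sql, (1, 1, "2021-11-16", -5)),   # impossible count
    try_insert(conn, "INSERT INTO species (scientific_name) VALUES (?)",
               ("Vanessa atalanta",)),                     # duplicate species name
]
for outcome in results:
    print(outcome)

# Because the tables are linked, one query can combine them.
rows = conn.execute("""
    SELECT s.scientific_name, st.name, o.individual_count
    FROM observation AS o
    JOIN species AS s ON s.id = o.species_id
    JOIN site AS st ON st.id = o.site_id
""").fetchall()
print(rows)
```

In this sketch only the first observation is accepted; the other three inserts are refused by the foreign key, `CHECK` and `UNIQUE` constraints, which is the kind of early error detection the last question is about.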

Is there any demand for this from scientists, or is it just me?

If so, I'd be happy to help contribute to a tutorial (but I think it could be a pretty large task, so I'd like to get an idea of the interest first).

@florisvdh
Member

Hi Nico, this sounds great, and indeed seems very useful to many scientists (INBO and non-INBO) - thanks so much for your enthusiasm!

IMHO it would be worthwhile for you to connect with @fredericpiesschaert and @gertvanspaendonk to see which material they already have in that direction (it may be in Dutch and SQL Server oriented, but still in line with the ideas that you describe). Ideally they could participate in this effort.

Moreover, this has relationships with issue #10 and the associated (still open) PR #140 (designing databases), for which further collaboration with @ThierryO was suggested in order to make it more technology-agnostic, similar to what you suggest.

@niconoe, maybe you could bring the people involved together for a dedicated brainstorm and to refresh the plans?

@florisvdh
Member

> Moreover, this has relationships with issue #10 and the associated (still open) PR #140 (designing databases), for which further collaboration with @ThierryO was suggested in order to make it more technology-agnostic, similar to what you suggest.

Update: PR #140 has been merged, since the authors wish to limit its scope to MS SQL Server. So there's still room to tackle issue #10 further.

Advice on how to structure data in any RDBMS is most welcome indeed.
