SQLView has various important features.
- The syntactic view of a user query
- Operating without a given data set
- Providing a generalised visualisation of what a query means
Here we illustrate these with a variety of increasingly complex examples.
A simple equality and query and what SQL shows you.
The query is:
SELECT numberofkids FROM staff WHERE numberofkids = 2
“Select the numberofkids value from a staff database, specifically restricting what is output to those staff with exactly 2 children.”
This not only shows the query entered (along the bottom), it also shows the syntactic elements that form the query. At the top the round node represents everything below it (so the entire query).
Mousing over the round node shows what the query means.
The coloured box (called a BRIQ) is designed to look a bit a like a database table (the output of the query). In this case it shows that the output is going to be a staff table with one column “number of kids” and the values returned with all have 2 as their value. (Not that exciting!)
The green hue indicates this is what is output (think red -> amber -> green). Notice the red element shows “NULL” (an odd SQL token representing no result!). Being red this means there will be no NULLs given in the response.
If we move down the tree structure we’ll see that lower BRIQs show those computed results that contribute to the BRIQs shown higher up. (This example is not that interesting, it does show that the BRIQs where the response is not defined are coloured amber – ready to be used!)
And so on down to the equality central to the query.
Other parts of the structure show what they contribute for example, below we can see that the “FROM” part of the query only shows that the table “staff” has a role in the computed result.
Now let’s get a more interesting case:
The query is:
SELECT age FROM staff WHERE numberofkids = 2
“Select the age value from a staff database, specifically restricting what is output to those staff with exactly 2 children.”
The structure is not that different, though notice selected result “age” is different.
Mousing over the round node shows:
Show the output will be staff.age’s and the “…” means any staff ages, however the constraint is shown by the other column. While ages are output, it is only the ages of staff with number of kids equalling 2.
Moving down the tree we the parts of the computation shown, as before.
Now to get more interesting!
So far we have been considering a very simple constraint. Normally they are far more complicated. Let’s try and find the age of staff with 2 children, earning 30000 and with only one car.
Unlike common tools to play with a query like this, you’d need to build an example database with a good range of different individuals in it to show the effect of what you are illustrating. What it more, for students to understand query responses, they’d have to understand the data they are querying (in order to validate the responses!)
So, let’s try:
SELECT age FROM staff WHERE numberofkids = 2 AND salary = 30000 and cars = 1
The result shows the three conditions and the required output.
As we move down the tree we can see each condition involved:
And their composition in higher nodes:
Still not too complex, now let’s try and negate the entire condition (finding the age of people who do not have two children, do not earn 30000 and do not have one car).
We can do this by putting a not around the WHERE condition. (Notice that we have to use brackets)
SELECT age FROM staff WHERE not( numberofkids = 2 AND salary = 30000 and cars = 1 )
The visualisation looks fine, but the BRIQ at the top is quite different now.
Well the red section (the things excluded) is now more extensive, it shows all the permutations of values excluded from the results. The results that are not excluded are ones where there may be NULLs. So an age may be shown for someone for whom we have no record of their salary and no record of their car, provided they do have a number of kids recorded and it is not 2.
Moving to lower nodes down the tree you can see the BRIQs show the conditions before the not changes the answer so much.
Another are of confusion and requiring support is expressions where there are combined alternatives. Humans are great a seeing what’s intended and computers are not. So, imagine staff days sick were recorded and we had data on how far they commuted and whether they benefit from the cycle to work scheme.
Imagine, we are interested in finding those who have been off ill over a month or cycle to work and commute more than 50 miles. Something like this:
SELECT name FROM staff WHERE daysoff = 30 OR cycle=1 AND commute=50
The structure shows the potential issue with queries like this, did we mean people who commute 50 and cycle and have been off, or did we mean people who cycle and commute 50 and also those who have had days off. There is room for confusion … there a precedence between “and” and “or” and we also use our common sense. (People are unlikely to cycle 50 miles for work!)
However, the tree shows what SQL will do with the problem. It does assume people will cycle commuting 50 miles!
Mousing over exposes the SQL logic:
For a learner to appreciate the correctness or otherwise of such a query is very hard to do. The closest tutors may get is composing their own data set covering every possible case and letting users find if the query response is as intended.