26 thoughts on “… that beautiful moment when you run out of priv…”
Congrats, I just had the complete opposite experience =)
.. that awful moment when you get this message from SQL Server: “Too many table names in the query. The maximum allowable is 256.”
Lev Blaivas there MUST be something seriously wrong with your query :)
That’s the rule: if you run into a limitation in a development tool, it means you are doing something wrong.
But the surprising thing about this particular case is that the query is decent, it just has to aggregate financial data from a dozen databases. And the code is partially generated, with edits particular to each database.
And re-implementing it in a ‘cascade’ of views is just not as efficient.
That’s exactly what everyone else who does something wrong says :) “Yeah, it’s wrong usually, but not in this case” … :)
ANY query can be split into several queries … then any query can be generalized (hello to NoSQL with key=value storage) … then any expensive query can be moved out into the application logic … and for anything that is slow there is always the option of more hardware :)
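To make that concrete, here’s a toy sketch of the “split one query into several” idea. All tables, columns and data are invented for illustration, and Python with sqlite3 stands in for a real database:

```python
import sqlite3

# Hypothetical schema standing in for "one big JOIN" replaced by
# "two small queries combined in application code".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE txn (id INTEGER PRIMARY KEY, company_id INTEGER, amount REAL);
    INSERT INTO company VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO txn VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 70.0);
""")

# Query 1: totals per company id -- no join.
totals = dict(conn.execute(
    "SELECT company_id, SUM(amount) FROM txn GROUP BY company_id"))

# Query 2: company names -- no join.
names = dict(conn.execute("SELECT id, name FROM company"))

# The "join" happens in application code instead of the database.
report = {names[cid]: total for cid, total in totals.items()}
print(report)  # {'Alpha': 150.0, 'Beta': 70.0}
```

Each piece is trivial on its own; whether recombining them outside the database is actually cheaper is exactly what the rest of this thread argues about.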
Databases are optimized to work with large amounts of data and do it efficiently, this is not always the case with applications. Running a complex query against a database may be much cheaper, faster and more efficient than fetching results of simpler queries into the application and processing this data.
Yeah, I would almost agree with you, except that pretty much EVERY single entity with a lot of data is RUNNING AWAY from conventional database usage. Look at Google. Yeah, it does run MySQL, which technically still counts as a database, but the way they use it is far from traditional. Facebook – the same. They do bits and pieces with MySQL, and yet they do a lot of filesystem stuff. NASA recently moved away from Oracle to MySQL and a lot of filesystem. Flickr, if I remember correctly, jumped into the NoSQL area.
Structured databases do have a certain niche, but they are usually not too scalable, not too flexible, and very, very expensive. Just a few years ago we had very few alternatives. Now there are plenty.
But then again, I might be completely talking out of my ass. :)
That’s theory, mate; practice is very different.
We have a client that works on an ERP system that was developed in the ’90s, and the client needs to extract high-level financial data from that system by the end of this month for a merger or acquisition or something like that (can’t go into details for obvious reasons).
The data is distributed across a dozen databases, each containing millions of transactions.
Now to give you a feeling of how bad, and just INSANE, the ERP system is, here are a couple of examples:
– Dates in the system are stored in CHAR fields encoded in their own special format
– All tables in the system have an ‘empty’ row with id 0, which is done to avoid handling NULLs
– Referential integrity is maintained in the application layer
Note: these are just harmless database quirks compared to the other crazy shit this system does.
Migrating away from this system to something better will cost the client approximately 0.25M EUR in direct costs alone, disregarding the cost of business interruptions.
My method is to write and debug a query for one company, which is then automatically replicated for all the companies in question, and then edited for the needs of the particular company.
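Roughly, that generate-then-edit workflow looks like this. The database names, the query template and the per-company override are all made up for illustration:

```python
# One debugged query, replicated across per-company databases, then
# tweaked per company. Everything here is a hypothetical placeholder.
QUERY_TEMPLATE = (
    "SELECT '{company}' AS company, SUM(amount) AS total\n"
    "FROM [{db}].dbo.transactions"
)

companies = {"Alpha": "ERP_ALPHA", "Beta": "ERP_BETA"}

# Per-company overrides applied after generation (the manual "edit" step).
overrides = {"Beta": lambda sql: sql + "\nWHERE posted = 1"}

queries = {}
for company, db in companies.items():
    sql = QUERY_TEMPLATE.format(company=company, db=db)
    if company in overrides:
        sql = overrides[company](sql)
    queries[company] = sql

print(queries["Beta"])
```

The point is that only one query ever gets debugged by hand; the rest is mechanical substitution plus a small, explicit list of exceptions.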
Putting everything into separate views causes DRASTIC performance degradation, mainly because adding indexes causes the system to fail.
Lev Blaivas yeah, I know those horrible cases … had a fair share of those myself. But then again, as every Perl developer knows, There Is More Than One Way To Do It (TIMTOWTDI) :) .. not judging, just saying :)
Yeah, ugly, but not too complex. Sub-queries can be refactored out into an application layer. Get those out, cache the values, and use the values for simpler queries. Usually that runs much faster, as application caching is smarter than the database’s – especially with large queries, the cache fills up just when you don’t want it to. :)
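Something like this, in Python for illustration (toy table, made-up numbers): the expensive sub-query runs once per distinct argument and its value is cached in the application rather than being recomputed inside a larger query:

```python
import sqlite3
from functools import lru_cache

# Hypothetical stand-in for one of the dozen ERP databases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE txn (company_id INTEGER, amount REAL);
    INSERT INTO txn VALUES (1, 100.0), (1, 50.0), (2, 70.0);
""")

# The "expensive sub-query": run once per company, cached in the app.
@lru_cache(maxsize=None)
def company_total(company_id):
    row = conn.execute(
        "SELECT SUM(amount) FROM txn WHERE company_id = ?",
        (company_id,)).fetchone()
    return row[0] or 0.0

assert company_total(1) == 150.0
assert company_total(1) == 150.0   # second call is served from the cache
print(company_total.cache_info().hits)  # 1
```

Later, the “simpler queries” just plug the cached value in as a literal parameter instead of re-joining the big table.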
Sorry for deleting the sample, but in hindsight it’s not really appropriate to publish it.
There is no application layer; 100% of the logic is done in SQL Server.
There’s your first problem :)
You mean I should have made an application?
Yup. And by application I mean anything from a bash script upwards. Where you’d have way more control and flexibility.
Have to disagree with you on that, increasing the complexity of a solution is never a good thing.
Yes, it would give me flexibility, but would take longer to debug/setup/maintain. And time is of the essence.
It took me approximately 30 minutes to refactor that script into something that works just as fast and doesn’t hit the limitations of SQL Server.
For example, a couple of months ago I made a quick .NET data mining application to extract some data from the system, and hit a limitation of the .NET CLR: an object cannot be larger than 2GB, irrespective of 32/64 bit or the hardware.
And that was harder to refactor than an ugly but not too complex query.
Lev Blaivas Interesting. We obviously work with very different technology stacks, so approaches and solutions are very different. I wouldn’t even look into a large query that’s problematic – immediately break it into smaller, manageable, independent parts. So far, worked better and faster for me. But it’s interesting to see how other people approach similar problems differently.
If by breaking down a large query into smaller ones you mean creating views, then the difference is definitely NOT the technology stack.
Nesting views makes it so much harder for the query optimizer to come up with an efficient query plan on SQL Server, MySQL and Oracle.
http://dba.stackexchange.com/q.....ase-design
If this isn’t a problem for you, it means we are either solving very different problems, your hardware is divine, or you are a database God who understands query optimization better than an optimizer =)
Nope, I don’t mean views. I mean several separate queries, which you run separately, sometimes in parallel, sometimes on-demand, with intelligent caching in the application layer.
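In Python terms (invented queries; each worker opens its own in-memory connection, standing in for independent databases), the pattern might look like:

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Each "query" is independent, so it can run in its own thread with its
# own connection; results land in a plain dict acting as the cache.
# The queries and data are invented for illustration.
def run_query(name_and_sql):
    name, sql = name_and_sql
    conn = sqlite3.connect(":memory:")  # stand-in for a real database
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])
    return name, conn.execute(sql).fetchone()[0]

queries = [
    ("total", "SELECT SUM(x) FROM t"),
    ("count", "SELECT COUNT(*) FROM t"),
]

cache = {}
with ThreadPoolExecutor(max_workers=2) as pool:
    for name, value in pool.map(run_query, queries):
        cache[name] = value  # later lookups hit the cache, not the DB

print(cache)  # {'total': 10, 'count': 5}
```

The “on-demand” part is just checking the dict before issuing the query; the parallelism falls out for free because the queries don’t depend on each other.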
Aaaah, right, I was thinking very much inside my box =)
Happens to the best of us :)
I’d like to try that, what are you using for that?
Perl, usually. DBI and Cache modules, unless there’s a more specific need.
Thank you, I’ll give it a try this week.