|
ms
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Are Linq-SQL methods commutativeAre Linq-SQL methods commutative ?
Should the following queries return same or different results ? var query = query.Skip(n).Take(m); var query = query.Take(m).Skip(n); Andrus. Andrus,
Absolutely different results. Assuming the ordering is the same on each of them (because Skip and Take make no sense without ordering, LINQ to SQL will create an order for you, which irritates me to no end, but that's a separate thread), they will produce different results. Say your query will produce the ordered set {1, 2, 3}. Let n = 1, m = 2. The first query: var query = query.Skip(n).Take(m); Will return the ordered set {2, 3}, while the second query: var query = query.Take(m).Skip(n); Will return the ordered set {2}. The reason for this is that in the first query, the Skip method skips one element, then takes the remaining two, while in the second query, the first two elements are taken, and then the first one is skipped. -- Show quote- Nicholas Paldino [.NET/C# MVP] - mvp@spam.guard.caspershouse.com "Andrus" <kobrule***@hot.ee> wrote in message news:%234a%23iY2LIHA.5140@TK2MSFTNGP05.phx.gbl... > Are Linq-SQL methods commutative ? > Should the following queries return same or different results ? > > var query = query.Skip(n).Take(m); > > var query = query.Take(m).Skip(n); > > Andrus. > Nicholas Paldino [.NET/C# MVP] wrote:
> Andrus, Now that's unexpected. In database land, you can't get 2 different> > Absolutely different results. results. > Assuming the ordering is the same on each of them (because Skip and Why? SELECT * FROM Table without ordering has no defined ordering, so> Take make no sense without ordering, LINQ to SQL will create an order > for you, which irritates me to no end, but that's a separate thread), using a limit + skip operator on that set results in a set of undefined rows, you'll never know what rows will be returned. Show quote > they will produce different results. I tried it out to be sure, and indeed Linq to Sql generates two> > Say your query will produce the ordered set {1, 2, 3}. Let n = 1, > m = 2. > > The first query: > > var query = query.Skip(n).Take(m); > > Will return the ordered set {2, 3}, while the second query: > > var query = query.Take(m).Skip(n); > > Will return the ordered set {2}. > > The reason for this is that in the first query, the Skip method > skips one element, then takes the remaining two, while in the second > query, the first two elements are taken, and then the first one is > skipped. different queries (which really hurt my eyes but that's another story). The thing is though that on databases, people will use the take/skip combination to page through a bigger resultset. However consulting the manual for both shows no word whatsoever about the order in which these two statements have to be specified. I do understand the order, but it's a bit strange as well. For example, Linq apparently has no problem with an Order by placed in front of a where, however there IS an order in take/skip which are used combined as a paging mechanism. Of course, this follows from the specs of both, but semantically, the intention of what people want to do, e.g. to page, shouldn't require an order in the statements for paging, if other elements in the query also don't really require an order (they do, but that's whiped under the rug) FB -- ------------------------------------------------------------------------ Lead developer of LLBLGen Pro, the productive O/R mapper for .NET LLBLGen Pro website: http://www.llblgen.com My .NET blog: http://weblogs.asp.net/fbouma Microsoft MVP (C#) ------------------------------------------------------------------------ On Nov 26, 9:04 am, "Frans Bouma [C# MVP]"
<perseus.usenetNOS***@xs4all.nl> wrote: <snip> > I tried it out to be sure, and indeed Linq to Sql generates two I don't see why there'd be any explicit *need* for documentation on> different queries (which really hurt my eyes but that's another story). > The thing is though that on databases, people will use the take/skip > combination to page through a bigger resultset. However consulting the > manual for both shows no word whatsoever about the order in which these > two statements have to be specified. this - it seems pretty obvious to me. If you want to skip some results and then take a certain number of the rest, you do Skip and then Take. If you want to take some results and then skip within that (which is a slightly odd thing to do, admittedly) you do Skip and then Take. How is it not just intuitive? > I do understand the order, but it's a bit strange as well. For Filtering and ordering are effectively orthogonal, unless you can> example, Linq apparently has no problem with an Order by placed in > front of a where, however there IS an order in take/skip which are used > combined as a paging mechanism. build a filter which depends on the order. Skipping and taking are clearly *not* orthogonal. > Of course, this follows from the specs Elements in the query which interact are affected by the order in> of both, but semantically, the intention of what people want to do, > e.g. to page, shouldn't require an order in the statements for paging, > if other elements in the query also don't really require an order (they > do, but that's whiped under the rug) which they're specified. Elements in the query which don't interact aren't affected by the order in which they're specified. That seems pretty sensible to me. Jon > How is it not just intuitive? In PostgreSQL order of limit and offset clauses does not matter.So when moving from PostgreSQL direct SQL to Linq-PostgreSQL this is not intuitive. The major issue is that I have no idea how to implement orthogonal Skip() and Take() effectively (so that minimal possible number of rows are returned from server) in Linq-PostgreSQL driver. In standard SQL Postgres uses LIMIT and OFFSET are orthogonal and there is no row number function. http://www.postgresql.org/docs/8.2/interactive/queries-limit.html Andrus. On Nov 26, 11:35 am, "Andrus" <kobrule***@hot.ee> wrote: But they're not limit and offset - they're "take the next N results"> > How is it not just intuitive? > > In PostgreSQL order of limit and offset clauses does not matter. and "skip the next N results". > So when moving from PostgreSQL direct SQL to Linq-PostgreSQL this is not Did you read the description of the Skip and Take methods before using> intuitive. them? If you just assume they mean "offset" and "limit" then you'll get the wrong results, but if you read the description I believe it *is* intuitive to use them. > The major issue is that I have no idea how to implement orthogonal Skip() You can't and shouldn't implement Skip and Take orthogonally - they're> and Take() effectively (so that minimal possible number of rows are returned > from server) in > Linq-PostgreSQL driver. > In standard SQL Postgres uses LIMIT and OFFSET are orthogonal and there is > no row number function.http://www.postgresql.org/docs/8.2/interactive/queries-limit.html *meant* to interact. The developer should specify them in the appropriate order, which is almost always going to be Skip and then Take. Skip X, Take Y should translate to OFFSET X LIMIT Y Take Y, Skip X should translate to OFFSET X LIMIT min(Y-X, 0) What are you trying to achieve which is causing you difficulties? Jon Jon Skeet [C# MVP] wrote:
> On Nov 26, 11:35 am, "Andrus" <kobrule***@hot.ee> wrote: Look at it from the other side, you then might understand it. > > > How is it not just intuitive? > > > > In PostgreSQL order of limit and offset clauses does not matter. > > But they're not limit and offset - they're "take the next N results" > and "skip the next N results". Developer D wants to write a SELECT statement with LIMIT n and OFFSET m. He has to use Linq. That's the problem area the developer is in: how is a given Linq query translated to SQL? The developer knows what SQL statement he wants to get, he just wants to write the Linq query to get that SQL statement. However, it can be the developer makes a mistake because LIMIT/OFFSET have no order in the SELECT, though Take/Skip do. According to the description of Take and Skip individually, it's logical. However they're part of a bigger query. 'Take' for example results in TOP on sqlserver, and if Skip is present, it can be translated to a paging query with a CTE. Therefore, it's not irrelevant to see the query as a whole with skip/take being part of that big query, instead of seeing them as separate elements because that's not going to work out. THe problem is that the Linq query isn't translatable to SQL in a 1:1 basis. Therefore the intention of what was meant has to be understood before the SQL can be produced. As: set.Skip(n).Take(m) is equal to set.Take(n+m).Skip(n), it doesn't matter in theory which order is used for skip/take, both orders are usable to fetch the same set of rows. Jon, you can keep living at the level of Linq and dream that everything below it is just solved by default, but that's not the case. People want to express something in the Linq query in such a way that the Linq query is translated to SQL which expresses that same thing EXACTLY how they have foreseen it. Some people start at the Linq level and expect the provider to cough up the objects as stated in the manual, like you do. That's not always possible. Some people start at the SQL level and wonder how to write that query in Linq. That's also not always possible. Both sides have a big gap to cross and by stating what's in the spec or what's in MSDN isn't going to help either side one bit. > > So when moving from PostgreSQL direct SQL to Linq-PostgreSQL this Take IS going to be translated to LIMIT, if Skip isn't there. > > is not intuitive. > > Did you read the description of the Skip and Take methods before using > them? If you just assume they mean "offset" and "limit" then you'll > get the wrong results, but if you read the description I believe it > *is* intuitive to use them. So how is specifying Skip all of a sudden not making Take being translated to LIMIT ? Show quote > > The major issue is that I have no idea how to implement orthogonal It depends on how difficult you want to make it for yourself, you can> > Skip() and Take() effectively (so that minimal possible number of > > rows are returned from server) in > > Linq-PostgreSQL driver. > > In standard SQL Postgres uses LIMIT and OFFSET are orthogonal and > > there is no row number > > function.http://www.postgresql.org/docs/8.2/interactive/queries-limi > > t.html > > You can't and shouldn't implement Skip and Take orthogonally - they're > *meant* to interact. The developer should specify them in the > appropriate order, which is almost always going to be Skip and then > Take. > > Skip X, Take Y should translate to > OFFSET X LIMIT Y > > Take Y, Skip X should translate to > OFFSET X LIMIT min(Y-X, 0) specify skip after take by specifying different values for skip/take (see above) and it should result in the same query as you're requesting the same resultset in that case. FB -- ------------------------------------------------------------------------ Lead developer of LLBLGen Pro, the productive O/R mapper for .NET LLBLGen Pro website: http://www.llblgen.com My .NET blog: http://weblogs.asp.net/fbouma Microsoft MVP (C#) ------------------------------------------------------------------------ On Nov 27, 10:15 am, "Frans Bouma [C# MVP]"
<perseus.usenetNOS***@xs4all.nl> wrote: <snip> > Some people start at the Linq level and expect the provider to cough If you really think that explaining why things work the way they do> up the objects as stated in the manual, like you do. That's not always > possible. Some people start at the SQL level and wonder how to write > that query in Linq. That's also not always possible. Both sides have a > big gap to cross and by stating what's in the spec or what's in MSDN > isn't going to help either side one bit. isn't going to help anyone, then I think we disagree on so much, so fundamentally, that I don't think there's much point in us debating things. Personally, I think that explaining to "SQL heads" the way that LINQ is designed to work - including referring to the documentation - will help them write better LINQ queries. Just like explaining to "LINQ heads" how SQL works will also help those people write better SQL, and indeed will help them understand the SQL generated by their LINQ queries better. I will continue to suggest to people who are using LINQ that they should actually learn about LINQ and its design rather than just assuming it'll work like SQL. I think that will make them a lot more productive. However, I think I'll give up on trying to persuade you of that notion. Jon Jon Skeet [C# MVP] wrote:
Show quote > On Nov 27, 10:15 am, "Frans Bouma [C# MVP]" It will answer some questions perhaps, but as there's no correlation> <perseus.usenetNOS***@xs4all.nl> wrote: > > <snip> > > > Some people start at the Linq level and expect the provider > > to cough up the objects as stated in the manual, like you do. > > That's not always possible. Some people start at the SQL level and > > wonder how to write that query in Linq. That's also not always > > possible. Both sides have a big gap to cross and by stating what's > > in the spec or what's in MSDN isn't going to help either side one > > bit. > > If you really think that explaining why things work the way they do > isn't going to help anyone, then I think we disagree on so much, so > fundamentally, that I don't think there's much point in us debating > things. between the nature of how linq is based on sequences and how the corresponding SQL query looks like, it will IMHO likely only increase confusion instead of solving it. Just take a look at the initial question of this thread. The root cause of that question is how it's done in SQL. Of course, operating on the Linq level makes you THINK you can forget all about the SQL, but that's just a facade. My point is that if you explain just the Linq side of things, people will never be able to bridge over to the SQL they want to produce with the Linq query they have in mind. > Personally, I think that explaining to "SQL heads" the way that LINQ aren't you forgetting one thing, namely how linq element X is> is designed to work - including referring to the documentation - will > help them write better LINQ queries. Just like explaining to "LINQ > heads" how SQL works will also help those people write better SQL, and > indeed will help them understand the SQL generated by their LINQ > queries better. resulting in SQL element Y and vice versa, how you can achieve SQL element Y by writing Linq element X ? IMHO the thing that will help is what to write to achieve ABC in SQL. I.e. to get a subquery instead of a join, to get a join instead of a subquery, to get a scalar query in a derived table instead of in the projection (which is MUCH slower) etc. > I will continue to suggest to people who are using LINQ that they I'm not saying it should work like SQL... :/ I'm saying that it's> should actually learn about LINQ and its design rather than just > assuming it'll work like SQL. I think that will make them a lot more > productive. However, I think I'll give up on trying to persuade you of > that notion. unclear what SQL it will produce so it's not determinable when writing the Linq query what it will return, at least not always, because there's a projection involved towards SQL. Sure, the SQL might work in a lot of cases, but is it the SQL which is determined to be more efficient? Unclear, you have to examine every linq query's SQL statements. Then tell me, what's the advantage over specifying some queries in text in an ibatis.net config file? FB -- ------------------------------------------------------------------------ Lead developer of LLBLGen Pro, the productive O/R mapper for .NET LLBLGen Pro website: http://www.llblgen.com My .NET blog: http://weblogs.asp.net/fbouma Microsoft MVP (C#) ------------------------------------------------------------------------ Jon Skeet [C# MVP] wrote:
Show quote > On Nov 26, 9:04 am, "Frans Bouma [C# MVP]" Because skip/take equivalents in SQL work on the same projection> <perseus.usenetNOS***@xs4all.nl> wrote: > > <snip> > > > I tried it out to be sure, and indeed Linq to Sql generates > > two different queries (which really hurt my eyes but that's another > > story). The thing is though that on databases, people will use the > > take/skip combination to page through a bigger resultset. However > > consulting the manual for both shows no word whatsoever about the > > order in which these two statements have to be specified. > > I don't see why there'd be any explicit need for documentation on > this - it seems pretty obvious to me. If you want to skip some results > and then take a certain number of the rest, you do Skip and then Take. > If you want to take some results and then skip within that (which is a > slightly odd thing to do, admittedly) you do Skip and then Take. > > How is it not just intuitive? result, while in Linq they don't. You should look less at specs and more at what users WANT TO DO. Time and time again you post a reply with how the syntaxis is, how the specs says how it works etc. etc. All fine, but that's completely not hte point here: the point is that the user wants to PAGE through a resultset. In Firebird, Mysql, Postgresql, people have 2 keywords in the SELECT statement to do that, similar to skip/take, but the order isn't important, because both work at the same projection result. THe intention of the USER is to page. Messing the order up has no effect at compiletime, but at runtime it does. How do you PREVENT developers to make this error? Exactly, by specifying a pagenumber and a page size instead of skip/take OR by specifying that the order doesn't matter. > > I do understand the order, but it's a bit strange as well. In SQL, order by is done on a projected resultset. In Linq, this isn't> > For example, Linq apparently has no problem with an Order by placed > > in front of a where, however there IS an order in take/skip which > > are used combined as a paging mechanism. > > Filtering and ordering are effectively orthogonal, unless you can > build a filter which depends on the order. Skipping and taking are > clearly not orthogonal. always clear, because you can place an order by in a new projection scope, and you don't know IF that results in a subquery (projection scope) or not. My point was that MS should have made the same limitation to linq as ISO did to SQL: Order By is the last statement in the query. While order is important for extension methods like skip/take specified directly, it's not for order by apparently. > > Of course, this follows from the specs Yes, because you don't understand what the word 'intention' means when> > of both, but semantically, the intention of what people want to do, > > e.g. to page, shouldn't require an order in the statements for > > paging, if other elements in the query also don't really require an > > order (they do, but that's whiped under the rug) > > Elements in the query which interact are affected by the order in > which they're specified. > Elements in the query which don't interact aren't affected by the > order in which they're specified. > > That seems pretty sensible to me. a developer writes code. You don't seem to grasp the little fact that a developer who makes a mistake by writing ACB, while the intention was ABC, should be WARNED with an error or warning that ACB isn't going to work probably. Like I said: I understand the ordering of skip/take in the context of the situation where you first want to take and then want to skip (however I can't find a use case for that). What I don't understand is that 'skip' and 'take' aren't placed TOGETHER in such a way that there's no way to make a mistake when one tries to PAGE through a resultset. You seem to find it 'sensible' that people can do that. That's what I find odd. But I disgress, according to the syntax/specs everything seems logical, because it is in the spec, however precisely that isn't the point at all here. FB -- ------------------------------------------------------------------------ Lead developer of LLBLGen Pro, the productive O/R mapper for .NET LLBLGen Pro website: http://www.llblgen.com My .NET blog: http://weblogs.asp.net/fbouma Microsoft MVP (C#) ------------------------------------------------------------------------ On Nov 27, 9:33 am, "Frans Bouma [C# MVP]"
<perseus.usenetNOS***@xs4all.nl> wrote: > > I don't see why there'd be any explicit need for documentation on No - in LINQ each step effectively works off the result of the> > this - it seems pretty obvious to me. If you want to skip some results > > and then take a certain number of the rest, you do Skip and then Take. > > If you want to take some results and then skip within that (which is a > > slightly odd thing to do, admittedly) you do Skip and then Take. > > > How is it not just intuitive? > > Because skip/take equivalents in SQL work on the same projection > result, while in Linq they don't. previous step. That's one of the fundamentals of LINQ, and people should know it before they write too much code which assumes that LINQ=SQL. > You should look less at specs and more at what users WANT TO DO. Time Okay, so create a Page extension method which does a Skip then a Take.> and time again you post a reply with how the syntaxis is, how the specs > says how it works etc. etc. All fine, but that's completely not hte > point here: the point is that the user wants to PAGE through a > resultset. It's really easy to do - and I could certainly agree that it would have made logical sense to include that in .NET 3.5. However, it's *also* nice to be able to have those two operations separately. > In Firebird, Mysql, Postgresql, people have 2 keywords in Indeed.> the SELECT statement to do that, similar to skip/take, but the order > isn't important, because both work at the same projection result. THe > intention of the USER is to page. Messing the order up has no effect at > compiletime, but at runtime it does. > How do you PREVENT developers to make this error? Exactly, by So write that method, and use that everywhere. Create an FXCop rule> specifying a pagenumber and a page size instead of skip/take OR by > specifying that the order doesn't matter. suggesting that Skip/Take should be avoided. Just don't take Skip/Take away from people who want them :) > > Filtering and ordering are effectively orthogonal, unless you can There's no logical reason in LINQ *in general* to force ordering to be> > build a filter which depends on the order. Skipping and taking are > > clearly not orthogonal. > > In SQL, order by is done on a projected resultset. In Linq, this isn't > always clear, because you can place an order by in a new projection > scope, and you don't know IF that results in a subquery (projection > scope) or not. My point was that MS should have made the same > limitation to linq as ISO did to SQL: Order By is the last statement in > the query. While order is important for extension methods like > skip/take specified directly, it's not for order by apparently. the last operation. (In fact, it never is - you always want to either group or select at the end.) You always come at LINQ as if it should be the same as SQL - LINQ is more than SQL, and if that occasionally makes the SQL side less than ideal, that's a hit I'm certainly willing to take. > > That seems pretty sensible to me. I think it would be nice if LINQ to SQL barfed at runtime rather than> > Yes, because you don't understand what the word 'intention' means when > a developer writes code. You don't seem to grasp the little fact that a > developer who makes a mistake by writing ACB, while the intention was > ABC, should be WARNED with an error or warning that ACB isn't going to > work probably. executing the probably-incorrect query, but I don't think it should be a compile-time error: that would require the compiler to know *way* too much about what's going on. Any time you fancy dispensing with the ad hominem attacks, that would be fine too, by the way. > Like I said: I understand the ordering of skip/take in Again, do it yourself and publish it to the world. It's about three> the context of the situation where you first want to take and then want > to skip (however I can't find a use case for that). What I don't > understand is that 'skip' and 'take' aren't placed TOGETHER in such a > way that there's no way to make a mistake when one tries to PAGE > through a resultset. lines of code. > You seem to find it 'sensible' that people can do that. That's what I I think we're coming back to our fundamental disagreement: I don't> find odd. But I disgress, according to the syntax/specs everything > seems logical, because it is in the spec, however precisely that isn't > the point at all here. think people should expect to be able to use LINQ without making mistakes if they don't take a little bit of time to learn about it first. I don't expect that about *any* technology - why should LINQ be different? Taking just a short time to understand the overall nature of LINQ (rather than just focusing on LINQ to SQL) makes the behaviour of Skip/ Take obvious *and* enriches the developer's understanding of the bigger picture, so they can decide where LINQ is appropriate *aside* from LINQ to SQL. Jon Jon Skeet [C# MVP] wrote:
> On Nov 27, 9:33 am, "Frans Bouma [C# MVP]" <snip>>> In Firebird, Mysql, Postgresql, people have 2 keywords in Actually, the order of the operations involved does matter, it's just >> the SELECT statement to do that, similar to skip/take, but the order >> isn't important, because both work at the same projection result. THe >> intention of the USER is to page. Messing the order up has no effect at >> compiletime, but at runtime it does. that the SQL syntax allows both ways and then internally reorders the operation to perform one before the other. The postgresql link posted earlier explicity states this: "If both OFFSET and LIMIT appear, then OFFSET rows are skipped before starting to count the LIMIT rows that are returned." And in LINQ, you do not write one query, you have syntax that allows you to chain together individual operations that, as Jon pointed out, works sequentially on the previous results. SQL, however, is generally seen as one big statement and not a chain of individual operations, although internally there is some sequencing involved. As such, you can argue that OFFSET/LIMIT is basically two keywords that specify one operation, the Page operation suggested. Lasse Vågsæther Karlsen wrote:
Show quote > Jon Skeet [C# MVP] wrote: true, as it otherwise wouldn't make any sense as there's just 1> > On Nov 27, 9:33 am, "Frans Bouma [C# MVP]" > <snip> > > > In Firebird, Mysql, Postgresql, people have 2 keywords in > > > the SELECT statement to do that, similar to skip/take, but the > > > order isn't important, because both work at the same projection > > > result. THe intention of the USER is to page. Messing the order > > > up has no effect at compiletime, but at runtime it does. > > Actually, the order of the operations involved does matter, it's just > that the SQL syntax allows both ways and then internally reorders the > operation to perform one before the other. > > The postgresql link posted earlier explicity states this: > "If both OFFSET and LIMIT appear, then OFFSET rows are skipped before > starting to count the LIMIT rows that are returned." resultset the operators are applied on. > And in LINQ, you do not write one query, you have syntax that allows Yes, and as skip/take are used in that exact same fashion to achieve> you to chain together individual operations that, as Jon pointed out, > works sequentially on the previous results. SQL, however, is > generally seen as one big statement and not a chain of individual > operations, although internally there is some sequencing involved. > As such, you can argue that OFFSET/LIMIT is basically two keywords > that specify one operation, the Page operation suggested. exactly that same thing, you can't see them as separate elements IMHO. The thing is that you could argue that in linq it's all about sequences, but that's just for linq to objects. For linq to <database provider> it's different: as a projection of what linq describes is made onto SQL, and as that projection isn't a 1:1 projection, there's a lot of room for interpretating things the wrong way, or assuming things will turn out differently, simply, because the sequence-usage of linq is not present in an RDBMS in such a way that it's similar to linq. This makes it so hard to talk about linq queries which are meant for running onto a DB, simply because although it's perfectly described what the linq query is SUPPOSED to be returning, it's not said it is possible to do so. FB -- ------------------------------------------------------------------------ Lead developer of LLBLGen Pro, the productive O/R mapper for .NET LLBLGen Pro website: http://www.llblgen.com My .NET blog: http://weblogs.asp.net/fbouma Microsoft MVP (C#) ------------------------------------------------------------------------ Jon Skeet [C# MVP] wrote:
Show quote > On Nov 27, 9:33 am, "Frans Bouma [C# MVP]" With that knowledge they can't do a single thing if they are writing a> <perseus.usenetNOS***@xs4all.nl> wrote: > > > I don't see why there'd be any explicit need for documentation on > > > this - it seems pretty obvious to me. If you want to skip some > > > results and then take a certain number of the rest, you do Skip > > > and then Take. If you want to take some results and then skip > > > within that (which is a slightly odd thing to do, admittedly) you > > > do Skip and then Take. > > > > > How is it not just intuitive? > > > > Because skip/take equivalents in SQL work on the same > > projection result, while in Linq they don't. > > No - in LINQ each step effectively works off the result of the > previous step. That's one of the fundamentals of LINQ, and people > should know it before they write too much code which assumes that > LINQ=SQL. linq query which has to meet a SQL query they have in mind. > > You should look less at specs and more at what users WANT You mean to be able to just skip or just take? Sure. The sad thing is> > TO DO. Time and time again you post a reply with how the syntaxis > > is, how the specs says how it works etc. etc. All fine, but that's > > completely not hte point here: the point is that the user wants to > > PAGE through a resultset. > > Okay, so create a Page extension method which does a Skip then a Take. > It's really easy to do - and I could certainly agree that it would > have made logical sense to include that in .NET 3.5. However, it's > *also* nice to be able to have those two operations separately. that when they're used together in the same projection scope, they are resulting in a different SQL query than used separately. IMHO that sounds like they should have been tied together in a single statement. > > How do you PREVENT developers to make this error? Exactly, I'm not saying they should be dropped :). I suggest that if they are> > by specifying a pagenumber and a page size instead of skip/take OR > > by specifying that the order doesn't matter. > > So write that method, and use that everywhere. Create an FXCop rule > suggesting that Skip/Take should be avoided. Just don't take Skip/Take > away from people who want them :) used together in a query, people should not be bothered with an order, OR, if they are used in the wrong order, they should be warned. Similar to use ThenBy first, which is also not correct. As you can fetch any row set with skip first following by take, they should have forced it to have that order to avoid problems. Show quote > > > Filtering and ordering are effectively orthogonal, unless you can that's what the spec says. Ordering is always done after projection,> > > build a filter which depends on the order. Skipping and taking are > > > clearly not orthogonal. > > > > In SQL, order by is done on a projected resultset. In Linq, > > this isn't always clear, because you can place an order by in a new > > projection scope, and you don't know IF that results in a subquery > > (projection scope) or not. My point was that MS should have made > > the same limitation to linq as ISO did to SQL: Order By is the last > > statement in the query. While order is important for extension > > methods like skip/take specified directly, it's not for order by > > apparently. > > There's no logical reason in LINQ *in general* to force ordering to be > the last operation. (In fact, it never is - you always want to either > group or select at the end.) as otherwise there's no way you CAN order, you can only order resultsets. Check Warren's blog, I believe it's the last entry where he introduces a complex piece of code to move up a wrongly placed order by statement. > You always come at LINQ as if it should be the same as SQL - LINQ is It might look that way, but I'm just hammering on the essential> more than SQL, and if that occasionally makes the SQL side less than > ideal, that's a hit I'm certainly willing to take. element that when using Linq for DATABASE (!) queries, you have to realize that the target is a very expensive resource which can be brought down to its knees with the wrong queries, and this isn't obvious. When I started LLBLGen Pro I too found it logical that people would simply learn our query api and be done with it. However we quickly learned that people often think from the other side: they have a stored proc which has to be migrated to C# code (and in this context thus to a linq query) or are simply thinking in SQL as that's what they know and are wrestling with the query api because it doesn't give them the freedom a SQL statement gives them: the SQL statement is the one executed on the DB, a linq query is an abstraction layer above that, so to reach what you want, namely the SQL query, you have to know how that linq query translates to that SQL query so if you have that SQL query in your head, you can write the Linq query :) Trust me, in the newsgroups here we'll see a lot of people having a SQL statement which they want to convert to Linq queries and have a hard time. That's why I find it a bit of a dissapointment that Linq lacks the fine grained control elements to make that conversion so much more easier. Because that's what people want: they want to write a Linq query that mimics what they have in mind in SQL. > > > That seems pretty sensible to me. Don't you agree that an exception at runtime is simply too late?> > > > Yes, because you don't understand what the word 'intention' > > means when a developer writes code. You don't seem to grasp the > > little fact that a developer who makes a mistake by writing ACB, > > while the intention was ABC, should be WARNED with an error or > > warning that ACB isn't going to work probably. > > I think it would be nice if LINQ to SQL barfed at runtime rather than > executing the probably-incorrect query, but I don't think it should be > a compile-time error: that would require the compiler to know way > too much about what's going on. What's the use of a runtime exception, when code analysis could have shown a warning up front. ? > > You seem to find it 'sensible' that people can do that. Because it would help a lot of people out, that's why. :) I was that> > That's what I find odd. But I disgress, according to the > > syntax/specs everything seems logical, because it is in the spec, > > however precisely that isn't the point at all here. > > I think we're coming back to our fundamental disagreement: I don't > think people should expect to be able to use LINQ without making > mistakes if they don't take a little bit of time to learn about it > first. I don't expect that about any technology - why should LINQ be > different? stubborn with our own API as well: why do they need sql examples, if it's understandable by reading the specs/documentation? But keeping the eyes closed for that doesn't make the problem go away, only the people will go away and be dissapointed in the o/r mapping technology. There's a lot of energy to be spend on converting people over to use Linq instead of stored procedures or inline SQL strings. If it's very hard to do the more complex SQL statements, it's going to be an uphill battle. > Taking just a short time to understand the overall nature of LINQ Ok, here's one lesson for you: although Linq is a general purpose> (rather than just focusing on LINQ to SQL) makes the behaviour of > Skip/ Take obvious and enriches the developer's understanding of the > bigger picture, so they can decide where LINQ is appropriate aside > from LINQ to SQL. technique, it doesn't matter how it works for linq to objects etc., if you are going to execute the linq query on a database, as the linq query isn't the one that's being executed, the generated SQL query from that linq query is. I.o.w.: it's not as simple as you make it out to be: living on the level of Linq is nice and if you can afford to ignore the rest, why bother, but for a lot of people the database they're using in their app is a resource they can't affort to waste a lot of time on: queries have to be fast, predictable and therefore scalable. It's nice for you that you can determine what the SQL will be when you are handed a random Linq query, but most of the C# developers won't be able to. That's the main problem here, which is IMHO completely ignored by MS. FB -- ------------------------------------------------------------------------ Lead developer of LLBLGen Pro, the productive O/R mapper for .NET LLBLGen Pro website: http://www.llblgen.com My .NET blog: http://weblogs.asp.net/fbouma Microsoft MVP (C#) ------------------------------------------------------------------------ Frans,
See inline: >> Andrus, Not really, because to translate what is being done in LINQ to database >> >> Absolutely different results. > > Now that's unexpected. In database land, you can't get 2 different > results. land, you would actually perform a query on the result set of the previous query. The operations are not performed at the same time. >> Assuming the ordering is the same on each of them (because Skip and Well, that's not completely true. If you are using a clustered index in >> Take make no sense without ordering, LINQ to SQL will create an order >> for you, which irritates me to no end, but that's a separate thread), > > Why? SELECT * FROM Table without ordering has no defined ordering, so > using a limit + skip operator on that set results in a set of undefined > rows, you'll never know what rows will be returned. SQL Server, then the ordering of the result set will be on that index, however, without that index, the order is undefined (basically, however it is stored in the file). But I agree with you, in database land, it makes no sense, because you have to define an ordering. LINQ to SQL gets around this by ordering by all the columns in the table if another order is not found (I believe, I have to test that part, but without a specific ordering, it orders on every column in the table). This is why I said what I said in parenthesis. I hate the fact that Take and Skip are exposed, and assume some kind of order for you when it doesn't make sense to not have ordering. It should be mandatory to provide an order when using any of these extensions. It's also why I said "assuming the ordering is the same on each of them", because LINQ to SQL will always produce a default ordering. Show quote > Well, there is no order that they HAVE to be specified in. Yes, they > >> they will produce different results. >> >> Say your query will produce the ordered set {1, 2, 3}. Let n = 1, >> m = 2. >> >> The first query: >> >> var query = query.Skip(n).Take(m); >> >> Will return the ordered set {2, 3}, while the second query: >> >> var query = query.Take(m).Skip(n); >> >> Will return the ordered set {2}. >> >> The reason for this is that in the first query, the Skip method >> skips one element, then takes the remaining two, while in the second >> query, the first two elements are taken, and then the first one is >> skipped. > > I tried it out to be sure, and indeed Linq to Sql generates two > different queries (which really hurt my eyes but that's another story). > The thing is though that on databases, people will use the take/skip > combination to page through a bigger resultset. However consulting the > manual for both shows no word whatsoever about the order in which these > two statements have to be specified. are usually used in conjunction, but it is not a requirement that they be used in conjunction. They can be specified in any order, which is what the OP's question was arising from. He wanted to know if the order in which the operations were applied would affect the outcome, which it definitely will. > I do understand the order, but it's a bit strange as well. For Well, it depends here on what operations we are talking about. You are > example, Linq apparently has no problem with an Order by placed in > front of a where, however there IS an order in take/skip which are used > combined as a paging mechanism. Of course, this follows from the specs > of both, but semantically, the intention of what people want to do, > e.g. to page, shouldn't require an order in the statements for paging, > if other elements in the query also don't really require an order (they > do, but that's whiped under the rug) making the error of assuming that Take and Skip are both sub parts of a larger page operation, and assuming that because other elements of the operation that can be applied (i.e. where) don't require order, these don't. But that's wrong. Like you said, in db-land, not applying an order results in an undefined order in the result set. If you are going to page off that, then the DB is completely within its rights to return a differently ordered result set between subsequent queries, and your paging results will simply be wrong. You would be depending on an implementation detail that the DB would consistently return the undefined order to you. The reason Take and Skip (should) require an order is because of this fact. Think of it. The definition of Take is "take the FIRST n elements". Skip is defined as "skip the NEXT n elements". Because of FIRST and NEXT, you require order. Without knowing the order, the concepts of FIRST and NEXT do not exist. -- Show quote- Nicholas Paldino [.NET/C# MVP] - mvp@spam.guard.caspershouse.com > > FB > -- > ------------------------------------------------------------------------ > Lead developer of LLBLGen Pro, the productive O/R mapper for .NET > LLBLGen Pro website: http://www.llblgen.com > My .NET blog: http://weblogs.asp.net/fbouma > Microsoft MVP (C#) > ------------------------------------------------------------------------ Nicholas Paldino [.NET/C# MVP] wrote:
Show quote > Frans, That's not defined. There's no ruleset how to translate a Linq query> > See inline: > > > > Andrus, > > > > >> Absolutely different results. > > > > Now that's unexpected. In database land, you can't get 2 different > > results. > > Not really, because to translate what is being done in LINQ to > database land, you would actually perform a query on the result set > of the previous query. The operations are not performed at the same > time. to SQL, on the contrary, it's often a struggle how to completely grasp the intention of the linq query with SQL. So, in the case of take->skip, true, you need multiple projections and in the case of skip -> take, you need less, perhaps none. However, in the case of a select statement which specifies BOTH at once, you can't get 2 different resultsets by specifying them in a different order, see Firebird's and PostgreSql's SELECT syntax for details. Firebird: SELECT FIRST m SKIP n * FROM ... PostgreSQL: SELECT * FROM ... LIMIT m OFFSET n MySql actually does have an order, but it has just 1 clear intention: you skip n rows and take m. You can swap them, but that doesn't matter for the statement, you can't specify take m, skip n from that m MySQL: SELECT * FROM ... LIMIT n, m Oh yes, I know... this is SQL and not linq. Let's talk about that for a second. A developer writing a linq query which is executed on the database has to understand the query WILL become SQL. The translation is far from trivial, so the intention of the linq query has to be represented by the SQL query as much as possible. The developer isn't helped by fuzzy linq syntaxis, as predicting what SQL is going to be generated is then even more difficult than it is now. Take is used for TOP if Skip isn't there. Is take used for TOP IF skip is there? No, it's not. Well, not always. If Skip is there and Take is there, it should be translated into a paging query (paging is skipping rows and taking the top n rows from what's left). If paging is performed 'TOP' makes no sense. So if Take is specified first, and then Skip, the query has to use a temp table to store the taken rows and then skip the first n rows from that. Otherwise it can't always be done: TOP can't always be added to a subquery: if you sort on a column not in the subquery, you can run into errors if TOP / DISTINCT is used in the subquery. > > > Assuming the ordering is the same on each of them (because Skip That's just luck, if I recall correctly, but I have to check BOL for> > > and Take make no sense without ordering, LINQ to SQL will create > > > an order for you, which irritates me to no end, but that's a > > > separate thread), > > > > Why? SELECT * FROM Table without ordering has no defined ordering, > > so using a limit + skip operator on that set results in a set of > > undefined rows, you'll never know what rows will be returned. > > Well, that's not completely true. If you are using a clustered index > in SQL Server, then the ordering of the result set will be on that > index, however, without that index, the order is undefined > (basically, however it is stored in the file). semantics on this. The SQL standard (not obeyed by anyone in full, admitted) describes this explicitly. I don't like to hammer on standard specs that much, but this one thing is essential to understand so people don't run into problems with unexpected results. The RDBMS can for example decide to return a part of the rowset before another part because it has that one already in memory. > But I agree with you, in database land, it makes no sense, because I think it orders on every column in the projection. If it would order> you have to define an ordering. LINQ to SQL gets around this by > ordering by all the columns in the table if another order is not > found (I believe, I have to test that part, but without a specific > ordering, it orders on every column in the table). on every column in the source of the projection, it can run into issues with DISTINCT where not every column of the source is in the projection. Show quote > > > they will produce different results. Though the intention is to page, i.e. first skip, then take. When> > > > >> Say your query will produce the ordered set {1, 2, 3}. Let n = 1, > > > m = 2. > > > > >> The first query: > > > > > > var query = query.Skip(n).Take(m); > > > > >> Will return the ordered set {2, 3}, while the second query: > > > > > > var query = query.Take(m).Skip(n); > > > > >> Will return the ordered set {2}. > > > > >> The reason for this is that in the first query, the Skip method > > > skips one element, then takes the remaining two, while in the > > > second query, the first two elements are taken, and then the > > > first one is skipped. > > > > I tried it out to be sure, and indeed Linq to Sql generates two > > different queries (which really hurt my eyes but that's another > > story). The thing is though that on databases, people will use the > > take/skip combination to page through a bigger resultset. However > > consulting the manual for both shows no word whatsoever about the > > order in which these two statements have to be specified. > > Well, there is no order that they HAVE to be specified in. Yes, they > are usually used in conjunction, but it is not a requirement that > they be used in conjunction. They can be specified in any order, > which is what the OP's question was arising from. He wanted to know > if the order in which the operations were applied would affect the > outcome, which it definitely will. would you first take and then skip? With Skip(n) and Take(m), isn't: set.Skip(n).Take(m) the same as set.Take(m+n).Skip(n) ? the net result is that you get a set of rows which is limited to a size x from a potentially bigger set with size y. In other words: specifying Take first and then Skip has no real value over specifying Skip first and then Take. Mind you: the intention is to get a resultset which has a fixed size and is a subset from a bigger set. As some databases offer you this in the SELECT statement, one could argue (and there are no rules for this, so why not!) that Skip and Take, IF specified both in the same projection scope, are translated into the parameters for the LIMIT/OFFSET etc. keywords of the SELECT statement of the projection scope they're specified on. Show quote > > I do understand the order, but it's a bit strange as well. For Why else would you want to use take and skip in the same projection> > example, Linq apparently has no problem with an Order by placed in > > front of a where, however there IS an order in take/skip which are > > used combined as a paging mechanism. Of course, this follows from > > the specs of both, but semantically, the intention of what people > > want to do, e.g. to page, shouldn't require an order in the > > statements for paging, if other elements in the query also don't > > really require an order (they do, but that's whiped under the rug) > > Well, it depends here on what operations we are talking about. You > are making the error of assuming that Take and Skip are both sub > parts of a larger page operation, and assuming that because other > elements of the operation that can be applied (i.e. where) don't > require order, these don't. scope? Their sole purpose is to limit the resultset to a fixed set and to specify where to start in the overall resultset for the subset to return. > But that's wrong. Like you said, in db-land, not applying an order 'Order' in my remark above was about skip first/take after that, not> results in an undefined order in the result set. If you are going to > page off that, then the DB is completely within its rights to return > a differently ordered result set between subsequent queries, and your > paging results will simply be wrong. You would be depending on an > implementation detail that the DB would consistently return the > undefined order to you. about an ORDER BY added to the query by the linq provider. Sorry about that confusion. > The reason Take and Skip (should) require an order is because of this Though I've shown above that you can use Take and Skip in any order to> fact. Think of it. The definition of Take is "take the FIRST n > elements". Skip is defined as "skip the NEXT n elements". Because of > FIRST and NEXT, you require order. Without knowing the order, the > concepts of FIRST and NEXT do not exist. obtain the same rows. The INTENTION of the developer is to get row n till n+m from a set of rows of size m+k, agreed? Why else would Take and Skip show up in the same projection scope. FB -- ------------------------------------------------------------------------ Lead developer of LLBLGen Pro, the productive O/R mapper for .NET LLBLGen Pro website: http://www.llblgen.com My .NET blog: http://weblogs.asp.net/fbouma Microsoft MVP (C#) ------------------------------------------------------------------------ Andrus <kobrule***@hot.ee> wrote:
> Are Linq-SQL methods commutative ? Different results - the first gives results (zero-based) n to n+m-1.> Should the following queries return same or different results ? > > var query = query.Skip(n).Take(m); > > var query = query.Take(m).Skip(n); The second gives results 0 to Min(n-1, m-1). -- Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet World class .NET training in the UK: http://iterativetraining.co.uk |
|||||||||||||||||||||||