<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Data Deduplication in Relational Databases</title>
	<atom:link href="http://pavel.surmenok.com/2015/05/25/data-deduplication-in-relational-databases/feed/" rel="self" type="application/rss+xml" />
	<link>http://pavel.surmenok.com/2015/05/25/data-deduplication-in-relational-databases/</link>
	<description></description>
	<lastBuildDate>Sat, 13 Apr 2019 17:14:00 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
	<item>
		<title>By: surmenok</title>
		<link>http://pavel.surmenok.com/2015/05/25/data-deduplication-in-relational-databases/#comment-48</link>
		<dc:creator>surmenok</dc:creator>
		<pubDate>Wed, 27 May 2015 04:23:00 +0000</pubDate>
		<guid isPermaLink="false">http://pavel.surmenok.com/?p=129#comment-48</guid>
		<description><![CDATA[Performance can be unpredictable, agreed. CTEs are often used to get SQL Server to choose a better query plan. But this is a very simple case; usually a query has several CTEs chained together.
Thanks for the idea of using max(PaymentID).
Yes, I understand that IDs must increase over time; that&#039;s why I made it an IDENTITY field in the example. In some cases, when an application inserts records in the wrong order, you have to use something else, perhaps some kind of timestamp.]]></description>
		<content:encoded><![CDATA[<p>Performance can be unpredictable, agreed. CTEs are often used to get SQL Server to choose a better query plan. But this is a very simple case; usually a query has several CTEs chained together.<br />
Thanks for the idea of using max(PaymentID).<br />
Yes, I understand that IDs must increase over time; that&#8217;s why I made it an IDENTITY field in the example. In some cases, when an application inserts records in the wrong order, you have to use something else, perhaps some kind of timestamp.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Evgeny Hramov</title>
		<link>http://pavel.surmenok.com/2015/05/25/data-deduplication-in-relational-databases/#comment-47</link>
		<dc:creator>Evgeny Hramov</dc:creator>
		<pubDate>Wed, 27 May 2015 04:02:00 +0000</pubDate>
		<guid isPermaLink="false">http://pavel.surmenok.com/?p=129#comment-47</guid>
		<description><![CDATA[CTEs are strange :) In some cases they are fast, in others very slow. And if you want to join a CTE to itself twice, it may be a bad idea for performance. You can use the simpler

SELECT TransactionID, MAX(PaymentID) AS m_id
  FROM Payment
  GROUP BY TransactionID

to select the *last* records. And of course you know that IDs don&#039;t have to go strictly 1, 2, 3, 4... and that a bigger ID doesn&#039;t necessarily mean a later record.]]></description>
		<content:encoded><![CDATA[<p>CTEs are strange :) In some cases they are fast, in others very slow. And if you want to join a CTE to itself twice, it may be a bad idea for performance. You can use the simpler</p>
<p>SELECT TransactionID, MAX(PaymentID) AS m_id<br />
  FROM Payment<br />
  GROUP BY TransactionID</p>
<p>to select the *last* records. And of course you know that IDs don&#8217;t have to go strictly 1, 2, 3, 4&#8230; and that a bigger ID doesn&#8217;t necessarily mean a later record.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
