Monday, June 29, 2009

Approaches to "UPSERT"

This week in the Database Programmer we look at something called an "UPSERT", the strange trick where an insert command may magically convert itself into an update if a row already exists with the provided key. This trick is very useful in a variety of cases. This week we will see its basic use, and next week we will see how the same idea can be used to materialize summary tables efficiently.

An UPSERT or ON DUPLICATE KEY...

The idea behind an UPSERT is simple. The client issues an INSERT command. If a row already exists with the given primary key, then instead of throwing a key violation error, it takes the non-key values and updates the row.

This is one of those strange (and very unusual) cases where MySQL actually supports something you will not find in all of the other more mature databases. So if you are using MySQL, you do not need to do anything special to make an UPSERT. You just add the term "ON DUPLICATE KEY UPDATE" to the INSERT statement:

insert into table (a,c,b) values (1,2,3)
    on duplicate key update
     b = 2,
     c = 3

The MySQL command gives you the flexibility to specify different operation on UPDATE versus INSERT, but with that flexibility comes the requirement that the UPDATE clause completely restates the operation.

With the MySQL command there are also various considerations for AUTO_INCREMENT columns and multiple unique keys. You can read more at the MySQL page for the INSERT ... ON DUPLICATE KEY UPDATE feature.

A Note About MS SQL Server 2008

MS SQL Server introduced something like UPSERT in SQL Server 2008. It uses the MERGE command, which is a bit hairy, check it out in this nice tutorial.

Coding a Simpler UPSERT

Let us say that we want a simpler UPSERT, where you do not have to mess with SQL Server's MERGE or rewrite the entire command as in MySQL. This can be done with triggers.

To illustrate, consider a shopping cart with a natural key of ORDER_ID and SKU. I want simple application code that does not have to figure out if it needs to do an INSERT or UPDATE, and can always happily do INSERTs, knowing they will be converted to updates if the line is already there. In other words, I want simple application code that just keeps issuing commands like this:

INSERT INTO ORDERLINES
       (order_id,sku,qty)
VALUES 
       (1234,'ABC',5)

We can accomplish this by a trigger. The trigger must occur before the action, and it must redirect the action to an UPDATE if necessary. Let us look at examples for MySQL, Postgres, and SQL Server.

A MySQL Trigger

Alas, MySQL giveth, and MySQL taketh away. You cannot code your own UPSERT in MySQL because of an extremely severe limitation in MySQL trigger rules. A MySQL trigger may not affect a row in a table different from the row originally affected by the command that fired the trigger. A MySQL trigger attempting to create a new row may not affect a different row.

Note: I may be wrong about this. This limitation has bitten me on several features that I would like to provide for MySQL. I am actually hoping this limitation will not apply for UPSERTs because the new row does not yet exist, but I have not had a chance yet to try.

A Postgres Trigger

The Postgres trigger example is pretty simple, hopefully the logic is self-explanatory. As with all code samples, I did this off the top of my head, you may need to fix a syntax error or two.

CREATE OR REPLACE FUNCTION orderlines_insert_before_F()
RETURNS TRIGGER
 AS $BODY$
DECLARE
    result INTEGER; 
BEGIN
    SET SEARCH_PATH TO PUBLIC;
    
    -- Find out if there is a row
    result = (select count(*) from orderlines
                where order_id = new.order_id
                  and sku      = new.sku
               )

    -- On the update branch, perform the update
    -- and then return NULL to prevent the 
    -- original insert from occurring
    IF result = 1 THEN
        UPDATE orderlines 
           SET qty = new.qty
         WHERE order_id = new.order_id
           AND sku      = new.sku;
           
        RETURN null;
    END IF;
    
    -- The default branch is to return "NEW" which
    -- causes the original INSERT to go forward
    RETURN new;

END; $BODY$
LANGUAGE 'plpgsql' SECURITY DEFINER;

-- That extremely annoying second command you always
-- need for Postgres triggers.
CREATE TRIGGER orderlines_insert_before_T
   before insert
   ON ORDERLINES
   FOR EACH ROW
   EXECUTE PROCEDURE orderlines_insert_before_F();

A SQL Server Trigger

SQL Server BEFORE INSERT triggers are significantly different from Postgres triggers. First of all, they operate at the statement level, so that you have a set of new rows instead of just one. Secondly, the trigger must itself contain an explicit INSERT command, or the INSERT never happens. All of this means our SQL Server example is quite a bit more verbose.

The basic logic of the SQL Server example is the same as the Postgres, with two additional complications. First, we must use a CURSOR to loop through the incoming rows. Second, we must explicitly code the INSERT operation for the case where it occurs. But if you can see past the cruft we get for all of that, the SQL Server exmple is doing the same thing:

CREATE TRIGGER upsource_insert_before
ON orderlines
INSTEAD OF insert
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @new_order_id int;
    DECLARE @new_sku      varchar(15);
    DECLARE @new_qty      int;
    DECLARE @result       int;

    DECLARE trig_ins_orderlines CURSOR FOR 
            SELECT * FROM inserted;
    OPEN trig_ins_orderlines;

    FETCH NEXT FROM trig_ins_orderlines
     INTO @new_order_id
         ,@new_sku
         ,@new_qty;

    WHILE @@Fetch_status = 0 
    BEGIN
        -- Find out if there is a row now
        SET @result = (SELECT count(*) from orderlines
                        WHERE order_id = @new_order_id
                          AND sku      = @new_sku
                      )
    
        IF @result = 1 
        BEGIN
            -- Since there is already a row, do an
            -- update
            UPDATE orderlines
               SET qty = @new_qty
             WHERE order_id = @new_order_id
               AND sku      = @new_sku;
        END
        ELSE
        BEGIN
            -- When there is no row, we insert it
            INSERT INTO orderlines 
                  (order_id,sku,qty)
            VALUES
                  (@new_order_id,@new_sku,@new_qty)
            UPDATE orderlines

        -- Pull the next row
        FETCH NEXT FROM trig_ins_orderlines
         INTO @new_order_id
             ,@new_sku
             ,@new_qty;

    END  -- Cursor iteration

    CLOSE trig_ins_orderlines;
    DEALLOCATE trig_ins_orderlines;

END

A Vague Uneasy Feeling

While the examples above are definitely cool and nifty, they ought to leave a certain nagging doubt in many programmers' minds. This doubt comes from the fact that an insert is not necessarily an insert anymore, which can lead to confusion. Just imagine the new programmer who has joined the team an is banging his head on his desk because he cannot figure out why his INSERTS are not working!

We can add a refinement to the process by making the function optional. Here is how we do it.

First, add a column to the ORDERLINES table called _UPSERT that is a char(1). Then modify the trigger so that the UPSERT behavior only occurs if the this column holds 'Y'. It is also extremely import to always set this value back to 'N' or NULL in the trigger, otherwise it will appear as 'Y' on subsequent INSERTS and it won't work properly.

So our new modified explicit upsert requires a SQL statement like this:

INSERT INTO ORDERLINES
       (_upsert,order_id,sku,qty)
VALUES
       ('Y',1234,'ABC',5)

Our trigger code needs only a very slight modification. Here is the Postgres example, the SQL Server example should be very easy to update as well:

   ...trigger declration and definition above
   IF new._upsert = 'Y'
      result = (SELECT.....);
      _upsert = 'N';
   ELSE
      result = 0;
   END IF;
   
   ...rest of trigger is the same

Conclusion

The UPSERT feature gives us simplified code and fewer round trips to the server. Without the UPSERT there are times when the application may have to query the server to find out if a row exists, and then issue either an UPDATE or an INSERT. With the UPSERT, one round trip is eliminated, and the check occurs much more efficiently inside of the server itself.

The downside to UPSERTs is that they can be confusing if some type of explicit control is not put onto them such as the _UPSERT column.

Next week we will see a concept similar to UPSERT used to efficiently create summary tables.

55 comments:

tforster said...

Thanks for the great post. A few times over the past years I've felt like I was cheating by combining an INSERT and UPDATE into a single sproc. I don't feel so bad now seeing that MySQL actually supports it. Unfortunately I use SQL Server 200x so I can't take advantage of that.

One approach I've used before is to explicitly DELETE the row and then perform an INSERT. If the row doesn't exist no exception is raised and the sproc continues with the INSERT regardless.

I'm sure this method must have some performance hits and of course it's useless if the primary key is an identity column.

I would be curious to know what sort of performance hit I'm incurring using this technique. Can you comment?

kovica said...

I don't know about other database engines, but in DB2 you can do that by using MERGE statement. If a row exists then you can do an update otherwise an insert.

KenDowns said...

@tfor: The UPSERT, as presented in the OP, is the most efficient way to do the operation. All other approaches perform less well. So the performance "hit" is actually a performance boost. This is because it is most efficient to do the check for the existing row in the server itself, inside the trigger. All application-based approaches require more round trips to the server, so it is they who take the hit.

As for identity columns, they are not indicated in all cases, though many programmers use them as if they were. That topic is too large for a comment, but I have several posts on the subject.

Finally, Check out the MySQL link in the OP for their code sample that works with an IDENTITY column.

KenDowns said...

@Kovica, MS SQL Server also supports MERGE as of SQL Server 2008.

tforster said...

@KenDowns, In my case everything takes place in the sproc. E.g. something like:

DELETE FROM MyTable WHERE id = @id
INSERT INTO MyTable (id, c1, c2) VALUES (@id, @v1, @v2)

SQL Server doesn't complain if I attempt to delete a non existent row so if the row already exists the effective behaviour is an update and if it didn't exist the effective behaviour is an insert.

The SQL code for this approach is dead simple so I'm wondering what the downsides are compared to the MERGE method? For small tables with infrequent updates I would suspect this would be a viable approach wouldn't it?

voretaq7 said...

Not to defend MySQL any more than I have to, but they support "INSERT ... ON DUPLICATE KEY UPDATE ..." syntax, which buys you an upsert without triggers.

Also not to bash Postgres any more than I have to, but they should really get around to supporting MERGE one of these days -- it's in SQL2003 (nudge nudge hint hint whine whine...)

antimatroid said...

Your SQL Server example is unnecessarily complex:

update schema.my_table set ... where x = @x and y = @y ... z = @z
if @@rowcount = 0 then
insert into schema.my_table (...) values (...)

KenDowns said...

@tfor: There are two possible issues with the insert/delete method, and the correct solution is context-dependent.

First, if you are deleting from a parent table, you ought to get a constraint violation on the delete.

Second, the Delete/Insert operation is two disk writes, as opposed to a read plus a single write, which is more efficient.

KenDowns said...

@animatroid: nice.

--DD said...

SQLite has the equally useful 'insert or replace' statement (see http://www.sqlite.org/lang_insert.html for details)

El Hombre Gris said...

There is also the non-standard REPLACE in MySQL, that works like an INSERT but deletes the conflicting row before inserting the new one.

http://dev.mysql.com/doc/refman/5.0/es/replace.html

Chad Crawford said...

@antimatroid- your example works if there is only a single row being inserted or if all the rows need to be either inserted or updated. If there are a mix of inserts and updates (which there might be if you are doing an INSERT INTO... SELECT... FROM), you will only update the existing and skip the inserts. There is still a way to avoid the cursor though using an inner and a left join:
UPDATE orderlines
SET qty = inserted.qty
FROM inserted
INNER JOIN orderlines ON orderlines.order_id = inserted.order_id
AND orderlines.sku = inserted.sku;

INSERT INTO orderlines (order_id,sku,qty)
SELECT order_id, sku, qty
FROM inserted
LEFT JOIN orderlines ON orderlines.order_id = inserted.order_id
AND orderlines.sku = inserted.sku;
where orderlines.order_id IS NULL
and orderlines.sku IS NULL

pgeo said...

Not to bash MySQL, but its syntax is at least strange since you have to repeat the values two times. The proof that it's prone to errors is the mistake in the first example in the article:

insert into table (a,c,b) values (1,2,3)
on duplicate key update
b = 2,
c = 3

It will insert value 2 to c and 3 to b, but it will update b to 2 and c to 3....

Patrick Wolf said...

Oracle also supports the MERGE command already for a long time.

Patrick

Robert Treat said...

@Michael The Postgres developers have agreed in principal to support merge, there is even a mapped out syntax and many details of the implementation have been discussed; all that is need now is for some ambitious C coder to show up and start working on a patch. Know anyone? You can find the details on the Postgres TODO list, just scroll down to "Add SQL-standard MERGE/REPLACE/UPSERT command"

Anonymous said...

I do an Update, Delete, Insert. This is necessary because we use staging tables of tens of thousands of rows. So you have to do something like this.

-- Update existing rows
UPDATE Dest
SET Col1 = s.Col1,
Col2 = s.Col2
FROM Dest d INNER JOIN Source s
ON d.id = s.d_id

-- Delete existing rows from the
-- source table
DELETE FROM Source
FROM Source s INNER JOIN Dest d
ON d.id = s.d_id

-- Insert the remaining.
INSERT INTO Dest (Col1, Col2)
SELECT Col1, Col2
FROM Source

KenDowns said...

Anonymous: your comment and approach illustrate a basic reality of databases and programming: rules of thumb are great, but the particulars of the situation have final say.

In the blog I try to provide examples and techniques, but that never means a particular technique is the *only* way, there are always other ways to do it.

Anonymous said...

Wherefore art thou, Ken? We're dying for more content!

KenDowns said...

Anonymous: Thanks for the compliment, this morning I'm doing my first post after about 18 months.

David Underhill said...

Thanks for the PostgreSQL trigger Ken. It is still relevant today (a year and a half later!) since UPSERT is still on the PostgreSQL Todo list.

For those using Python, check out the generic PostgreSQL UPSERT method I wrote. I find it more convenient than writing a trigger like Ken's for every table.

Unknown said...

Most ideal approach to Solve MySQL Max User Connection Error through MySQL Technical Support
The MySQL database is both open source and simple to utilize yet the vast majority of the clients confronting issue when they execute embed as well as refresh of a large number of columns consistently and around then they need to confront this specialized hiccups. Well! We encourage you to tackle this issue through MySQL Remote Support or MySQL Remote Service. We give the help which flawlessly meets the specialized and operational administration desires. So rapidly take our help and investigate the best help with our specialists.
For More Info: https://cognegicsystems.com/
Contact Number: 1-800-450-8670
Email Address- info@cognegicsystems.com
Company’s Address- 507 Copper Square Drive Bethel Connecticut (USA) 06801

James Zicrov said...

Thank you so much for providing information about SSIS Upsert and its uses.

SSIS upsert

Tuckerjackk said...

If you want to protect your system online or locally from any unforeseen events   Norton.com/Setup   is is a must have software in your PC or Mac.
Install your office.com/Setup by downloading now. Microsoft Office applications are a complete package if multiple integrations like Microsoft Office, Microsoft Power Point, Microsoft Excel etc. All of these programs have their own features and speciality and are used in a lot of industries, small organizations and big organizations.

Tuckerjackk said...

McAfee.com/Activate Since the world is developing each day with new computerized advances, digital dangers, malware, information, and harming diseases have additionally turned out to be increasingly more progressed with every day. These digital contamination's harm a gadget or documents in different ways. McAfee.com/Activate  follows the concept of refine your system, you don’t need to worry about data loss or system failure because of the malfunctions. McAfee.com/Activate   works finely on every system including android and ios and supports device like, computer, laptops, mobile phones and tablets.  

Maria Garcia said...

I am very thankfull to you for sharing this fantastic article , I appereciate your work .

mcafee.com/activate , mcafee.com/activate , mcafee.com/activate

Mike Jones said...

It read your article every thing is clear , Thanks for sharing.

office.com/setup , mcafee.com/activate

Jatin Sethi said...

good morning wishes images
happy anniversary didi jiju

McAfee Pro said...

McAfee total protection is a popular product worldwide to secure your computer and mobile devices from viruses and malware protection. Anyone can download the world’s most trusted antivirus from McAfee's official website and login in a few simple steps. I am also using this software for the last 5 years and getting live support with easy access to Mcafee Account Login to solve all issues myself. Visit to this(https://mcafeepro.com/) site for getting an instant solution.

ethanbunch said...

McAfee Total Protection provides one easy-to-use subscription with a variety of antivirus and privacy features to protect you and your family's PCs, Mac computers, and mobile devices.For more information visit our website! https://mcafeepro.com/

Getsupport Pro said...

Getsupportpro is a kind of digital directory, Provides the all type of local and global verified business information, toll-free numbers, and customer services in one step.

richard bryan said...
This comment has been removed by the author.
richard bryan said...

at SynergisticIT offer the best 5 minute aws test

erectile dysfunction remedies said...

I have read so many content regarding the blogger lovers but this piece of writing is in fact a nice post, keep it up.

ed medication said...

Hi, I do believe this is a great website. I stumbledupon it ;) I'm going to revisit yet again since i have book marked it. Money and freedom is the greatest way to change, may you be rich and continue to help others.

David Martin said...

You might be aware of using the Paypal services and it is not wrong to say that Paypal is one of the best, user-friendly payment platforms. Moreover, it facilitates payments that take place via online transfers between different parties. The PayPal credit login platform allows every individual to create an account on its platform that links the user to its account.

For more information visit site :- Paypal Login
Spectrum login
Quickbooks Login
Paypal Login
Xoom Login
Amazon.com/mytv
Venmo Login

techtalk said...

Get your Microsoft 365 Apps, Download/install or reinstall Office 2019.Enter Enter 25 digit office product key at office.com/setup or Microsoft365.com/setup.

www.office.com/setup | enter office setup product key | office setup | www.norton.com/setup

OFFICE.COM/SETUP said...


The web address for redeeming an office product key and downloading, installing, and activating an office product setup is Office.com/setup.
We've already written a few posts on how to fix these problems and easily install Office on your PC/Mac by simply clicking the below links:

www.office.com/setup
office.com/setup
office.com/setup
office.com/setup login

office com setup said...

There are products for students, professionals, home users as well as business and these can be easily installed/deployed with the help of the Office product key. Once you decide to purchase the Office product suite of your choice, it becomes easier for you to set up the product on your device/s. So, let’s walk you through the complete step-by-step procedure to set up Office.
office.com/setup
www.office.com/setup

Microsoft 365 is a collaboration cloud that allows you to follow your passion while still running your company. Microsoft 365 is more than just Word, Excel, and PowerPoint; it combines best-in-class productivity applications with powerful cloud services, mobile management, and enhanced security in one seamless experience.
Get Microsoft365 Home & Student installed through below links:
office.com/setup home and student 2019

office.com/setup home and business 2019


Microsoft Office Works said...

How to Get Microsoft Office works for Windows?
Click Below Links:
office.com/setup
www.office.com/setup
buy microsoft365
msofficeworks
microsoft365.com

setup.office.com said...

Hey Everyone,Are you getting errors when you try to activate your Microsoft office activation subscription?
Then read on to get step by step guide to solve Microsoft office activation errors. Protect your windows, Mac, and Android devices. Get and easily run Anti Viruses and Learn how to download, install, and activate your Office setup on your computer and other devices by the easiest steps. Just visit our website and Troubleshoot the most commonly occuring errors for Microsoft office365 by the easiest steps. Install, and activate your Office setup on your computer and other devices.

office.com/setup
www.office.com/setup

Anonymous said...

dqqw qw
qwdwq qwd wqdqw
wqdqw
onlyfans premium account generator

captino1253985 said...

Funny shirt
Funny Shirt quotes

Paypal Login said...

Thanks for sharing that information, this is great full to us. Please follow this link to know more about PayPal Account Login.
PayPal Account Login is one of the best, user-friendly payment platforms. Moreover, it facilitates payments that take place via online transfers between different parties.

Mark Watson said...

Open the browser and type in Spectrum router’s IP address in the address bar. You can find it at the back of the router. A new web page will appear for spectrum router login. When you log in the first time, you can do so by using the standard username and password that it’s set to. Once you’ve logged in, however, you can change these settings to make the password harder and more secure.

Janny kayle said...

Are you getting errors when you try to activate your Microsoft office activation subscription? Then read on to get step by step guide to solve Microsoft office activation errors. Install,

and activate your Office setup on your computer and other devices.
office.com/setup
www.office.com/setup
office.com/setup
www.office.com/setup

Janny kayle said...

Microsoft 365 provides more security than a basic Office 365 subscription, with features such as Enhanced Security Features, wireless cleaning, and app safety and security that prevents

duplication.. Instead of using Microsoft Word and Excel, Microsoft 365 offers a range of useful features.
microsoft365.com/setup
office.com/renew
office365.com/setup
microsoft365.com/setup
office.com/renew
office365.com/setup

Janny kayle said...

Home and Student 2019 is designed for students and families who choose to use classic Office software such as Excel, Word, and PowerPoint on Windows and Mac. It’s a one-time

purchase that can be used at home or in the classroom on one PC or Mac.
office.com/setup home and student 2019
office.com/setup home and business 2019
office.com/setup home and student 2019
office.com/setup home and business 2019

Janny kayle said...

Visit our website and Troubleshoot the most commonly occuring errors of Microsoft office by the easiest steps. Install, and activate your Office setup on your computer and other

devices.
office.com/setup
www.office.com/setup
office.com/setup
www.office.com/setup

JOE MILLER said...


Thank you for this brief explanation and very nice information. This post gives truly quality information. Hope to see you again. I find that this post is really amazing.

Microsoft 365 offers excellent interoperability, allowing users to switch between operating systems as needed. It works with a variety of operating systems, including Windows and Mac. It's wonderful and unique because of its interactive and inventive usage, accessibility, and adaptability.
Microsoft365.com/setup
Microsoft365.com/setup
Microsoft365.com/setup

Unknown said...

On the web address bar, enter the https://ij.start.cannon URL and press Enter key. The http://ij.start.cannon is a Canon printer drivers download site that offers the latest software and drivers to download on your operating system. https //ij.start.cannon

If you don’t get any cable to connect MG2522 to your PC or laptop, ensure you purchase a USB 2.0 A/B cable. First, visit canon.com/ijsetup mg2522 | canon.com/ijsetup/mg2522

크레이지슬롯 said...

Your writing is perfect and complete. 크레이지슬롯 However, I think it will be more wonderful if your post includes additional topics that I am thinking of. I have a lot of posts on my site similar to your topic. Would you like to visit once?

baccaratsite.top said...

바카라사이트
Your web site is great. I was extremely touched by this post. I'm so pleased to locate the write-up I have actually been seeking for a long period of time. I covet the amount of individuals sympathize with your writing. You discover a great deal of expertise from my writing.

Anonymous said...

Amazing website, Love it. Great work done. Nice website. Love it. This is really nice.
locast.org/activate
hbomax/tvsignin
disneyplus.com/begin
showtimeanytime.com/activate

123.hp.com said...

paypal login is one of the best, user-friendly payment platforms. Moreover, it facilitates payments that take place via online transfers between different parties.
paypal login
You can also go through this latest article ij.start.cannon that helps you determine how to download canon Printer drivers.

Anonymous said...

تسليك مجاري بالدمام
تسليك مجاري بالخبر
تسليك مجاري بالقصيم
تسليك مجاري بالاحساء
تسليك مجاري بالجبيل
تسليك مجاري بالقطيف
تسليك مجاري بابها
تسليك مجارى بخميس مشيط