Friday 6 September 2013

Migrating from SQL Server to Redshift

An overview of the work required to migrate our large SQL Server data warehouse to Amazon Redshift. We faced and resolved a number of issues, particularly around the different Unicodes required by each database.


  1. Thanks for the help. We are also moving from Sql Server to Redshift. Have you changed you export process since the ACCEPTINVCHARS option has been added to the COPY command?

  2. Hi Brock,

    I'd like to think I had some influence on the addition of ACCEPTINVCHARS! I'm sure others were bugging them about it though.

    It certainly resolves my issue with 0xBFEFEF but I haven't tested NUL or DEL chars as my SQLCMD extracts don't contain them.

    Also, FYI, they've added support for LZO compressed sources so I'm experimenting with that to see if it speeds up the extracts at all.


  3. Any reason why you did not consider Vectorwise or Paraccel?

    Full disclosure: I work for Actian.

  4. Hassan,

    Great question! Generally the answer to both is that I'm consulting for a start-up and we need to move very quickly. Specifically we need to understand the product cost upfront and conduct a POC without wasting a lot of time on sales 'engagement'.

    ParAccel was not a part of Actian when we did our initial evaluations. I got the clear impression that they were focussed on larger accounts and larger data sizes than ours.

    I did download VectorWise but found it very difficult to get it going properly on my server (Windows sadly). When I did get it running I found the Ingres tools amazingly bad (they seem like Delphi apps from the 90s). Plus the database does not (AFAICT) do any kind of compression. Plus there is no clearly posted pricing data.

    Redshift gives us a few things that a very attractive: decent (not amazing) performance, Postgres compatibility, clear pricing, the ability to scale up in small increments, etc. Neither VectoWise nor ParAccel nor any other Actan products is available on the AWS Marketplace (AFAIK). I honestly can't imagine why that is. Even the lumbering giant SAP has got their HANA database on there!

    Over the medium term my feeling is that Hadoop is eating the database market and the first course is MPP analytic databases. I hope Actian is working on something that runs in Hadoop (without requiring customer SerDe) as a huge priority.

  5. Awesome info.. thanks for sharing this.

    Here is another link incase some one looking step-by-step instructions


Disqus for @joeharris76