Thursday, May 31, 2012

SSIS Error - Invalid Number ORA-01722

After 5 years,  a select statement on a Oracle 10 database starts to fail...with the error message ORA-01722: Invalid Number.

Hum, strange error message and not really usefull.

I read lots of post and sometimes it was due to an issue with the connector (http://www.attunity.com/forums/microsoft-ssis-oracle-connector/known-limitation-number-fields-1305.html), due  to datatype mismatch (http://social.msdn.microsoft.com/Forums/en/sqlintegrationservices/thread/9acb1c36-bfaf-40eb-bb6e-3a91eeea96ec) or due to other reasons.

In my case, the issue cames from a TO_NUMBER casting that we did to convert a string datatype to a number...I know it'sstupid to create keys as string but I'm not responsible of that ;-)

So let's continue the explanation...first I started to execute the query by removing all the selected fields.
The query ran fine.
Than I added, one by one the selected fields till the error occured again. Like this I was able to identify the column that caused the issue.

Second I remarked that I did a conversion using TO_NUMBER...May it be possible that some values in the source database are not numbers?

I executed again the query without converting the values and was able then to see that one row had the value 'PHONECALL' as key number! Very funny. So I have identified the bug.

Now go to the solution...

1) The workaround: adding a selection criteria in the where clause that avoid selecting rows where the values length is smaller than the expected length.
2) The solution: Asking to the DBA to create a function that detect if a value is a number or not like here follow:

CREATE OR REPLACE FUNCTION is_number
( p_str IN VARCHAR2 )  
RETURN VARCHAR2 IS l_num NUMBER;
BEGIN  
l_num := to_number( p_str );  
RETURN 'Y';
EXCEPTION      
RETURN 'N';
END is_number;

Wednesday, May 23, 2012

SSIS vs. SQL Stored Procedure

Recently I was asking by the customer to redesign a sequence of stored procedures that were used to load and transform data. This because the performance was not as expected due to the increasing of the mount of data.

Here the observations I will share with you:

  • Use SSIS to extract and load data from one or mutliple sources to one destination or multiple destinations.
  • Use a stored procedure to update the data instead of using e.g. a SCD component.
  • Use SQL Statement for grouping, filtering or sorting a query result instead of e.g. a Sort component
  • If the data to join comes from different sources then extract and load first the data in tables of the same e.g Staging area and then use a SQL Select statement where the join conditions will be indexed.
A right indexation and statistics will be the key of success!

Thursday, March 29, 2012

How To: Run a package having an Excel connector in BIDS 2008 and above 64bits

Hi,

What do you have to change to be able to execute a package where the error message says:

Error: 0xC020801C at Data Flow Task, Excel Source [1]: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager "Excel Connection Manager" failed with error code 0xC0202009. There may be error messages posted before this with more information on why the AcquireConnection method call failed.


?

Most of time, Excel is installed as a 32bits software but you try to execute a package in a 64bits BIDS installation.

To avoid this stupid error message, you have set the Run64BitRuntime property to false as shown in the below picture:



Wednesday, May 12, 2010

SSIS - Tinyint datatype issue

The stage I prefer is the design of the database. Using the most appropriate datatype, creating triggers, indexes, etc. I like that!

Recently I encountered an issue with SSIS where a field having the TINYINT SQL datatype was converted by SSIS in unsigned unicode data type. Of course, the destination field and this field did not match anymore.

The solution is to use a Derived Column component to convert back the data type to the signed unicode data type.

Thanks to the post of Simons, I was able to find the workaround:
http://sqlblogcasts.com/blogs/simons/archive/2006/02/24/SSIS-data-flows-and-tinyint.aspx

Wednesday, January 27, 2010

Time dimension in SSAS

In SSAS 2005 and 2008, it exists 4 types of built-in time dimensions. All are based on the server time.

The problem with these server time dimension is the flexibility. Indeed no way to customize the members name or change the name of these. You will always have something like 'Friday 1 January 2010' as day.



To have a more usable time dimension, you will have to create a physical time dimension and import it into your datasource view.



This time dimension will contain then the naming you want.



Here the code I built to fill in my Calendar table which is the source table to my time dimension.



--Creation of the table

CREATE TABLE [dbo].[Dim_Calendar](
[DateKey] [int] NOT NULL, --Surrogate key
[FullDate] [datetime] NULL, --DD/MM/YYYY 00:00:00.000
[WeekNumInYear] [tinyint] NULL, -- The order number of the week in the year (gregorian calendar)
[Month] [tinyint] NULL, -- Month number (1-->12)
[MonthAbbrev] [char](3) NULL, -- Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Spe,Oct,Nov,Dec
[MonthBeginDateKey] [int] NULL, -- The surrogate Key of the first day in the month
[Quarter] [tinyint] NULL, -- Quarter number (1,2,3,4)
[QuarterName] [char](2) NULL, -- Q1,Q2,Q3,Q4
[QuarterYear] [char](7) NULL, -- Q1 2009,Q2 2009,etc.
[Year] [smallint] NULL, -- Year
[WeekNumOverall] [smallint] NULL, -- key identifier of the week
[WeekName_W] [char](4) NULL,
[ISOWeek] [smallint] NULL,
CONSTRAINT [PK_Dim_Calendar] PRIMARY KEY NONCLUSTERED
(
[DateKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

-- Create function to get the ISO Week
CREATE FUNCTION [dbo].[fn_Get_ISO_Week_Number](

@date DATETIME )

RETURNS INT

AS

BEGIN

/***



The code here is from the article http://www.sqlservercentral.com/articles/Advanced+Querying/onthetrailoftheisoweek/1675/ by Christoffer Hedgate 20 Jan 2005.

Note, change 01-04 in Christoffer's article because dates need to be passed as YYYYMMDD in SQL Server to avoid any inconsistency.



****/



RETURN
( CASE

-- Exception where @date is part of week 52 (or 53) of the previous year

WHEN @date < CASE (DATEPART(dw, CAST(YEAR(@date) AS CHAR(4)) + '0104') + @@DATEFIRST - 1) % 7

WHEN 1 THEN CAST(YEAR(@date) AS CHAR(4)) + '0104'

WHEN 2 THEN DATEADD(d, -1, CAST(YEAR(@date) AS CHAR(4)) + '0104')

WHEN 3 THEN DATEADD(d, -2, CAST(YEAR(@date) AS CHAR(4)) + '0104')

WHEN 4 THEN DATEADD(d, -3, CAST(YEAR(@date) AS CHAR(4)) + '0104')

WHEN 5 THEN DATEADD(d, -4, CAST(YEAR(@date) AS CHAR(4)) + '0104')

WHEN 6 THEN DATEADD(d, -5, CAST(YEAR(@date) AS CHAR(4)) + '0104')

ELSE DATEADD(d, -6, CAST(YEAR(@date) AS CHAR(4)) + '0104')

END

THEN

(DATEDIFF(d,

CASE (DATEPART(dw, CAST(YEAR(@date) - 1 AS CHAR(4)) + '0104') + @@DATEFIRST - 1) % 7

WHEN 1 THEN CAST(YEAR(@date) - 1 AS CHAR(4)) + '0104'

WHEN 2 THEN DATEADD(d, -1, CAST(YEAR(@date) - 1 AS CHAR(4)) + '0104')

WHEN 3 THEN DATEADD(d, -2, CAST(YEAR(@date) - 1 AS CHAR(4)) + '0104')

WHEN 4 THEN DATEADD(d, -3, CAST(YEAR(@date) - 1 AS CHAR(4)) + '0104')

WHEN 5 THEN DATEADD(d, -4, CAST(YEAR(@date) - 1 AS CHAR(4)) + '0104')

WHEN 6 THEN DATEADD(d, -5, CAST(YEAR(@date) - 1 AS CHAR(4)) + '0104')

ELSE DATEADD(d, -6, CAST(YEAR(@date) - 1 AS CHAR(4)) + '0104')

END,

@date

) / 7) + 1



-- Exception where @date is part of week 1 of the following year

WHEN @date >= CASE (DATEPART(dw, CAST(YEAR(@date) + 1 AS CHAR(4)) + '0104') + @@DATEFIRST - 1) % 7

WHEN 1 THEN CAST(YEAR(@date) + 1 AS CHAR(4)) + '0104'

WHEN 2 THEN DATEADD(d, -1, CAST(YEAR(@date) + 1 AS CHAR(4)) + '0104')

WHEN 3 THEN DATEADD(d, -2, CAST(YEAR(@date) + 1 AS CHAR(4)) + '0104')

WHEN 4 THEN DATEADD(d, -3, CAST(YEAR(@date) + 1 AS CHAR(4)) + '0104')

WHEN 5 THEN DATEADD(d, -4, CAST(YEAR(@date) + 1 AS CHAR(4)) + '0104')

WHEN 6 THEN DATEADD(d, -5, CAST(YEAR(@date) + 1 AS CHAR(4)) + '0104')

ELSE DATEADD(d, -6, CAST(YEAR(@date) + 1 AS CHAR(4)) + '0104')

END

THEN 1



ELSE

-- Calculate the ISO week number for all dates that are not part of the exceptions above

(DATEDIFF(d,

CASE (DATEPART(dw, CAST(YEAR(@date) AS CHAR(4)) + '0104') + @@DATEFIRST - 1) % 7

WHEN 1 THEN CAST(YEAR(@date) AS CHAR(4)) + '0104'

WHEN 2 THEN DATEADD(d, -1, CAST(YEAR(@date) AS CHAR(4)) + '0104')

WHEN 3 THEN DATEADD(d, -2, CAST(YEAR(@date) AS CHAR(4)) + '0104')

WHEN 4 THEN DATEADD(d, -3, CAST(YEAR(@date) AS CHAR(4)) + '0104')

WHEN 5 THEN DATEADD(d, -4, CAST(YEAR(@date) AS CHAR(4)) + '0104')

WHEN 6 THEN DATEADD(d, -5, CAST(YEAR(@date) AS CHAR(4)) + '0104')

ELSE DATEADD(d, -6, CAST(YEAR(@date) AS CHAR(4)) + '0104')

END,

@date

) / 7) + 1

END

)

END


-- Create the stored procedure to fill the calendar table
CREATE PROCEDURE [dbo].[Dim_Calendar_Generation]

@StartDate Datetime,
@EndDate Datetime
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;


SET DATEFIRST 1

DECLARE
@DateKey int,
@WeekNumInYear tinyint,
@Month tinyint,
@MonthAbbrev char(3),
@MonthBeginDateKey int,
@Quarter tinyint,
@QuarterName char(2),
@QuarterYear char(7),
@Year smallint,
@WeekNumOverall smallint ,
@WeekName_W char(4),
@ISOWeek smallint

WHILE @StartDate <= @EndDate
BEGIN
SET @DateKey = CASE
WHEN (SELECT COUNT(*) FROM Dim_Calendar) > 0) THEN (SELECT MAX(DateKey) FROM Dim_Calendar) + 1
ELSE 1 -- Dim_calendar is empty
END
SET @WeekNumInYear = DATEPART(week,@StartDate)
SET @Month = MONTH(@StartDate)
SET @MonthAbbrev = SUBSTRING(DATENAME(month,@StartDate),1,3)
SET @MonthBeginDateKey =
CASE
WHEN
(
SELECT COUNT(MonthBeginDateKey)
FROM dbo.Dim_Calendar
WHERE MONTH(@StartDate) = MONTH(FullDate)
AND YEAR(@StartDate) = YEAR(FullDate)
AND DAY(FullDate) = 1
) > 0
THEN
(
SELECT DISTINCT MonthBeginDateKey
FROM dbo.Dim_Calendar
WHERE MONTH(@StartDate) = MONTH(FullDate)
AND YEAR(@StartDate) = YEAR(FullDate)
AND DAY(FullDate) = 1
) -- @StartDate is not the first date in the month
ELSE @DateKey -- @StartDate is the first day of the month
END

SET @Quarter = DATEPART(quarter,@StartDate)
SET @QuarterName = 'Q' + DATENAME(quarter,@StartDate)
SET @QuarterYear = 'Q' + DATENAME(quarter,@StartDate) + ' ' + CAST(YEAR(@StartDate) AS VARCHAR(4))

SET @Year = YEAR(@StartDate)
SET @WeekNumOverall =
CASE
WHEN
(
SELECT COUNT(WeekNumOverall)
FROM Dim_Calendar
WHERE YEAR(@StartDate) = YEAR(FullDate)
AND MONTH(@StartDate) = MONTH(FullDate)
AND @WeekNumInYear = WeekNumInYear
) > 0
THEN
(
SELECT DISTINCT WeekNumOverall
FROM Dim_Calendar
WHERE YEAR(@StartDate) = YEAR(FullDate)
AND MONTH(@StartDate) = MONTH(FullDate)
AND @WeekNumInYear = WeekNumInYear
)
WHEN (SELECT COUNT(*)FROM Dim_Calendar) = 0 THEN 1 -- Dim_Calendar is empty
ELSE (SELECT MAX(WeekNumOverall) FROM Dim_Calendar) + 1 -- First week of the month
END
SET @WeekName =
CASE
WHEN DATEPART(week,@StartDate) < 10
THEN 'W 0'+ CAST(@WeekNumInYear AS VARCHAR(4))
ELSE 'W ' + CAST(@WeekNumInYear AS VARCHAR(4))
END

SET @ISOWeek = (SELECT dbo.fn_Get_ISO_Week_Number(@StartDate))

INSERT INTO Dim_Calendar
(
[DateKey]
,[FullDate]
,[WeekNumInYear]
,[Month]
,[MonthAbbrev]
,[MonthBeginDateKey]
,[Quarter]
,[QuarterName]
,[QuarterYear]
,[Year]
,[WeekNumOverall]
,[WeekName_W]
,[ISOWeek]
)
VALUES
(
@DateKey
,@StartDate
,@WeekNumInYear
,@Month
,@MonthAbbrev
,@MonthBeginDateKey
,@Quarter
,@QuarterName
,@QuarterYear
,@Year
,@WeekNumOverall
,@WeekName_W
,@ISOWeek
)

SET @StartDate = @StartDate + 1
END

END

Wednesday, September 2, 2009

Get the date without the time

PL/SQL
SELECT TRUNC(SYSDATE) --That should get you today's date


T-SQL
SELECT DATEADD(d,DATEDIFF(d,0,GETDATE()),0) --That should get you today's date

SSIS - Kimball SCD Component vs. SCD Component

Some of you certainly know the performance issue of using the Slowly Changing Dimension (SCD) SSIS component.
With a small or medium amount of records, this component works fine but with a bigger amount of data, I’m speaking here about more than 10000 records (which is not huge J), this component “sucks”.

Here at Mobistar I encountered this kind of issue because I had to import more or less 80000 records per day.

By using the SCD component, after 40 minutes I decided to stop the loading and starting to find another solution.
By using the Kimball SCD component, the loading takes a bit more than 2 minutes.

The Kimball SCD component receives input from 2 sorted sources, compares the data and updates, inserts or deletes the records into the destination table. Furthermore, the Kimball SCD applies an insert or update against all rows at the same time. The normal SCD applies the update or the insert against a single row at the time. This is why the normal SCD is slower than the Kimball SCD.

I guess this component could interesting you a lot in your future developments.

Feel free to send me your experience with this component.

http://kimballscd.codeplex.com/

Kind regards,
Jérémie
PS: you can also have a look to http://www.sqlbi.eu/Projects/TableDifference/tabid/74/language/en-US/Default.aspx for information about another improved SCD component