Sunday, March 25, 2012

A Word about Meta-data, pass through columns and Derived columns

Here's another one of my bitchfest about stuff which annoy the *** out of me in SSIS (and no such problems in DTS):

Do you ever wonder how easy it was to set up text file to db transform in DTS - I had no problems at all. In SSIS - 1 spent half a day trying to figure out how to get proper column data types for text file - OF Course MS was brilliant enough to add "Suggest Types" feature to text file connection manager - BUT guess what - it sample ONLY 1000 rows - so I tried to change that number to 50000 and clicked ok - BUT ms changed it to 1000 without me noticing it - SO NO WONDER later on some of datatypes did not match. And boy what a fun it is to change the source columns after you have created a few transforms.

This s**hit just breaks... So a word about Derived Columns - pretty useful feature heh? ITs not f***ing useful if it DELETES SOME of the Code itself after there have been changes in dataflow. I cant say how pissed off im about that SSIS went ahead and deleted columns from flow & messed up derived columns just because the lineageIDs dont match.

Meta-data - it would be useful if you could change it and refresh it - im just sick and tired of it that it shows warnings and errors when there's nothing wrong - so after a change i need to doubleclick all my transforms so that those red & yellow boxes would disappear.

Oh and y I passionately dislike Derived columns - so you create new fields based on some data - you do some stuff - combine multiple columns to one, but you have no way saying remove the columns from the pipeline. Y you need it - well if you have 50K + rows with 30+ columns then its EXTRA useless memory overhead for your package.

Hopefully one day I will understand how SSIS works (not an ez task I say) - I might be able to spend more time on development and less time on my bitchfest - UNTIL then --> Another Day - Another Hassle with SSISOh - i also forgot - trying to get to columns to appear in the right order is no picnic either. If you are dealing with 80+ you would like them to always appear on same order (alphabetic or otherwise meaningful) but SSIS seems to shuffle the order also every now and then|||

TheViewMaster wrote:

Here's another one of my bitchfest about stuff which annoy the *** out of me in SSIS (and no such problems in DTS):
Do you ever wonder how easy it was to set up text file to db transform in DTS - I had no problems at all. In SSIS - 1 spent half a day trying to figure out how to get proper column data types for text file - OF Course MS was brilliant enough to add "Suggest Types" feature to text file connection manager - BUT guess what - it sample ONLY 1000 rows - so I tried to change that number to 50000 and clicked ok - BUT ms changed it to 1000 without me noticing it - SO NO WONDER later on some of datatypes did not match. And boy what a fun it is to change the source columns after you have created a few transforms.

If you're going to rely on a computer to "guess" what the column data-types should be then you deserve all that's coming to you. "Suggest Types" is just that. A suggestion. It is your responsibility to make sure the column data-types are correct.

TheViewMaster wrote:


This s**hit just breaks... So a word about Derived Columns - pretty useful feature heh? ITs not f***ing useful if it DELETES SOME of the Code itself after there have been changes in dataflow. I cant say how pissed off im about that SSIS went ahead and deleted columns from flow & messed up derived columns just because the lineageIDs dont match.

It doesn't delete columns itself. Only if you do something to cause that. Lineage IDs are vital to how the pipeline engine works - this behaviour is absolutely correct and there are justifiable reasons for it.

I have never seen it delete code without there being some perfectly good, user-produced, reason.

TheViewMaster wrote:


Meta-data - it would be useful if you could change it and refresh it - im just sick and tired of it that it shows warnings and errors when there's nothing wrong - so after a change i need to doubleclick all my transforms so that those red & yellow boxes would disappear.

You can. If downstream components are still affected then it will flag this to you. What you will probably find is that it won't automatically fix components that are downstream of asynchronous components. That is absolutely correct. What SSIS will not do is make changes without telling you about it - it expects you to make the decisions, it will not make decisions for you. In this regard it is a huge improvement from DTS which would often make guesses about things which could be wrong.

TheViewMaster wrote:


Oh and y I passionately dislike Derived columns - so you create new fields based on some data - you do some stuff - combine multiple columns to one, but you have no way saying remove the columns from the pipeline. Y you need it - well if you have 50K + rows with 30+ columns then its EXTRA useless memory overhead for your package.

Wrong again. Removing columns from memory (and all the subsequent memory management that must occur) at execution time after they are no longer required was considered to be too detrimental an overhead in terms of pipeline performance. The pipeline is there to produce screamingly fast data movement and anything that impacts that (such as removing unused columns) will not be done.

TheViewMaster wrote:


Hopefully one day I will understand how SSIS works (not an ez task I say)

Absolutely correct. Its not easy - its a huge product and there's loads to understand. There is a learning curve with SSIS as there is any other product. If you're not prepared to go through that curve then that's fine - don't use the product. But don't moan about it endlessly.

TheViewMaster wrote:

- I might be able to spend more time on development and less time on my bitchfest - UNTIL then --> Another Day - Another Hassle with SSIS

I don't really know why I'm bothering replying. You obviously have decided that SSIS is useless without even trying to understand the justification for any of these behaviours. I'm biting my tongue till it hurts, believe me.

I hope as you come to learn more about SSIS (if you choose to continue down this route) you will understand some of the justifications for the behaviour that is inherent in the UI. There are definately improvements to be made, I won't disagree with you there, and if you have legitamate requests for enhancements (such as your comment about reordering of columns) then there are proper channels for submitting those requests. Whinging on this forum is not one of them.

-Jamie

|||1000 rows is the scan limit of the "suggest types" feature.

Aren't you prompted to fix invalid metadata when changes upstream are made?

I hear you on being able to drop columns from the dataflow whenever you'd like. That would be a nice feature.

But I have to ask... With your growing list of issues with SSIS, I wonder:
- Have you filled out bug/feature requests?
- Is SSIS the right tool for you? Perhaps something like Informatica would be better suited for you.

Phil|||

TheViewMaster wrote:

Oh - i also forgot - trying to get to columns to appear in the right order is no picnic either. If you are dealing with 80+ you would like them to always appear on same order (alphabetic or otherwise meaningful) but SSIS seems to shuffle the order also every now and then

SSIS indeed does not gaurantee column ordering. SSIS will try to maintain order from the source but there are cases where it won't. For example if you go through an asynchronous component then the ordering is however that component puts it on the output since it is effectively a "new source".

Thanks
Mark

|||

Jamie Thomson wrote:


I don't really know why

I'm bothering replying. You obviously have decided that SSIS is useless

without even trying to understand the justification for any of these

behaviours. I'm biting my tongue till it hurts, believe me.

Please dont stop - I am trying to understand how things work - it's just it can be frustrating spending too much time on stuff - this forum has deffinately being very useful thanks to you and all other responders

Phil Brammer wrote:

1000 rows is the scan limit of the "suggest types" feature.

Aren't you prompted to fix invalid metadata when changes upstream are made?

I hear you on being able to drop columns from the dataflow whenever you'd like. That would be a nice feature.

Yes - i figured it out - thank god I had backup copy of the version of SSIS package somewhere. The issue is that i changed some transformation (instead of merge join - a lookup) and that caused LineageIDs to not match. and in the transform to rawfile - there were columns like CITY_NAME renamed to CityName - and of course SSIS wasnt able to match them by name. So even with finding that piece of tranformation from backup i still had to do a lot of manual work to get all columns back.

- Have you filled out bug/feature requests?


- I tried once & only once - its a hassle to fill it out

- Is SSIS the right tool for you? Perhaps something like Informatica would be better suited for you.


It is "free" - so it must be the right one.
The learning curve is tough on that one - reminds me of moving from vb6/asp to .Net = a lot of hangst and head-banging.
But I start to see the light & still the dev model of SSIS outweighs the hurdles with some parts

No comments:

Post a Comment