HotDog's Blog

Hotdog (Robert Verpalen) about C# and vb.net

vbCity Blogs moved to:
http://cs.vbcity.com/blogs
  Home :: Syndication  :: Login

MayJune 2008Jul
SMTWTFS
25262728293031
1234567
891011121314
15161718192021
22232425262728
293012345

Articles

Archives

Topics

CONTACT

Fun but useful linkies

General

VS 2005

Wolfenstein ET

Wednesday, June 04, 2008 #

First thing's first: the function :D

Code Copy

create
function Split (
@StringToSplit varchar(2048),
@Separator varchar(128))
returns table as return
with
indices as
(
select 0 S, 1 E
union all

select
E, charindex(@Separator, @StringToSplit, E) + len(@Separator)
from indices
where
E > S
)
select
substring(@StringToSplit,S,
case when E > len(@Separator) then e-s-len(@Separator) else len(@StringToSplit) - s + 1 end) String
,
S StartIndex        
from indices where S >0

 

Yeah, yeah, I know, such a function is an old topic and there are probably numerous solutions available, but I had the impression all were a bit outdated and created before the introduction of the CTE (Common Table Expression) in sql server 2005. Most examples either use a temporary table or return multiple result sets.
Ideal is the use of a function, but using a temporary table in it would prevent certain query optimizations that sql server would normally do for you (besides the small hit of creating a temptable). Granted, these days fast computers don't require that level of optimizing, but still, in heavy workloads every detail might matter ;)  In comes the 'newly' introduced CTE functionality. That might have been around for a couple of years, but I've never really used it until recenlty :$ Whatever the case, it makes a function such as split suddenly a lot shorter.
The function returns both the values (column name 'String' ) and the StartIndex. The Index might not be needed most of the time, but when joining, searching or selecting on the String, the column name is needed anyway, so the StartIndex won't be in the way, and it is there when it is needed. (As a small sidenote, in the case E > len(@Separator is used instead of E > S because the same optimizations that give the speed benefit, crashed the code on the substring() when using criteria on the string)

Some examples:

Code Copy
--simple split comma delimited string example
select
* from Split('aaa,b,cccc,dd,e,ffff,g' , ',')

--when the separator is not found, no problems either

select
* from Split('abcdefgh' , ',')

--empty values are returned too, the list would not be complete without them (for example when handling input file lines)

select
* from Split(',,a,,b,c,,,d' , ',')

--skipping the empty values would simply be using a criterium

select
* from Split(',,a,,b,c,,,d' , ',') where len(String) > 0

--of course, splitting with other (and longer) strings than a comma is supported too

select
* from Split('aaa->bb->cccc->d' , '->') where len(String) > 0

--just a final example with some query options combined

select
rtrim(ltrim(String)) from Split('from abc from abb from cdd from addd' , 'from')
where ltrim(String) like 'a%'
posted @ 1:21 AM | Feedback (4)