First thing's first: the function :D
create function Split
(
@StringToSplit varchar(2048),
@Separator varchar(128))
returns
table as return
with indices
as
(
select 0 S
, 1 E
union all
select E
, charindex(@Separator
, @StringToSplit
, E
) + len(@Separator
)
from indices
where E > S
)
select substring(@StringToSplit
,S
,
case when E > len(@Separator) then e-s-len(@Separator) else len(@StringToSplit) - s + 1 end) String
,S StartIndex
from indices where S >0
Yeah, yeah, I know, such a function is an old topic and there are probably numerous solutions available, but I had the impression all were a bit outdated and created before the introduction of the CTE (Common Table Expression) in sql server 2005. Most examples either use a temporary table or return multiple result sets.
Ideal is the use of a function, but using a temporary table in it would prevent certain query optimizations that sql server would normally do for you (besides the small hit of creating a temptable). Granted, these days fast computers don't require that level of optimizing, but still, in heavy workloads every detail might matter ;) In comes the 'newly' introduced CTE functionality. That might have been around for a couple of years, but I've never really used it until recenlty :$ Whatever the case, it makes a function such as split suddenly a lot shorter.
The function returns both the values (column name 'String' ) and the StartIndex. The Index might not be needed most of the time, but when joining, searching or selecting on the String, the column name is needed anyway, so the StartIndex won't be in the way, and it is there when it is needed. (As a small sidenote, in the case E > len(@Separator is used instead of E > S because the same optimizations that give the speed benefit, crashed the code on the substring() when using criteria on the string)
Some examples:
--simple split comma delimited string example
select * from Split
('aaa,b,cccc,dd,e,ffff,g' , ',')
--when the separator is not found, no problems either
select * from Split
('abcdefgh' , ',')
--empty values are returned too, the list would not be complete without them (for example when handling input file lines)
select * from Split
(',,a,,b,c,,,d' , ',')
--skipping the empty values would simply be using a criterium
select * from Split
(',,a,,b,c,,,d' , ',') where len(String
) > 0
--of course, splitting with other (and longer) strings than a comma is supported too
select * from Split
('aaa->bb->cccc->d' , '->') where len(String
) > 0
--just a final example with some query options combined
select rtrim(ltrim(String
)) from Split
('from abc from abb from cdd from addd' , 'from')
where ltrim(String) like 'a%'